Thursday, March 13, 2008

TSQL and Excel - Data Exchanges

We are all aware of the applicability of "BCP" for transporting data between external files (text files, Excel files, etc) and database tables, and "Bulk Insert" for sending data one way from external files into database tables. I love BCP. Sometimes, situations arise where we need data from tables to be sent to well formatted Excel spreadsheets and vice-versa. With BCP, one could not send in the Column Headers into the Spreadsheet (it can be done with queryout option but in a round about way) . We always need simple solutions, don't we?


Let's complicate the requirement further. We have to write today's Top three stocks from the database on to a spreadsheet, i.e. the top three gainers in the Stock Market today. We have to also write the top three losing stocks in the Stock Market today into the same Excel spreadsheet but in a different tab. This single Excel spreadsheet with the two tabs, one for gainers and one for losers needs to be e-mailed to some of the investors daily. We would put the e-mailing issue aside as it is quite simple using SQL Server 2005 Database Mail or SQL Server 2000 SQL Mail.



The second requirement is that the data in both the Excel spreadsheet tabs needs to be overwritten with fresh data daily. The source of all data is just one database table which is fed daily. There are two more tabs in the spreadsheet that will have the consolidated gainers and losers respectively for all days as the days progress.



It might have been so simple if this overwritting on two tabs (Gainers and Losers) and appending onto two other tabs (AllGainers and AllLosers) was not required. It would have also been so simple if we could have had four separate spreadsheets instead of four tabs in one single spreadsheet. Then we could have simply used the simple BCP and added a little bit of complication in the bcp command to include the Column Headers in there. But that is not the case as we need to keep the solution simple. Some people say, use CLR and do it row by row. Some say, use VC++.


The issue here is to find a simple solution and not complicate the solution further. We don't want assemblies and DLL's to be prepared, compiled, registered and then modified whenever something needs to be changed. So CLR, VC++ and the like are not an option. We are talking about efficient programming here. And since we are talking about programming and not about packages, DTS and SSIS are also not an option. We need minimum maintenance too. What do we do?



We resort to our contemporary T-SQL database programming language. T-SQL is simple, neat, prolific, dynamic and very flexible. Why not explore OPENDATASOURCE and OPENROWSET? Let's go about solving the problem now.



So let me prepare the data first. Let's create a table called Table_Stock_Value and populate the same. This article is just about concepts. So please don't bother about constraints, indexes and fillfactors right now. All symbols in the data are hypothetical.




if exists (select 1 from sys.tables
where name = 'Table_Stock_Value'
and type = 'U')
drop table Table_Stock_Value
go
create table Table_Stock_Value
(SerialNo int identity (1,1) not null
,NASDAQ_SYMBOL varchar(4) not null
,Company_Name varchar(25) not null
,Change decimal(5,2)
,Date_of_Change datetime not null default getdate())
go
insert into Table_Stock_Value (NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change)

--TOP GAINERS
select 'ABCD', 'ABCD Inc', 2.50, getdate()-3
union all
select 'EFGH', 'EFGH Inc', 2.00, getdate()-3
union all
select 'IJKL', 'IJKL Inc', 1.50, getdate()-3
union all
select 'IJKL', 'IJKL Inc', 1.75, getdate()-2
union all
select 'MNOP', 'MNOP Inc', 1.60, getdate()-2
union all
select 'ABCD', 'ABCD Inc', 1.00, getdate()-2
union all
select 'QRST', 'QRST Inc', 3.50, getdate()-1
union all
select 'UVWX', 'UVWX Inc', 2.50, getdate()-1
union all
select 'ABCD', 'ABCD Inc', 1.25, getdate()-1
union all
select 'IJKL', 'IJKL Inc', 4.95, getdate()
union all
select 'ABCD', 'ABCD Inc', 2.50, getdate()
union all
select 'MNOP', 'MNOP Inc', 1.25, getdate()
union all
select 'ABCD', 'ABCD Inc', 1.90, getdate()+1
union all
select 'EFGH', 'EFGH Inc', 1.20, getdate()+1
union all
select 'QRST', 'QRST Inc', 1.10, getdate()+1
union all
select 'ABCD', 'ABCD Inc', 1.10, getdate()+2
union all
select 'UVWX', 'UVWX Inc', 1.00, getdate()+2
union all
select 'QRST', 'QRST Inc', 0.95, getdate()+2
union all
select 'ABCD', 'ABCD Inc', 1.50, getdate()+3
union all
select 'EFGH', 'EFGH Inc', 1.25, getdate()+3
union all
select 'IJKL', 'IJKL Inc', 1.00, getdate()+3

UNION ALL
--TOP LOSERS
select 'BBBB', 'BBBB Inc', -1.50, getdate()-3
union all
select 'CCCC', 'CCCC Inc', -1.20, getdate()-3
union all
select 'DDDD', 'DDDD Inc', -1.10, getdate()-3
union all
select 'EEEE', 'EEEE Inc', -1.75, getdate()-2
union all
select 'FFFF', 'FFFF Inc', -1.60, getdate()-2
union all
select 'YYYY', 'YYYY Inc', -1.00, getdate()-2
union all
select 'IJKL', 'IJKL Inc', -0.70, getdate()-1
union all
select 'HHHH', 'HHHH Inc', -0.62, getdate()-1
union all
select 'CCCC', 'CCCC Inc', -0.25, getdate()-1
union all
select 'UUUU', 'UUUU Inc', -4.95, getdate()
union all
select 'LLLL', 'LLLL Inc', -2.50, getdate()
union all
select 'AAAA', 'AAAA Inc', -1.25, getdate()
union all
select 'UVWX', 'UVWX Inc', -1.90, getdate()+1
union all
select 'FFFF', 'FFFF Inc', -1.20, getdate()+1
union all
select 'CCCC', 'CCCC Inc', -1.10, getdate()+1
union all
select 'CCCC', 'CCCC Inc', -0.80, getdate()+2
union all
select 'BBBB', 'BBBB Inc', -0.75, getdate()+2
union all
select 'EEEE', 'EEEE Inc', -0.65, getdate()+2
union all
select 'EEEE', 'EEEE Inc', -1.50, getdate()+3
union all
select 'YYYY', 'YYYY Inc', -1.25, getdate()+3
union all
select 'QRST', 'QRST Inc', -1.00, getdate()+3
go
create clustered index ci_date on Table_Stock_Value (Date_of_Change) with fillfactor = 70
go
select * from Table_Stock_Value
go



Next for one time only, we need to prepare an Excel spreadsheet with four tabs in it. The names of the tabs are, Gainers, Losers, AllGainers, and AllLosers. And each tab can have the column names same or different than in the table, Table_Stock_Value created above. For simplicity, let's keep the column names same in the tabs. I have created and kept the Excel spreadsheet in the folder, C:\SQLXL. I have named the Excel spreadsheet as Stock_Overview.xls.


Please remember to close the Excel sheet before the initial compilation of the SP and before each instance of execution of the SP.


Now, let's set out to do what we need to do i.e. writing into the Excel spreadsheet's different tabs as per the requirements stated above. I have written the SP as follows. Please note that the SQL Server advanced configuration option, 'Ad Hoc Distributed Queries' should be set to 1 to read from and write to the Excel Spreadsheet from within a database Stored Procedure:






set ansi_warnings on
go
set ansi_nulls on
Go
execute sp_configure 'show advanced options', 1
go
reconfigure
go
execute sp_configure 'Ad Hoc Distributed Queries', 1
go
reconfigure
go

if exists (select 1 from sys.procedures
where type = 'P'
and [NAME] = 'DBP_Prepare_Daily_Stock_Data')
drop procedure [DBP_Prepare_Daily_Stock_Data]
GO
create procedure DBP_Prepare_Daily_Stock_Data
@i_date datetime = null
as
begin
declare @l_File_Exists int
select @i_date = isnull(@i_date, getdate())
execute master.dbo.XP_fileexist 'C:\SQLXL\Stock_Overview.xls', @l_File_Exists output
if (@l_File_Exists = 1)
begin
--Gainers have to be overwritten everyday.
UPDATE OPENDATASOURCE ('Microsoft.Jet.OleDB.4.0',
'Data Source="C:\SQLXL\Stock_Overview.xls";
extended Properties=Excel 8.0')...Gainers$
SET SerialNo = ''
,NASDAQ_SYMBOL = ''
,Company_Name = ''
,Change = ''
,Date_of_Change = ''
WHERE SerialNo is not null
--Losers have to be overwritten everyday.
UPDATE OPENDATASOURCE ('Microsoft.Jet.OleDB.4.0',
'Data Source="C:\SQLXL\Stock_Overview.xls";
extended Properties=Excel 8.0')...Losers$
SET SerialNo = ''
,NASDAQ_SYMBOL = ''
,Company_Name = ''
,Change = ''
,Date_of_Change = ''
WHERE SerialNo is not null
--UPDATE/Overwrite the Gainers tab with top 3 gainers' data for the given day, @i_date.
insert into OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\SQLXL\Stock_Overview.xls;',
'SELECT SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change FROM [Gainers$A1:E1]')
select top 3 SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change
from Table_Stock_Value with (nolock)
where convert(varchar, Date_of_Change, 112) = convert(varchar, @i_date, 112)
and Change > 0.00
order by Change desc
--UPDATE/Overwrite the Losers tab with top 3 losers' data for the given day, @i_date.
insert into OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\SQLXL\Stock_Overview.xls;',
'SELECT SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change FROM [Losers$A1:E1]')
select top 3 SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change
from Table_Stock_Value with (nolock)
where convert(varchar, Date_of_Change, 112) = convert(varchar, @i_date, 112)
and Change < 0.00
order by Change desc
--UPDATE/APPEND the AllGainers tab with top 3 gainers' data for the given day, @i_date.
insert into OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\SQLXL\Stock_Overview.xls;',
'SELECT SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change FROM [AllGainers$A1:E1]')
select top 3 SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change
from Table_Stock_Value with (nolock)
where convert(varchar, Date_of_Change, 112) = convert(varchar, @i_date, 112)
and Change > 0.00
order by Change desc
insert into OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;Database=C:\SQLXL\Stock_Overview.xls;',
'SELECT SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change FROM [AllLosers$A1:E1]')
select top 3 SerialNo, NASDAQ_SYMBOL, Company_Name, Change, Date_of_Change
from Table_Stock_Value with (nolock)
where convert(varchar, Date_of_Change, 112) = convert(varchar, @i_date, 112)
and Change < 0.00
order by Change desc
PRINT 'RESULTS HAVE BEEN WRITTEN TO C:\SQLXL\Stock_Overview.xls'
end
else
Begin
Print 'The file Stock_Overview.xls should exist in the correct path, C:\SQLXL.'
PRINT 'RESULTS COULD NOT BE WRITTEN TO C:\SQLXL\Stock_Overview.xls. THE FILE IS OPEN OR DOES NOT EXIST.'
End
end
GO

The SP being ready now, let's check if the all the tabs in the Excel spreadsheet,
C:\SQLXL\Stock_Overview.xls are empty:

select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Gainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Losers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllGainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllLosers$]')
go

They are empty.

Let's execute the SP for the earliest date in the table Table_Stock_Value, i.e. getdate()-3:


declare @i_date datetime
select @i_date = getdate()-3
exec DBP_Prepare_Daily_Stock_Data @i_date
go

Now open the spreadsheet, C:\SQLXL\Stock_Overview.xls and check it out. The data has been written to the respective tabs for the given input date to the SP, DBP_Prepare_Daily_Stock_Data.

You can also execute the following T-SQL code from your SQL Server database to check the same:


select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Gainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Losers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllGainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllLosers$]')
go


The data in the spreadsheet and the result sets above would match.


Let's execute the SP for the next date in the table Table_Stock_Value, i.e. getdate()-2:

declare @i_date datetime
select @i_date = getdate()-3
exec DBP_Prepare_Daily_Stock_Data @i_date
go

Now open the spreadsheet, C:\SQLXL\Stock_Overview.xls and check it out. The data has been written to the respective tabs for the given input date to the SP, DBP_Prepare_Daily_Stock_Data. Please close the spreadsheet after reading its contents.


To verfiy the results using T-SQL itself without opening the spreadsheet, you can use the following code too and verify the results:


select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Gainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Losers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllGainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllLosers$]')
go

Note that we have met the requirement.

Let's do it for getdate()-1 and getdate() as well.

declare @i_date datetime
select @i_date = getdate()-1
exec DBP_Prepare_Daily_Stock_Data @i_date

Now open the Excel Spreadsheet. You would find the Gainer and Losers tabs overwritten with the top 3 gainers and top 3 losers respectively for the date, getdate()-1. You would also notice that the same data has been appended respectively to AllGainers and AllLosers tabs within the same spreadsheet. So, the Gainers and Losers tabs always have three rows each and these rows pertain specifically to the given input date to the SP. The AllGainers and AllLosers tabs have their number of rows growing with each execution of the SP as new rows get appended behind the older rows thus building up a historical data for later analysis. And that's all we want.

Let's execute the SP for @i_date = getdate().

declare @i_date datetime
select @i_date = getdate()
exec DBP_Prepare_Daily_Stock_Data @i_date
go

Now open the Excel Spreadsheet. You would find the Gainer and Losers tabs overwritten with the top 3 gainers and top 3 losers respectively for the given input date, getdate(). You would also notice that the same data has been appended respectively to AllGainers and AllLosers tabs within the same spreadsheet. So, the Gainers and Losers tabs always have three rows each and these rows pertain specifically to the given input date for top three gainers and losers respectively for the respective tabs. The AllGainers and AllLosers tabs have their number of rows growing with each execution as new rows get appended behind the older rows thus building up a historical data for later analysis. We are done with verifying that we have satisfied the requirements that we initially started out with.

You can checkout the contents of the spreasheet by just opening it from its location. Alternatively, you can use the following code. Please close the spreadsheet after physically reading its contents.:

select * FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Gainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [Losers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllGainers$]')
go
select *
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
'Excel 8.0;DATABASE=C:\SQLXL\Stock_Overview.xls', 'Select * from [AllLosers$]')
go

Please remember to close the Excel sheet before each execution of the SP.

Now let me throw a little bit of explanation behind overwriting and appending the Excel spreadsheet tabs:

First of all, in order to read from and write to an external data source like the Excel spreadsheet in this case while not using the BCP or BULK INSERT commands, the advanced SQL Server configuration option 'Ad Hoc Distributed Queries' must be set to 1. This way, writes to and reads from the Excel spreadsheet get enabled. This's what we did first.

Next, we UPDATE any existing records in the Gainers and Losers tabs with an empty string, ''. This is equivalent to DELETEing all existing records in both these tabs. Even then, the empty characher string remains in the existing cells although it is not visible.

The next step is to overwrite these existing cells in the Gainers and Losers tabs. For this purpose, we use the INSERT statement. Please note that we use the starting and ending cells of the first row in the Gainers and Losers tab in the OPENDATASOURCE statement (A1:E1). This will enable overwriting the existing cells. The smart thing here is that any empty strings '' are no more visible even if they exist somwhere else in the spreadsheet.

For the tabs AllGainers and AllLosers, since we don't want to overwrite existing data but instead want to append new data to any existing data, we omit mentioning the starting row A1:E1 in the OPENDATASOURCE function of the INSERT statement. That ensures that new rows are always appended to the existing rows in these two tabs.

You can use the spreadsheet and its tabs as if they were individual tables, for reading and performing calculations. You can apply all types of T-SQL constructs on that data to make aggregations, etc. Please note the sample T-SQL code above that we used to check the result sets in the tabs of the spreadsheet after executing the SP.

One important point to note is that before compiling the SP, the spreadsheet with the appropriate tabs in it should exist in the correct path mentioned in the SP. Otherwise the SP will not get compiled and will instead throw an error.

Another important point is that during compilation and execution of the SP, the spreadsheet itself should not be physically openend from the Windows Explorer.

Ofcourse, sufficient scope exists for making improvisations to the SP to calculate the top stock for the month or for generating some special reports, etc. I am limited myself to just explaining the manipulation of an Excel spreadseet and its tabs using T-SQL. And the example has been designed to just show that. Using Database Mail to e-mail the spreadsheet is outside the scope of this article.

I hope you like the technique of writing to and manipulating spreadsheets and their tabs using simple T-SQL. Please let me know of your comments and thoughts.

Monday, March 10, 2008

Simplifying Code: Updating Variables

A good programming practice means being able to use every available option one can to reduce code and at the same time increase speed of execution. A merit of blogging is that such methods can be explained but demontration of using the sample code as it is on large databases is not possible because of the obvious limitations on how much can be written inside one article. That fact however does not in any way takes away the applicability of such methods on large databases. In today's article, we will see how an unrelated construct/statement can help in reducing code and speeding up execution.

First let us build the base table, Order_Master:


create table Order_Master
(Order_ID smallint not null
,Begin_Date datetime not null
,End_Date datetime not null
,Delivery_Location varchar(50) not null)
go

Now, let's put in some data into the table, Order_Master:


Insert into Order_Master (Order_ID, Begin_Date, End_Date, Delivery_Location)
select 1, getdate()-100, getdate()+100, 'New York'
union all
select 2, getdate()-50, getdate()+50, 'Washington D.C'
union all
select 3, getdate()-25, getdate()+25, 'Princeton'
union all
select 4, getdate()-12, getdate()+12, 'Dallas'
union all
select 5, getdate()-6, getdate()+6, 'Los Angeles'
go

Let's read from the table, Order_Master:


select * from Order_Master
go


Order_ID Begin_Date End_Date Delivery_Location
-------- ----------------------- ----------------------- --------------------
1 2007-12-01 14:06:04.087 2008-06-18 14:06:04.087 New York
2 2008-01-20 14:06:04.087 2008-04-29 14:06:04.087 Washington D.C
3 2008-02-14 14:06:04.087 2008-04-04 14:06:04.087 Princeton
4 2008-02-27 14:06:04.087 2008-03-22 14:06:04.087 Dallas
5 2008-03-04 14:06:04.087 2008-03-16 14:06:04.087 Los Angeles

Now let's come to the scenario.

1. For a given Order_ID and for a given input date, we need to find out what the begin date and end date for that Order_ID are.

2. We also need to find out whether the given input date falls within the begin date and end date for that order ID.

3. The output needs to be the Begin Date, the end date and a return value which would be 1 if the given input date falls within the begin date and end date for that order ID or 0 if it doesn't.


Given these requirements, a normal Stored Procedure would be like the following:


if exists (select 1 from sys.procedures
where type = 'P'
and user_name(schema_id) = 'dbo'
and [NAME] = 'DBP_ValidateOrderDates')
drop procedure [dbo].[DBP_ValidateOrderDates]
GO

CREATE PROCEDURE [dbo].[DBP_ValidateOrderDates]
@i_OrderNo varchar(6),
@i_ValidationDate datetime,
@o_Return bit = 0 output,
@o_BeginDate datetime = Null output,
@o_EndDate datetime = Null output
AS
BEGIN
SET NOCOUNT ON;
Select @o_BeginDate = Begin_Date
,@o_EndDate = End_Date
from dbo.Order_Master
where Order_ID = @i_OrderNo
IF @@rowcount = 0
Begin
Set @o_Return = 0
End
Else
If (@i_ValidationDate <= @o_EndDate and @i_ValidationDate >= @o_BeginDate)
Begin
set @o_Return = 1
End
Else
Begin
set @o_Return = 0
End
Set nocount off
END
Go

Now let's compile and then execute this SP for Order_ID = 1 and input date as today's date:

--Declaring the variables:
declare @i_OrderNo varchar(6),
@i_ValidationDate datetime,
@o_Return bit,
@o_BeginDate datetime,
@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 1
,@i_ValidationDate = getdate()
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

Here is the result set as expected. We got the begin and end dates for the given Order_ID (1) and the return value is 1 because the input date falls between the begin and end dates for that Order_ID:

Return Value Begin Date End Date
------------ ----------------------- -----------------------
1 2007-12-01 14:06:04.087 2008-06-18 14:06:04.087


Now let's execute this SP for the same Order_ID (1) but with the input date not falling within the begin and end dates for that Order_ID.

--Declaring the variables:
declare @i_OrderNo varchar(6),
@i_ValidationDate datetime,
@o_Return bit,
@o_BeginDate datetime,
@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 1
,@i_ValidationDate = getdate()-200
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

Here is the result set as expected. We got the begin and end dates for the given Order_ID (1) and the return value is 0 because the input date does not fall between the begin and end dates for that Order_ID:


Return Value Begin Date End Date
------------ ----------------------- -----------------------
0 2007-12-01 14:06:04.087 2008-06-18 14:06:04.087

Now let's take another possible situation. If we were to input an Order_ID (101) that doesn't exist in the Order_Master table, we should be getting nulls for the begin and end dates but the return value should be 0. Let's execute this scenario.

--Declaring the variables:
declare @i_OrderNo varchar(6),
@i_ValidationDate datetime,
@o_Return bit,
@o_BeginDate datetime,
@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 101
,@i_ValidationDate = getdate()
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

The result set matches with the expected result:


Return Value Begin Date End Date
------------ ----------------------- -----------------------
0 NULL NULL

Since the input Order_ID, 101 does not exist in the table, the Begin and End dates are null. Since they are null, the input date can not fall between null begin and end dates and hence is 0.

Issue:

You may observe that there are so many constructs in this SP:

One SELECT statement
Three set variable statements
One usage of @@rowcount to check existence of the record
Three Begin-End blocks
Two IF and one ELSE construct

The SP seems perfectly logical. It has a very logical flow to satisfy all our requirements. But
the question is "Do we really need so many constructs to staify our requirements?". The answer is "May be Not". Remember, all we have to do is assign proper value to the output variables/parameters based on the input variables and the data in the table, Order_Master.

Let's try to simplify the same SP. Below, I have replace the IF BEGIN-END, IF-ELSE and @@rowcount with one UPDATE CASE statement. Let's see how it looks now:


if exists (select 1 from sys.procedures
where type = 'P'
and user_name(schema_id) = 'dbo'
and [NAME] = 'DBP_ValidateOrderDates')
drop procedure [dbo].[DBP_ValidateOrderDates]
GO

CREATE PROCEDURE [dbo].[DBP_ValidateOrderDates]
@i_OrderNo varchar(6),
@i_ValidationDate datetime,
@o_Return bit = 0 output,
@o_BeginDate datetime = Null output,
@o_EndDate datetime = Null output
AS
BEGIN
SET NOCOUNT ON;
update a
set @o_Return = case
when @i_ValidationDate between a.Begin_Date and a.End_Date then 1
else 0
end
,@o_BeginDate = a.Begin_Date
,@o_EndDate = a.End_Date
from (Select Begin_Date, End_Date from dbo.Order_Master with (nolock)
where Order_ID = @i_OrderNo) a

Set nocount off
END
Go

Doesn't the SP look short now? One UPDATE statement takes care of all possibilities and reduces code even though in the original SP, we are not updating any table for that matter. Neither are we updating the table Order_Master in this refined version of the original SP. All we are doing is updating the output variables / parameters based on the input parameters and the corresponding data in the table Order_Master. And that's all that is expected from this SP. So why take the longer route as in the original version? Not necessary.

Now let's execute the refined SP:

Now let's compile and then execute this SP for Order_ID = 1 and input date as today's date. This is the first case above:

--Declaring the variables:
declare @i_OrderNo varchar(6)
,@i_ValidationDate datetime
,@o_Return bit
,@o_BeginDate datetime
,@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 1
,@i_ValidationDate = getdate()
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

Here is the result set as expected. We got the begin and end dates for the given Order_ID (1) and the return value is 1 because the input date falls between the begin and end dates for that Order_ID:

Return Value Begin Date End Date
------------ ----------------------- -----------------------
1 2007-12-01 14:06:04.087 2008-06-18 14:06:04.087


Now let's execute this SP for the same Order_ID (1) but with the input date not falling within the begin and end dates for that Order_ID. This is the second case above.

--Declaring the variables:
declare @i_OrderNo varchar(6)
,@i_ValidationDate datetime
,@o_Return bit
,@o_BeginDate datetime
,@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 1
,@i_ValidationDate = getdate()-200
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

Here is the result set as expected. We got the begin and end dates for the given Order_ID (1) and the return value is 0 because the input date does not fall between the begin and end dates for that Order_ID(1):

Return Value Begin Date End Date
------------ ----------------------- -----------------------
0 2007-12-01 14:06:04.087 2008-06-18 14:06:04.087

Now let's take another possible situation, i.e. the third situation above. If we were to input an Order_ID (101) that doesn't exist in the Order_Master table, we should be getting nulls for the begin and end dates but the return value should be
0. Let's execute this scenario.

--Declaring the variables:
declare @i_OrderNo varchar(6)
,@i_ValidationDate datetime
,@o_Return bit
,@o_BeginDate datetime
,@o_EndDate datetime

--Assigning values to input variables/parameters:
select @i_OrderNo = 101
,@i_ValidationDate = getdate()
,@o_Return = null

--Executing the SP with the given input parameters:
execute [dbo].[DBP_ValidateOrderDates]
@i_OrderNo
,@i_ValidationDate
,@o_Return output
,@o_BeginDate output
,@o_EndDate output

--Capturing the output:
select
@o_Return 'Return Value', @o_BeginDate 'Begin Date', @o_EndDate 'End Date'

The result set matches with the expected result:

Return Value Begin Date End Date
------------ ----------------------- -----------------------
0 NULL NULL

Since the input Order_ID, 101 does not exist in the table, the Begin and End dates are null. Since they are null, the input date can not fall between null begin and end dates and hence is 0.

We get the same results with the refined version of the SP but in one statement that in multiple If's and ELSE's and @@rowcount.

The UPDATE statement in this case doesn't perform any updates to the table Order_Master itself. It just updates the output parameters to the correct values.

This is a very simple example of using unrelated operations (UPDATE) where SELECT seems the only solution and thus reducing the amount of code needed to be written and executed. This article just demonstrates that coding flexibility in Microsoft SQL Server's T-SQL is profound. Based on individual situation, there are numerous possibilities in T-SQL that can reduce code and execution time as well. We will follow some more of such techniques in the forthcoming articles.