This is a follow-up to my article How to generate a date sequence in T-SQL.
In my experience, it is frequently required to map data to points in time – display data on a periodic schedule, aggregate access log data by time intervals and so on. It goes without saying that this mapping should be handled as quick as possible. Of course performance matters.
In my previous post I discussed a method to generate these time points on-the-fly. Here I present a full solution for pre-generating a number sequence and generate a date sequence from it (which might be stored in a table as well).
Create a numbers table
First, create the numbers table using a Stored Procedure:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
-- p_createNumbersTable -- Generates an optimized numbers table with numbers -- from 0 to 1.000.000 (default) or from @lowerBound to @upperBound -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - USE <YOUR DATABASE HERE> GO IF OBJECT_ID(N'p_createNumbersTable', 'P') IS NOT NULL DROP PROCEDURE p_createNumbersTable GO CREATE PROCEDURE p_createNumbersTable ( @upperBound INT = 1000000 ) -- @lower boundary always zero AS BEGIN -- Create Numbers table and Index IF OBJECT_ID(N'Numbers', 'U') IS NULL BEGIN CREATE TABLE Numbers (number INT); CREATE UNIQUE CLUSTERED INDEX IX_Numbers ON Numbers (number) WITH (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF) END -- table content is recreated on each call. -- It's not recurring and won't take long anyway TRUNCATE TABLE Numbers; -- create records INSERT Numbers (number) SELECT CAST(number AS INT) -- ROW_NUMBER() is BigInt so a CAST is required FROM ( SELECT number = number - 1 -- ROW_NUMBER starts with 1, subtract 1 to get 0 as lowerBound FROM ( SELECT number = ROW_NUMBER() OVER (ORDER BY number) FROM ( -- Sql Server INT max val is 2,147,483,647, so two cross joins -- on spt_values (which contains at least 2050 records) suffice SELECT number = ROW_NUMBER() OVER (ORDER BY v1.number) FROM master..spt_values AS v1 CROSS JOIN (SELECT number FROM master..spt_values) AS v2 CROSS JOIN (SELECT number FROM master..spt_values) AS v3 ) AS vu ) AS u1 WHERE u1.number BETWEEN 1 AND @upperBound + 1 ) as u END GO |
This creates the table and a unique clustered index. Locking on the Index is switched off but I guess that won’t matter much, since once created this table is only read from. Numbers always range from 0 to an upper bound, which can be set as the only parameter to the procedure.
Then, execute p_createNumbersTable
to actually create a table with numbers from 0 to 1.5 million.
1 |
EXEC p_createNumbersTable 0, 1500000 |
If the table exists, all records are purged and then recreated. Now the table may be used to derive a date sequence from it.
To further simplify querying of the table I added a short table-valued function:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
-- f_numbersTable -- returns a filtered number sequence with lowerBound and @upperBound -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - USE <YOUR DATABASE HERE> GO IF OBJECT_ID(N'f_numbersTable') IS NOT NULL DROP FUNCTION f_numbersTable GO CREATE FUNCTION f_numbersTable(@lowerBound INT, @upperBound INT) RETURNS @reTable TABLE (number INT) AS BEGIN INSERT @reTable (number) SELECT number FROM Numbers WHERE number BETWEEN @lowerBound AND @upperBound RETURN END GO |
The Date Sequence Generator
Generating a date sequence is now easy as pie:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 |
-- f_DateSequenceGenerator -- generates a date sequence using the numbers-table -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - USE <YOUR DATABASE HERE> GO IF OBJECT_ID (N'f_DateSequenceGenerator') IS NOT NULL DROP FUNCTION f_DateSequenceGenerator GO CREATE FUNCTION f_DateSequenceGenerator (@datepart varchar(12), @fromDate DATETIME, @toDate DATETIME) RETURNS @reTable TABLE (datum DATETIME) AS BEGIN IF UPPER(@datepart) = 'MONTH' OR @datepart = 'mm' OR @datepart = 'm' BEGIN INSERT @reTable (datum) SELECT datum FROM ( SELECT datum = DATEFROMPARTS(yearNumber, monthNumber, 1) FROM ( SELECT DISTINCT yearNumber = y.number, monthNumber FROM f_NumbersTable(YEAR(@fromDate), YEAR(@toDate)) AS y CROSS JOIN (SELECT DISTINCT monthNumber = number FROM f_NumbersTable(1,12)) AS m ) AS u1 ) AS u WHERE datum BETWEEN @FromDate AND @toDate END IF UPPER(@datepart) = 'DAY' OR @datepart = 'dd' OR @datepart = 'd' BEGIN INSERT @reTable (datum) SELECT datum = DATEADD(DAY, number - 1, @FromDate) FROM f_NumbersTable(1, DATEDIFF(DAY, @fromDate, DATEADD(DAY, 1, @toDate))) END RETURN END GO |
I use these date sequences only occasionally for daily or monthly reports, so generating the sequence with a function is ok. If you need similar sequences in a productive environment with high loads it might be recommended to create a static read-only table, which probably will give you an additional performance boost.
Example usage of the Date Sequence Generator
Suppose you want to retrieve the access count per user from a log table. You need a contiguous time line for each day in a month which you get by mapping non-contiguous access log results per user to a date sequence. A query would look something like so:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
WITH accessData AS ( SELECT accessDate, userName, pageName, accessCount = COUNT(*) FROM accessLog WHERE pageName = 'HOME' GROUP BY accessDate, userName, pageName ), userDates AS ( SELECT accessDate = ds.datum, ad.userName FROM f_DateSequenceGenerator('DAY', '2014-10-01', '2014-10-31') AS ds CROSS JOIN (SELECT DISTINCT userName FROM accessData) AS ad ) SELECT accessDate = ds.datum, userName, pageName, accessCount FROM userDates AS ud LEFT JOIN accessData AS ad ON ad.accessDate = ud.Datum AND ad.userName = ud.userName |
This is fictitious but I hope that it demonstrates how simple it gets to code a previously tedious task.