Search
Close this search box.

Calculating Running Totals in SQL Server 2005, The optimal solution?

Using “Update to a local variable” to calculate running totals in SQL.

Recently I was looking at an existing view on a client’s SQL server 2005 database. This view calculated the running total for a transaction amount from a table, but was performing very poorly.

I had always believed there were three different methods for calculating a running total using TSQL:

1.     Use a nested sub-query

2.     Use a self join

3.     Use Cursors

My own personal preference was to use the cursors option. If the cursor guidelines are followed, I’ve always found this to be the quickest, because the other two methods involve multiple scans of the table. The key for the cursor method is to ensure the data you are “cursoring” through is in the correct order, as the query optimzier does not understand cursors. This usually means cursoring through the data by clustered index, or copying the data into a temp table / table var first, in the relevant order.

A blog posted by Garth Wells back in 2001 gives these three techniques (http://www.sqlteam.com/article/calculating-running-totals)

I came across a fourth technique for the running total calculation, which is related to the cursor method. Like the cursor method, it involves a single scan of the source table, then inserting the calculated running total for each row into a temp table or table variable. However, instead of using a cursor, it makes use of the following UPDATE command syntax:

UPDATE table

SET variable = column = expression

The TSQL to calculate the running total is:

DECLARE @SalesTbl TABLE (DayCount smallint, Sales money, RunningTotal money)

DECLARE @RunningTotal money

SET @RunningTotal = 0

INSERT INTO @SalesTbl 

SELECT DayCount, Sales, null

FROM Sales

ORDER BY DayCount

UPDATE @SalesTbl

SET @RunningTotal = RunningTotal = @RunningTotal + Sales

FROM @SalesTbl

SELECT * FROM @SalesTbl

I tested this query along with the other three methods on a simple set of test data (actually the same test data from Garth Wells’ blog mentioned above).

The results of my test runs are:

MethodTime Taken
Nested sub-query9300 ms
Self join6100 ms
Cursor400 ms
Update to local variable140 ms

I was surprised just how much faster using the “Update to a local variable” method was. I expected it to be similar to the cursor method, as both involve a single scan of the source table, and both calculate the running total once only for each row in the table. The Nested Sub-query and Self join methods are so much slower because they involve the repeated recalculation of all of the previous running totals.

Note: There is a pretty big assumption in using the “Update to local variable” method. This is that the Update statement will update the rows in the temp table in the correct order. There is no simple way to specify the order for an Update statement, so potentially this method could fail, although I have not seen this actually happen yet!

think that if I use a table variable, then the update will probably be in the correct order, because there are no indexes for the query optimizer to use, and parallellism will not occur. However, I can’t be sure about this!

The following script was used to create the test data:

CREATE TABLE Sales (DayCount smallint, Sales money)

CREATE CLUSTERED INDEX ndx_DayCount ON Sales(DayCount)

go

INSERT Sales VALUES (1,120)

INSERT Sales VALUES (2,60)

INSERT Sales VALUES (3,125)

INSERT Sales VALUES (4,40)

DECLARE @DayCount smallint, @Sales money

SET @DayCount = 5

SET @Sales = 10

WHILE @DayCount < 5000

BEGIN

INSERT Sales VALUES (@DayCount,@Sales)

SET @DayCount = @DayCount + 1

SET @Sales = @Sales + 15

END

The queries used in my tests for the other three methods are posted below:

1.     Nested Sub-query

SELECT DayCount,

       Sales,

       Sales+COALESCE((SELECT SUM(Sales)

                      FROM Sales b

                      WHERE b.DayCount < a.DayCount),0)

                         AS RunningTotal

FROM Sales a

ORDER BY DayCount

2.     Self join

SELECT a.DayCount,

       a.Sales,

       SUM(b.Sales)

FROM Sales a

INNER JOIN Sales b

ON (b.DayCount <= a.DayCount)

GROUP BY a.DayCount,a.Sales

ORDER BY a.DayCount,a.Sales

3.     Cursor

DECLARE @SalesTbl TABLE (DayCount smallint, Sales money, RunningTotal money)

DECLARE @DayCount smallint,

        @Sales money,

        @RunningTotal money

SET @RunningTotal = 0

DECLARE rt_cursor CURSOR

FOR

SELECT DayCount, Sales

FROM Sales

ORDER BY DayCount

OPEN rt_cursor

FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales

WHILE @@FETCH_STATUS = 0

 BEGIN

 SET @RunningTotal = @RunningTotal + @Sales

 INSERT @SalesTbl VALUES (@DayCount,@Sales,@RunningTotal)

 FETCH NEXT FROM rt_cursor INTO @DayCount,@Sales

 END

CLOSE rt_cursor

DEALLOCATE rt_cursor

SELECT * FROM @SalesTbl

This article is part of the GWB Archives. Original Author: Robin Hames

Related Posts