Wednesday, June 9, 2010

Cursors

How to Perform SQL Server Row-by-Row Operations Without Cursors



USE AdventureWorks
GO
DECLARE @ProductID INT
DECLARE
@getProductID CURSOR
SET
@getProductID = CURSOR FOR
SELECT
ProductID
FROM Production.Product
OPEN @getProductID
FETCH NEXT
FROM @getProductID INTO @ProductID
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT
@ProductID
FETCH NEXT
FROM @getProductID INTO @ProductID
END
CLOSE
@getProductID
DEALLOCATE @getProductID
GO

SQL cursors have been a curse to database programming for many years because of their poor performance. On the other hand, they are extremely useful because of their flexibility in allowing very detailed data manipulations at the row level. Using cursors against SQL Server tables can often be avoided by employing other methods, such as using derived tables, set-based queries, and temp tables. A discussion of all these methods is beyond the scope of this article, and there are already many well-written articles discussing these techniques.

The focus of this article is directed at using non-cursor-based techniques for situations in which row-by-row operations are the only, or the best method available, to solve a problem. Here, I will demonstrate a few programming methods that provide a majority of the cursor’s flexibility, but without the dramatic performance hit.

Let’s begin by reviewing a simple cursor procedure that loops through a table. Then we’ll examine a non-cursor procedure that performs the same task.

if exists (select * from sysobjects where name = N'prcCursorExample')

drop procedure prcCursorExample

go

CREATE PROCEDURE prcCursorExample

AS

/*

** Cursor method to cycle through the Customer table and get Customer Info for each iRowId.

**

** Revision History:

** ---------------------------------------------------

** Date Name Description Project

** ---------------------------------------------------

** 08/12/03 DVDS Create ----

**

*/

SET NOCOUNT ON

-- declare all variables!

DECLARE @iRowId int,

@vchCustomerName nvarchar(255),

@vchCustomerNmbr nvarchar(10)

-- declare the cursor

DECLARE Customer CURSOR FOR

SELECT iRowId,

vchCustomerNmbr,

vchCustomerName

FROM CustomerTable

OPEN Customer

FETCH Customer INTO @iRowId,
@vchCustomerNmbr,
@vchCustomerName

-- start the main processing loop.

WHILE @@Fetch_Status = 0

BEGIN

-- This is where you perform your detailed row-by-row

-- processing.

-- Get the next row.

FETCH Customer INTO @iRowId,
@vchCustomerNmbr,
@vchCustomerName

END

CLOSE Customer

DEALLOCATE Customer

RETURN


As you can see, this is a very straight-forward cursor procedure that loops through a table called CustomerTable and retrieves iRowId, vchCustomerNmbr and vchCustomerName for every row. Now we will examine a non-cursor version that does the exact same thing:

if exists (select * from sysobjects where name = N'prcLoopExample')

drop procedure prcLoopExample

go

CREATE PROCEDURE prcLoopExample

AS

/*

** Non-cursor method to cycle through the Customer table and get Customer Info for each iRowId.

**

** Revision History:

** ------------------------------------------------------

** Date Name Description Project

** ------------------------------------------------------

** 08/12/03 DVDS Create -----

**

*/

SET NOCOUNT ON

-- declare all variables!

DECLARE @iReturnCode int,

@iNextRowId int,

@iCurrentRowId int,

@iLoopControl int,

@vchCustomerName nvarchar(255),

@vchCustomerNmbr nvarchar(10)

@chProductNumber nchar(30)

-- Initialize variables!

SELECT @iLoopControl = 1

SELECT @iNextRowId = MIN(iRowId)

FROM CustomerTable

-- Make sure the table has data.

IF ISNULL(@iNextRowId,0) = 0

BEGIN

SELECT 'No data in found in table!'

RETURN

END

-- Retrieve the first row

SELECT @iCurrentRowId = iRowId,

@vchCustomerNmbr = vchCustomerNmbr,

@vchCustomerName = vchCustomerName

FROM CustomerTable

WHERE iRowId = @iNextRowId

-- start the main processing loop.

WHILE @iLoopControl = 1

BEGIN

-- This is where you perform your detailed row-by-row

-- processing.

-- Reset looping variables.

SELECT @iNextRowId = NULL

-- get the next iRowId

SELECT @iNextRowId = MIN(iRowId)

FROM CustomerTable

WHERE iRowId > @iCurrentRowId

-- did we get a valid next row id?

IF ISNULL(@iNextRowId,0) = 0

BEGIN

BREAK

END

-- get the next row.

SELECT @iCurrentRowId = iRowId,

@vchCustomerNmbr = vchCustomerNmbr,

@vchCustomerName = vchCustomerName

FROM CustomerTable

WHERE iRowId = @iNextRowId

END

RETURN

There are several things to note about the above procedure.

For performance reasons, you will generally want to use a column like "iRowId" as your basis for looping and row retrieval. It should be an auto-incrementing integer data type, along with being the primary key column with a clustered index.

There may be times in which the column containing the primary key and/or clustered index is not the appropriate choice for looping and row retrieval. For example, the primary key and/or clustered index may have already been built on a column using uniqueindentifier as the data type. In such a case, you can usually add an auto-increment integer data column to the table and build a unique index or constraint on it.

The MIN function is used in conjunction with greater than “>” to retrieve the next available iRowId. You could also use the MAX function in conjunction with less than “<” to achieve the same result:

SELECT @iNextRowId = MAX(iRowId)

FROM CustomerTable

WHERE iRowId < @iCurrentRowId

Be sure to reset your looping variable(s) to NULL before retrieving the next @iNextRowId value. This is critical because the SELECT statement used to get the next iRowId will not set the @iNextRowId variable to NULL when it reaches the end of the table. Instead, it will fail to return any new values and @iNextRowId will keep the last valid, non-NULL, value it received, throwing your procedure into an endless loop. This brings us to the next point, exiting the loop.

When @iNextRowId is NULL, meaning the loop has reached the end of the table, you can use the BREAK command to exit the WHILE loop. There are other ways of exiting from a WHILE loop, but the BREAK command is sufficient for this example.

You will notice that in both procedures I have included the comments listed below in order to illustrate the area in which you would perform your detailed, row-level processing.

-- This is where you perform your detailed row-by-row

-- processing.

Quite obviously, your row level processing will vary greatly, depending upon what you need to accomplish. This variance will have the most profound impact on performance.

For example, suppose you have a more complex task which requires a nested loop. This is equivalent to using nested cursors; the inner cursor, being dependent upon values retrieved from the outer one, is declared, opened, closed and deallocated for every row in the outer cursor. (Please reference the DECLARE CURSOR section in SQL Server Books Online for an example of this.) In such a case, you will achieve much better performance by using the non-cursor looping method because SQL is not burdened by the cursor activity.

Page 3 / 3

Here is an example procedure with a nested loop and no cursors:

if exists (select * from sysobjects where name = N'prcNestedLoopExample')

drop procedure prcNestedLoopExample

go

CREATE PROCEDURE prcNestedLoopExample

AS

/*

** Non-cursor method to cycle through the Customer table ** and get Customer Name for each iCustId. Get all
** products for each iCustid.

**

** Revision History:

** -----------------------------------------------------

** Date Name Description Project

** -----------------------------------------------------

** 08/12/03 DVDS Create -----

**

*/

SET NOCOUNT ON

-- declare all variables!

DECLARE @iReturnCode int,

@iNextCustRowId int,

@iCurrentCustRowId int,

@iCustLoopControl int,

@iNextProdRowId int,

@iCurrentProdRowId int,

@vchCustomerName nvarchar(255),

@chProductNumber nchar(30),

@vchProductName nvarchar(255)

-- Initialize variables!

SELECT @iCustLoopControl = 1

SELECT @iNextCustRowId = MIN(iCustId)

FROM Customer

-- Make sure the table has data.

IF ISNULL(@iNextCustRowId,0) = 0

BEGIN

SELECT 'No data in found in table!'

RETURN

END

-- Retrieve the first row

SELECT @iCurrentCustRowId = iCustId,

@vchCustomerName = vchCustomerName

FROM Customer

WHERE iCustId = @iNextCustRowId

-- Start the main processing loop.

WHILE @iCustLoopControl = 1

BEGIN

-- Begin the nested(inner) loop.

-- Get the first product id for current customer.

SELECT @iNextProdRowId = MIN(iProductId)

FROM CustomerProduct

WHERE iCustId = @iCurrentCustRowId

-- Make sure the product table has data for
-- current customer.

IF ISNULL(@iNextProdRowId,0) = 0

BEGIN

SELECT 'No products found for this customer.'

END

ELSE

BEGIN

-- retrieve the first full product row for
-- current customer.

SELECT @iCurrentProdRowId = iProductId,

@chProductNumber = chProductNumber,

@vchProductName = vchProductName

FROM CustomerProduct

WHERE iProductId = @iNextProdRowId

END

WHILE ISNULL(@iNextProdRowId,0) <> 0

BEGIN

-- Do the inner loop row-level processing here.

-- Reset the product next row id.

SELECT @iNextProdRowId = NULL

-- Get the next Product id for the current customer

SELECT @iNextProdRowId = MIN(iProductId)

FROM CustomerProduct

WHERE iCustId = @iCurrentCustRowId

AND iProductId > @iCurrentProdRowId

-- Get the next full product row for current customer.

SELECT @iCurrentProdRowId = iProductId,

@chProductNumber = chProductNumber,

@vchProductName = vchProductName

FROM CustomerProduct

WHERE iProductId = @iNextProdRowId

END

-- Reset inner loop variables.

SELECT @chProductNumber = NULL

SELECT @vchProductName = NULL

SELECT @iCurrentProdRowId = NULL

-- Reset outer looping variables.

SELECT @iNextCustRowId = NULL

-- Get the next iRowId.

SELECT @iNextCustRowId = MIN(iCustId)

FROM Customer

WHERE iCustId > @iCurrentCustRowId

-- Did we get a valid next row id?

IF ISNULL(@iNextCustRowId,0) = 0

BEGIN

BREAK

END

-- Get the next row.

SELECT @iCurrentCustRowId = iCustId,

@vchCustomerName = vchCustomerName

FROM Customer

WHERE iCustId = @iNextCustRowId

END

RETURN

In the above example we are looping through a customer table and, for each customer id, we are then looping through a customer product table, retrieving all existing product records for that customer. Notice that a different technique is used to exit from the inner loop. Instead of using a BREAK statement, the WHILE loop depends directly on the value of @iNextProdRowId. When it becomes NULL, having no value, the WHILE loop ends.

Conclusion

SQL Cursors are very useful and powerful because they offer a high degree of row-level data manipulation, but this power comes at a price: negative performance. In this article I have demonstrated an alternative that offers much of the cursor’s flexibility, but without the negative impact to performance. I have used this alternative looping method several times in my professional career to the benefit of cutting many hours of processing time on production SQL Servers.

0 comments:

Popular Posts

Recent Posts

Unordered List

Text Widget

Blog Archive