Using a While Loop to Segment a Batch Process

When running very large queries, particularly updates or deletions, it is helpfull to give feedback on the status of the process. One way of doing this is to run the query within a while loop and set ROWCOUNT to control the number of rows processed on each pass.

The following shows the basic structure of such a loop...

DECLARE @MaxRows  INT,
        @DoneRows INT,
        @SybError INT

SELECT
@MaxRows  = 500,
       @DoneRows = 0,
       @SybError = 0

WHILE 1=1
   BEGIN
      SET ROWCOUNT @MaxRows

      UPDATE ThisTable
         SET ThisColumn = ThatColumn
        FROM AnotherTable

      SELECT @DoneRows = @@rowcount,
             @SybError = @@error

      IF @
SybError != 0
         {Local error handling }

      PRINT "Processed %1! rows", @DoneRows

      IF @DoneRows < @MaxRows
         BREAK

   END

So, What's Happening Here?

We start by setting up the variables we need. The first variable, @MaxRows , will set the number of rows to process at a pass; the second, @DoneRows , records how many rows were actually processed and the third,@SybError, we use to capture any error generated by the query.

Then we start an endless loop (WHILE 1 = 1). We could use a variable to control the loop but this way we don't need to worry about start conditions - the loop will always run at least once, which is what we want.

Within the loop, we set Sybase's internal ROWCOUNT to the value we want. We could do this with a constant but, in real life, we'd probably set @MaxRows as a parameter to the procedure so that we could tune the value as we build up an idea of how the process performs.

Next, we run the query and capture the internal result, i.e. the number of rows actually processed and any error generated. Immediately after this, we handle any errors. The simplest thing to do would be to report the error and issue a 'RETURN' command to exit from the procedure but you could get very creative if you wished, here. For example, it would be wise to wrap the query and error handling in a transaction. Then, if the query generates an error, the error handling could roll it back. If the query completed safely, it could then be committed.

Assuming no error occurred, we report the number of rows we processed to the console. Finally, we check to see if we processed fewer rows than the maximum we set. If we did, then we've exhausted the query so we leave the loop via the BREAK command, otherwise, we go around the loop again.

Advantages and Disadvantages.

This form of processing is extremely usefull when running very large batches, as the reporting provides support personnel with a clear view of how the batch is performing. If you provide time information in the output lines, you can see where there are bottlenecks and concentrate your tuning efforts on them.

It's not appropriate for jobs that need to run silently, whatever the cost. One example of this is a trigger. Because the trigger may be fired by any process, you can't safely write to the console. Doing so may interfere with whatever output is already being displayed.



Find out more by searching Google here...


Google