Tuesday, March 09, 2010

Deleting Production Data

While talking to a group of devs late last week the topic of deleting prod data came up.  Since I’m the big MVP of the group they wanted to know what the official word from on-high is.  The question is simple, and the answer is even simpler and frankly it’s one of those that was solved so long ago it just doesn’t hold any interest for me anymore.  I’ll hash it out here again though because I know there are plenty of you out there who haven’t solved this problem yet. 

The question is, what’s the best way to delete gazillions of rows out of a prod table.  The issue is with both concurrency  and rollback.  The problem is that if you do a huge delete in a single transaction you don’t leave any room for anyone else to access the table during that time.  So your delete can bring a DB to a complete halt if you don’t do it right.  As well, you put your system at risk should something go wrong and the transaction has to rollback.  So why would the transaction have to rollback?  Well, there are a couple reasons really.  The first, and most likely, is that it could fill up the log and with no more log the transaction can’t continue.  The other possibility is that the server or the service gets restarted because someone didn’t know you were doing a big operation.  And while I’m at it, there is one more logical possibility.  Someone could get impatient with the operation and stop it without realizing that it has to rollback.  This is actually a scenario that I’ve seen played out many times and the result is always the same.  The system has to wait for at least as long as it was going.  And what’s even better is when the DB doesn’t come back online right away, that same user will bounce the service again and again. 

And of course this counts for updates as well… in SQL2K this was accomplished like this:

SET RowCount 1000

While (1=1)

Begin

Begin Transaction

--Delete stmt here--

if @@RowCount = 0

Begin

   Commit Transaction

   Break

End

    Commit Transaction

End

However, this doesn’t work in SQL2K5 and above where you have to do something more like this:

set nocount on

while 1=1

begin

delete top(1000)

from table1

where conditions…

if @@rowcount = 0

end

 

So ok, that’s how you batch delete/update in prod.  Only under very controlled and specific circumstances should you ever do these types of ops in a single batch.

0 comments:

About Me

My Photo
Sean McCown
I am a Contributing Editor for InfoWorld Magazine, and a frequent contributor to SQLServerCentral.com as well as SSWUG.org. I live with my wife and 3 kids, and have practiced and taught Kenpo for 22yrs now.
View my complete profile

Labels

Blogumulus by Roy Tanck and Amanda Fazani

Page Views