Wednesday, May 05, 2010

Data Explosion

Hey everybody... it's been a while since I've blogged, but that's what starting a new gig will do to you. I've had so many new things to rant about here and I'm sure I'll be getting to those in the future, but right now I want to share a funny story with you.

We had a group of devs who were trying to run a fairly simple insert and it kept filling up the disk. And when I say it was filling up the disk, I mean 50K rows were filling up something like 200GB. So they came to me to see if I could fix it. This is where it gets fun.

It really didn't take me too long in my investigation to find that they had the data file set to autogrow by 131,000%. That's right, I said it! 131,000 percent. So now that I'd found the problem I was able to set it to something more reasonable, shrink the file, and let them get on with their insert.

So it started me thinking about what other DBs in the place had similar issues, because I've been hearing about disk space problems here and there. So I wrote a PowerShell script to go out and check the file growth rates on every SQL box, and I found a lot of the same type of stuff. There were a lot of them set for several thousand percent growth, several of them set for 1MB growth, and everything in between. In general, as a good generic setting to start from, I like to grow my files 1GB at a time. It's a good round number that works in a lot of cases, and then you can snipe the ones that need something else. So I altered the PowerShell script to go out and change the growth rates of all the DB files out there to 1GB. Life is good again and we have the beginnings of a happy shiny environment.
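If you want to eyeball the same thing on a single server without PowerShell, a rough T-SQL version of that check looks something like this. The ALTER DATABASE at the end is the kind of fix I'm talking about; MyDB and MyDB_Data are placeholder names, so swap in your own.

-- Files with percent-based growth or a growth setting of 1MB or less
-- (for non-percent growth, sys.master_files stores growth in 8KB pages)
SELECT DB_NAME(mf.database_id) AS database_name,
       mf.name AS logical_file,
       mf.type_desc,
       CASE WHEN mf.is_percent_growth = 1
            THEN CAST(mf.growth AS varchar(10)) + ' percent'
            ELSE CAST(mf.growth * 8 / 1024 AS varchar(10)) + ' MB'
       END AS growth_setting
FROM sys.master_files AS mf
WHERE mf.is_percent_growth = 1
   OR (mf.growth > 0 AND mf.growth * 8 / 1024 <= 1);

-- Set a file to grow 1GB at a time (placeholder names)
ALTER DATABASE MyDB
MODIFY FILE (NAME = MyDB_Data, FILEGROWTH = 1GB);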

Oh yeah, and in the same discovery I also found 110 DBs set to autoshrink, and I took care of those too.
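If you want to check your own servers for that one, finding the offenders is a quick query and turning it off is one statement per database (MyDB is a placeholder):

SELECT name FROM sys.databases WHERE is_auto_shrink_on = 1;

ALTER DATABASE MyDB SET AUTO_SHRINK OFF;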
Tuesday, March 09, 2010

Deleting Production Data

While I was talking to a group of devs late last week, the topic of deleting prod data came up.  Since I'm the big MVP of the group they wanted to know what the official word from on-high is.  The question is simple, and the answer is even simpler, and frankly it's one of those that was solved so long ago it just doesn't hold any interest for me anymore.  I'll hash it out here again though because I know there are plenty of you out there who haven't solved this problem yet.

The question is, what's the best way to delete gazillions of rows out of a prod table.  The issue is with both concurrency and rollback.  The problem is that if you do a huge delete in a single transaction you don't leave any room for anyone else to access the table during that time.  So your delete can bring a DB to a complete halt if you don't do it right.  As well, you put your system at risk should something go wrong and the transaction has to roll back.  So why would the transaction have to roll back?  Well, there are a couple reasons really.  The first, and most likely, is that it could fill up the log, and with no more log the transaction can't continue.  The other possibility is that the server or the service gets restarted because someone didn't know you were doing a big operation.  And while I'm at it, there is one more logical possibility.  Someone could get impatient with the operation and stop it without realizing that it has to roll back.  This is actually a scenario that I've seen played out many times and the result is always the same.  The system has to sit there rolling back for at least as long as the operation had already been running.  And what's even better is when the DB doesn't come back online right away, that same user will bounce the service again and again.

And of course this counts for updates as well… in SQL2K this was accomplished like this:

SET RowCount 1000   -- limit each delete to 1000 rows at a time

While (1=1)
Begin
    Begin Transaction

    --Delete stmt here--

    if @@RowCount = 0
    Begin
        Commit Transaction
        Break
    End

    Commit Transaction
End

SET RowCount 0      -- turn the batch limit back off when you're done

However, SET ROWCOUNT for deletes is deprecated in SQL2K5 and above, so there you do something more like this:

set nocount on

while 1=1
begin
    delete top (1000)
    from table1
    where conditions…

    if @@rowcount = 0
        break
end
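The same pattern works for the update case I mentioned above.  Here's a rough sketch with made-up table and column names; the important part is that the WHERE clause has to exclude rows you've already updated, or the loop never gets down to zero rows:

set nocount on

while 1=1
begin
    update top (1000) dbo.Table1
    set    StatusCode = 'archived'
    where  StatusCode = 'active'   -- must filter out rows already updated

    if @@rowcount = 0
        break
end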

 

So ok, that’s how you batch delete/update in prod.  Only under very controlled and specific circumstances should you ever do these types of ops in a single batch.

Thursday, November 12, 2009

More Questionable Code

A couple pieces of code came across my desk yesterday and they're definitely the kind of thing you read about… if nowhere else, then here for sure.  This is code a dev sent me that's slated to be deployed next week.

The first one could actually cause a problem so I’ll talk about it first.

I’ll write the pseudo-code because I don’t have the actual code in front of me right now.  It goes like this:

1.  Check if the Whale schema exists.

2.  If it does exist then drop it.

3.  Create the Whale schema.

Ok, so that's exactly what the script does.  Here, I'll go ahead and sketch out the code.  I didn't look it up so it may have a slight error in it, but you get the idea.

IF EXISTS (SELECT name FROM sys.schemas WHERE name = 'Whale')
BEGIN
    DROP SCHEMA Whale;
END;
GO

CREATE SCHEMA Whale AUTHORIZATION dbo;

So what’s wrong with this code you ask?  It’s simple, here’s the logic.

First, if it exists and you drop it, then why just turn around and recreate it in the next step?  What's the point of that?  So if your goal is to create the schema then everything above the 'create' line is useless.  And I know what you're thinking… so what?  What's the big deal if a tiny query gets run during a deployment?  It's not like it's going to drag the system down.  Well, you're right about that, but it could kill the deployment.  If that schema did exist and it actually contained objects, then the drop statement would fail until you put those objects somewhere else.  So since there's no code checking whether any objects exist inside it, you could be dooming the deployment to an unnecessary error.  You could also say that you know the schema doesn't exist so there's nothing to worry about.  That's fine, then take out the check and the drop.  If you know it's not there, then take it out.  It's called having concise code and it's something we lack in this industry these days.  Let me illustrate this with something completely ridiculous that also doesn't really affect performance.

Create table #t1 (col1 int)

truncate table #t1
truncate table #t1
truncate table #t1
truncate table #t1
truncate table #t1

Insert #t1 Select 1

Ok, so look at that code above.  I created a temp table and then truncated it 5 times.  That's not going to affect performance at all because there's no data in it since it was just created and there are no pages.  Then I go ahead and insert a row.  I can't think of anyone who would let this kind of thing go on in an SP, or in deployment code, but we're expected to let useless checks stay in our scripts.
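And if you genuinely can't be sure whether the schema is already out there but you need it to exist, the sane middle ground is to check and only create it when it's missing, with no drop anywhere to blow up the deployment.  Something like this works (the EXEC is there because CREATE SCHEMA has to be the only statement in its batch):

IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = 'Whale')
BEGIN
    EXEC ('CREATE SCHEMA Whale AUTHORIZATION dbo');
END;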

This code was clearly scripted in the wizard, so it's not like the dev went out of his way to do this by hand, but it's the mark of a fairly inexperienced coder to let the wizard create scripts and not inspect what it's giving you.

The other piece of code doesn't really do any damage so much as it's just annoying.  In fact, it's really 2 different scripts.  They're create view scripts and the first one reads something like this:

create view MyView
as
Select MyFavoriteTable.col1,
       MyFavoriteTable.col2,
       MyFavoriteTable.col3,
       MyFavoriteTable.col4,
       MyFavoriteTable.col5,
       MyFavoriteTable.col6,
       MyFavoriteTable.col7,
       MyFavoriteTable.col8,
       MyFavoriteTable.col9,
       MyFavoriteTable.col10
from MyFavoriteTable as MyFavoriteTable

Ok, so even typing this in the blog pisses me off.  And again, it’s just a style thing, but this just drives me crazy.  What drives me crazy the most is that this dev doesn’t understand what aliases are for.  To alias a table with the same name as the table itself defeats the purpose of having the alias in the first place.  Here a much better alias would have been mft or mt or m.  Hell, you could even go as far as MyFT or something like that.  Here, I’ll play one of those out for you and you tell me which one you’d rather read.

Select mft.col1,
       mft.col2,
       mft.col3,
       mft.col4,
       mft.col5,
       mft.col6,
       mft.col7,
       mft.col8,
       mft.col9,
       mft.col10
from MyFavoriteTable as mft

Forget reading, which one of those would you rather type?  It’s just more concise and easier to read.  I happen to know the dev who wrote this and his opinion is that the full table name makes it easier to read when you’ve got larger joins.  Personally, I’ve never known anyone to complain about short aliases before, and again, I honestly think this boils down to inexperience.  When I first started coding well over a decade ago, I used to need things to be presented to me in very specific ways too.  It was the only way I could read the code.  That kind of thing is much less important to me now that I have been doing it for a long time.  Why?  Because I know how to code.

So the second thing about this script that bugs me is the fact that he saw the need to alias a query with a single table in it.  Now you’re just being obstinate for no reason.  I know the argument though.  He probably would say that he did it for consistency.  The only problem with that is the other create view script he submitted didn’t have the same stupid aliasing structure, so where’s the argument now?

Ok, so this code was clearly brought to us as part of a code review so the next logical question is why don’t we just reject the code if it bothers us so badly?  Well, the answer is simple.  Like so many places, our code reviews are merely perfunctory process placeholders.  Our lead dev has made it very clear that he’s going to do whatever he likes no matter what we say.  We’ve rejected code plenty of times for really valid process or performance reasons and he always thumbs his nose up at us and pushes it into prod anyway.  I really don’t understand how he’s gotten such a superior attitude towards the DBAs, but it’s completely unwarranted.  He acts like we’re in this just to piss on his parade and make him look bad, but we’re really only trying to help.  But we don’t even bother anymore.  What’s the point?  He knows more than the rest of us put together so we don’t even bring stuff up anymore.

So that’s my rant for tonight.  Remember, use aliases wisely and be as consistent as you can.  And for everyone’s sake listen to what your DBAs are telling you for a change.  They’re really only trying to help.

Wednesday, August 19, 2009

The Zombie Standard

As I promised yesterday I’m gonna talk about process and coding standards today and I want you to really think about what I’m saying here because it’s what separates the men from the boys.

This topic won't ramble on quite as much as the past 2 days have, though, so don't worry.  Basically what I want to say is that you should definitely be creating coding standards and following them, but don't rest there, and don't allow any of your standards to rest either.  What I mean by that is just because you come up with a standard, say using 'insert into' instead of 'select into' during your ETL loads, don't always blindly follow that standard.  Every process is unique and every process deserves consideration.  You can adversely affect your servers if you do the same thing across the board without ever questioning whether it's the right tool for this particular job.  So where a 'select into' may be the perfect solution for one really huge snapshot load, an 'insert into' may be the right solution for other lesser loads.

One argument I hear a lot when preaching this bit of DB wisdom is that you want to have a solid way of doing things and you want to standardize.  And frankly, that’s just ridiculous.  That argument doesn’t hold up to even the most basic scrutiny.  Ok, maybe ridiculous isn’t the word… let’s go with asinine instead.  So you’re telling me that for this big snapshot load you should use the exact same process as you do for your smaller ones because standardization is the key?  Then why aren’t all of your loads full snapshots?  Shouldn’t you be standardizing on a single load process?  And what about your indexes?  Do you have the exact same fill factor for every single index in your place?  If so you’re not paying attention to your index stats.  And are you using #table instead of @table absolutely everywhere?  Or are there cases where one is better than the other? 

So you see, with just a few examples I've shown you that you don't do EVERYTHING the same way in your DBs.  So there's no reason for you to scream standardization with something as important as a data load.  Take each load on an individual basis and decide what the best course of action is for each one.  And yeah, as the data grows or changes in nature you may end up revisiting your process and updating it to reflect the new nature of your objects.  And that's ok.  So many people think that having to rewrite a process is a huge sin and that you've failed somehow.  It's really not, you know.  Not only are these types of things a learning process, but things also change in ways that require you to change tactics.

I really hope you take what I’m saying here and apply it.  And don’t be one of those idiots who tells me that he runs everything serially because he wrote the process on a single-CPU box and therefore he needs to standardize on that platform instead of taking advantage of a multi-CPU machine.


Tuesday, August 18, 2009

The Performance Dilemma

OK, so yesterday I was talking about how servers can take a long time to finally be brought to their knees with poor processes, and that you want to get to the point where your server is on a strict diet and you control every morsel that goes into it.

Today I want to talk about how that can play into a real performance strategy because unfortunately you have more to consider than just raw performance.  I know, I know, it sounded yesterday like I was advocating that performance was the most important thing.  And it is… but at the same time it isn’t.  So let me explain a little.

Performance is extremely important, and some might even say it's the most important thing on the server, but you also have to consider manageability and everything that comes along with it.

See, quite often the code that squeezes out the absolute best performance on the server isn't as easy to manage as other solutions.  Something like this isn't very concrete without an example, so I'll give you one.

Let’s look at what I call moving schema.  The guys in my shop are sick to death of hearing that term but it’s an important one.  Moving schema is where you drop and recreate objects all the time that should be permanent.  A good example of this is doing ‘select into’ instead of ‘insert into’.  Sure, the ‘select into’ performs better than the ‘insert into’, but it’s also harder to manage from a couple different perspectives.  For starters, even during your load you can’t specify a filegroup for the object so you’re stuck with the default.  Now for performance or growth reasons you may find it better in general if you put it on a different set of disks, but you can’t because ‘select into’ doesn’t allow it.  So from a space management aspect ‘select into’ traps you in a corner.  Also, if you have specific permissions granted on the table you have to keep track of them somewhere and reapply them every day when you recreate the table.  And you also have to remember to update your permission repository every time you change the perms on that table during the course of your other admin functions.  As well, most performance stats are gathered by objectID.  And if you’re recreating the object every day then your objectIDs are invalid tomorrow.  So you either have to keep track of what the objectID is every day so you can track performance and other issues (like fragmentation) from day to day, or you have to store the objectName in your repository as well which leads to an inefficient and larger schema to track production issues.  It also makes the timing of other processes more critical because with ‘insert into’ they’ll be blocked until the current load finishes, but with ‘select into’ they’ll fail because the object doesn’t even exist. 
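To make the filegroup point concrete, here's a quick sketch with made-up table and filegroup names.  The 'select into' gives you no say in where the table lands; the permanent table keeps its filegroup, its permissions, and its objectID from day to day:

-- 'select into' builds the table for you, but always on the default filegroup,
-- and with a brand new objectID every time it runs.
SELECT *
INTO   dbo.DailySnapshot
FROM   dbo.SourceTable;

-- 'insert into' a permanent table: you pick the filegroup once, and the table
-- keeps its objectID and its permissions from load to load.
CREATE TABLE dbo.DailySnapshot_Stable
(
    Col1 int NOT NULL,
    Col2 varchar(50) NULL
) ON ReportingFG;   -- ReportingFG is a placeholder filegroup name

TRUNCATE TABLE dbo.DailySnapshot_Stable;

INSERT INTO dbo.DailySnapshot_Stable (Col1, Col2)
SELECT Col1, Col2
FROM   dbo.SourceTable;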

So ok, I've shown you a few reasons why something as simple as moving schema can be a problem.  And like I said, it's really one of my diatribes mainly because of those issues, and a couple outlying ones.  And the guys at my shop have been hearing this battle cry for about 3yrs now and I'm sure they're quite tired of it.  But when it comes right down to it, it's good DB practice.

There are some instances where the moving schema rule can be broken though.  Let's look at a really large data load for instance.  If for some reason you can't do any kind of incremental load and you have to do a full snapshot every night (those instances are becoming fewer and fewer with Katmai) on a HUGE table (several hundred million rows), then you may be better off doing a 'select into' because you don't incur the full logging and it can decrease your load time significantly.  It of course also helps control how big your log grows.  So this is a decision you have to make at the time, right?

However, I will say that quite often, if you take the size of the log out of the picture, the 'select into' doesn't out-perform the 'insert into' by all that much.  I recently tested the 2 methods against 30mill rows and the 'insert into' only took about 40secs longer.  That's really not enough to worry about in the course of a full ETL load.  And for all the extra benefits you get from having a stable object, that 40secs is more than worth it.  So this would be a case where choosing the absolute best performing process wouldn't be the best way to go.  In the case above where I talked about loading several hundred million rows, the 'insert into' may increase the load time by as much as 30mins, and that's hard to ignore.  So depending on your requirements you may decide that dropping the object every day is worth it.  But in the instance where the difference is only a handful of minutes, you'd typically choose to keep your schema intact and gain the other benefits.
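If you want to run that comparison on your own data, something this simple will get you rough numbers.  The source and target names are made up, and the permanent table has to exist ahead of time for the 'insert into' half:

declare @start datetime

set @start = getdate()
select * into dbo.LoadTest_SelectInto from dbo.BigSourceTable
print 'select into: ' + cast(datediff(ss, @start, getdate()) as varchar(10)) + ' secs'

set @start = getdate()
insert into dbo.LoadTest_InsertInto (Col1, Col2)
select Col1, Col2 from dbo.BigSourceTable
print 'insert into: ' + cast(datediff(ss, @start, getdate()) as varchar(10)) + ' secs'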

So OK, you've got a process and you've decided to drop the schema every day to do your load.  And your process performs as well as it possibly could because of the lack of logging.  The problem is that people tend to think of performance in terms of the single piece they're writing and nothing else.  Performance includes more than just the load.  Performance also includes up-time, number of errors, manageability, etc.  The load is just one part of the process for any given object.  Examples of different parts of an object's process would be the load process, the reporting process, the troubleshooting process, the triage process, the security process, the space management process, etc.  I'm sure there are others, but you guys get the point.  So all of these are part of managing this object, and if they don't all play well then the process for that object doesn't work well.  So be careful about throwing all the other processes away and making them that much more complicated just to appease a single piece.

So I’ve been in several shops and it’s always different everywhere.  Some shops are really big into management and others are really big into performance.  I’ve been in shops where the management of the objects was just a pain because the systems really were so busy that every last CPU cycle was counted and every last I/O had to be justified.  So they were willing to put up with the management headache of that kind of performance cost because they needed it to stay alive.  And again you have to decide what’s best for your place and your processes.  But whatever you decide, don’t rest on that decision.  Keep making it again and again.  I’ll talk about that tomorrow.

I guess that’s all I’ve got to say for now.


Monday, March 31, 2008

The Untunable Database

There are some DBs that just can't be tuned any more than they already are (or aren't). A good example of this is an application that hits a DB and never qualifies any of its queries. They all hit with select * and no where clause. There's really nothing you can do to increase the performance short of just throwing more spindles at it. But that's not really what I'm thinking about right now. What I've got on my mind now is an application and DB that just can't be tuned no matter what you do because the business owners don't see the benefit of making the changes.

I saw that a lot when I first got to my current gig. We had queries doing horrendous things and taking several hours to return and nobody cared. The end users had been running these queries for years and were happy with them. They didn't care that the server was maxed out all the time and that they had to wait 12hrs for a report to return. Now, I don't have to tell you that as a DBA that just drives me insane. Not to mention that it gives me nothing to do. Why am I even here then?

So with that in mind, I had to go a little cowboy on them and just start making minor changes that proved my point. I really can't stress enough that I'm against going cowboy on any DB and I don't care who you are. But there are some instances where it's warranted. You have to get the ball rolling somehow. And how this DB got in such bad shape was definitely their fault, but their current view wasn't. They had just been so used to things working the way they were that they didn't see the need to change. They got their reports more or less when they expected them and even if they had to wait a couple extra hours for them they didn't really mind because they understood the server was busy.

So what I did was just start by indexing a couple huge #tables.  I just picked a couple of the worst SPs and added a couple indexes.  Then I went in and started commenting out cursors and replacing them with simple join queries.  Both of these made a huge difference.  Then I just sat back and waited.  You really don't want to go too far with something like this.  When they started noticing that their 12hr queries were coming back in just a few secs, I had their attention.  I was then able to convince them to let me go even further and start really tearing into some of these SPs.
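Just to give you a flavor of the kind of changes I'm talking about, with made-up object names: an index on the big temp table, and one set-based join doing what the cursor used to crawl through row by row.

-- Inside the SP, right after the temp table is loaded:
CREATE NONCLUSTERED INDEX IX_Work_OrderID ON #Work (OrderID);

-- And instead of a cursor walking #Work one row at a time to update totals:
UPDATE o
SET    o.Total = w.Total
FROM   dbo.Orders AS o
JOIN   #Work AS w
  ON   w.OrderID = o.OrderID;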

And now, for the first time ever, we've got a near-realtime reporting effort in our company. They've come a long way from 'I don't care if it takes 12hrs' to 'I have to have it now'. The problem is they still slip back into their old habits now and then. They currently want to implement an encryption solution that will take around 2mins to return for each report when the solution I suggested returns in about 2secs. And sure, 2mins isn't really going to break the bank, but as those of you who have done a lot of query tuning should know, you have to be hungry for resources. You have to treat every single CPU tick as a drop of water in the desert. If you don't, you'll wake up one day and be in the middle of the same shit you were in before. You have to fight over milliseconds and squeeze every last drop of performance out of every query you can find. That's the only way to run a real tuning effort.

But it's amazing how politics and perception find their way into every aspect of your job.
