Due to Internet Spaceships (aka, Spreadsheets in Space), whenever I hear the “I got 99 problems” line, I always think of a station that we had in a solar system whose name started with “9-9.” The alliance that I flew under had a naming standard of the first 3 characters or so of the station name matched the system name to cut down on confusion and help you confirm that you knew where you were. That station was named “99 problems but <…> ain’t one.” I don’t remember what was in the gap anymore; there’s a decent chance that it wasn’t family friendly anyway.
Anyway, I’m a huge nerd.
This time around for what is becoming Tom’s monthly writing exercise, the topic is to make a list of things that can go wrong with SQL Server that are not disk issues. This should be fairly easy for me, since we have more I/O than God, fortunately, but we’ll see how much of a trainwreck I can turn this into.
- Cowboy Sysadmins. I know this is asking for it. I know that there are plenty of good sysadmins out there who are just as pragmatic as good DBAs are. The problem is that they’re not always the only ones working on a project or active on the support rotation. Things can get changed that shouldn’t be when they shouldn’t be. It happens. How do I know this? Well, see, I used to be a Cowboy Sysadmin. I used to be a pretty strong opponent of ITIL and things like Change Management. Then I got a clue and things were better.
- Cowboy Developers. Self-explanatory.
- Linked Servers. I know Tom listed this one in his original post, but it really is a disaster waiting to happen. Long story short, I wound up implementing an LS over a WAN link, and it’s a miracle that worked. There are a lot of moving parts involved in LSes, and when a cluster is involved, it’s even worse. PLUS, when name resolution isn’t working through other means, hostnames need to be added to LMHOST (LMHOSTS, not HOSTS (!)). That says to me there’s some ancient piece of code in MSDTC in use and that scares the crap out of me.
- Crap code/Design. And by “Crap,” I mean, “Legacy.” Why is old code always the worst? Were the people here before really that dumb, or did they just not know any better? I mean… the storage engine has to do the exact same thing with 9 indexes on a table now that it had to do a decade ago… (no, I’m not kidding).
- Letting me design your DB. I’m better at this than I was two years ago, but you should still have someone who knows what they’re doing look at it before you do anything else with it.
- Reporting out of your OLTP system. This may be OK…but it may be very, very bad. I’ve seen some doozies, but sometimes it isn’t completely avoidable. Limit it as much as you can. You don’t even need to go full-blown data warehouse or data mart for this, either; log-shipped or a backup restored on another instance may get the scary queries off your back.
- Flakey alerting/monitoring system. This doesn’t directly affect your company’s DBs, but if a backup job failed last night and you didn’t get an email alert about it… well, things may not be OK, would they?
Tom was hoping to get nine out of us, but I’m tired, this is due tomorrow, and the President is apparently going to drop a bomb on us in five minutes, so this is what you get.
Tom, this whole thing is kind of fun and it really does give me/us easy fodder to write about, so I appreciate it.