T-SQL Tuesday #128: Learn From Others

Pilots do something that a lot of non-pilots will find fairly weird if not outright horrifying: We read accident (“crash”) reports. Some of us spend a lot of time reading accident reports, actually. Officially “Accident Reports”, these are put out by the US National Transportation Safety Board (NTSB) after investigation into a crash or an “incident.” In addition to aviation-related reports, there are highway and railroad reports, and even hazardous materials incidents.

Reports come in two flavors, a “preliminary” report, and ultimately, a “final” report after the investigation has completed. The final reports includes such items as conclusions and the probable cause of the accident or incident. To make life easier, they also include a Recommendations section, which, well, includes recommendations for how to keep this type of accident from happening in the future. These tend to be regulatory in nature, as they are geared towards the FAA.

The search form for aviation reports is here–https://www.ntsb.gov/_layouts/ntsb.aviation/index.aspx–if you’re, uh, thinking you want to get into this sort of thing.

Why do pilots do this? The rationale is pretty simple: To learn from the mistakes of others. Or, to learn how a bad day was kept from becoming a worse day after something broke.

What Does This Have to Do With SQL Server?

Great question. Besides the fact that I think piloting airplanes and DBA-ing are the same job, just with different scenery,  I wish we had this kind of transparency in the IT world when things went wrong. When a corporation has a big security incident, we’re likely not to hear a lot of details publicly about what went wrong and what was done to mitigate similar attacks in the future. This kind of information could help everyone. This is one of the things that cloud providers do quite a bit better: When something breaks, we get good information on what happened, why, and what’s going to be done about it. Of course, this is done because public cloud providers basically have to–if things went down a lot and we never heard why, that provider probably wouldn’t have a lot of customers for very long.

This brings me to T-SQL Tuesday.

Tell me (us all, obviously) about something recently that broke or went wrong, and what it took to fix it. Of course, the intent here would be for this to be SQL Server-related, but it doesn’t have to be. We can all learn from something going wrong in infrastructure-land, or how there was a loophole in some business process that turned around and bit somebody’s arm. It doesn’t even have to be all that recent–maybe you’ve got a really good story about modem banks catching on fire and that’s how you found out the fire suppression system hadn’t been inspected in years. Just spitballin’ here. If you’ve got an incident whose resolution can help someone else avoid the same problem in the future or improve a policy as a preventative measure, let us hear about it.

The Rules

Here are the rules as set out for the T-SQL Tuesday blog party.

  1. Your post should be published on Tuesday, 14 July, 2020 between midnight and 11:59:59 UTC/GMT/ZULU
  2. Include the T-SQL Tuesday logo in your post
  3. Link back to this invitation (usually done through the logo)
    (this will get syndicated, so link back to the original on airbornegeek.com, please)
  4. Include a comment on the invitation post or a trackback link
  5. Enjoy the chance to be creative and share some knowledge.

“The Tuesday Night Fire Code Violation”

It was July 19, 2005. At least, I’m pretty sure it was.

Based on IndyPASS’s meeting history, that second meeting way down at the bottom (use your keyboard’s End key; that’s what it’s there for) was basically a “here’s what’s new/awesome in SQL Server 2005” presentation. I’ve long since lost most of my email from that time, but that meeting makes sense in the timeline of 2005’s release.

During the dark, dark days of 2005, just about everyone was desperate for an upgrade to SQL 2000. I was, and I hadn’t even been here that long. The fledgling Indianapolis PASS chapter met in a good-sized conference room on the ground floor of a Duke-owned office building off Meridian St (“twelve o’clock on the I-465 dial”) on the north side of town. That night, there were probably half-again as many people in that room as it could comfortably hold. People standing, sitting on the floor, you name it. Tom Pizzato, the speaker, was introduced; he walked up to the podium and the first thing he said was, “Welcome to the Tuesday night fire code violation.” That is still the best one-liner to open a technical presentation I’ve ever seen, and ever since, it has been cemented to SQL Server 2005 itself in my brain.

That was a long time ago–It’ll be eleven years here in a couple months. Eleven years is an appreciable percentage of an eternity in the tech world. As a result, earlier this week, Extended Support for SQL 2005 ended. This means that you, if you are still running it anywhere, will get no help from Microsoft were something to go wrong. Perhaps more importantly, there will be no more security patches made available for it. Don’t expect if something big happens, there will be a replay of what Microsoft did for XP.

This is a pretty big deal. If you have any kind of problem that you can’t fix, and you call Microsoft Support about it, you won’t get any help for your in-place system. You will have to upgrade to a supported version before you’ll be able to get any assistance, and in the middle of a problem bad enough to call PSS probably is not the time you want to be doing a Cowboy Upgrade™ of your production database system.

I understand that there are plenty of industries and even some specific companies that are either forced to, or elect to continue to run out-of-support RDBMSes on their mission-critical systems. I supported SQL 2000 for far longer than I would like to admit, and it was a risky proposition. After I transitioned out of that role, there was a restoration problem (fortunately on a non-production system) that it sure would have been nice to be able to call Microsoft about, but that wasn’t an option.

Don’t put yourself in that situation. There are plenty of points that can be made to convince the powers that be to upgrade. The fact that any new security vulnerability will not be addressed/patched should be a pretty good one for most companies. If you have an in-house network security staff, loop them in on the situation; I bet they will be happy to help you make your case.

One final note: If you are still running 2005 and are looking to upgrade, don’t just hop up to 2008 or 2012–go all the way to 2014 (or, once it goes Gold, 2016). SQL Server 2008 and 2008 R2 are scheduled to go off Extended Support on July 9, 2019. Three years seems like a long way off now, but that’ll sneak up on you…just like April 12, 2016 might have.

Here Be Dragons: New Cell Phone

Cell phones in my family go way back…Pretty much as far back as is possible in the US:

Ameritech Phone

1984 called...

I don’t remember anymore if that was my dad’s first or second phone. I have no idea what the deal with the ammo box is. I do remember that the handset clip would get bolted to that little bracket on the lid & then could leave the box sit in the middle floor of his pickup and the phone would be right there. Yeah. AMPS analog. 3 watt transmitter. Good thing the antenna was on the roof. Those were the early Ameritech days. The phone wouldn’t even get a signal at the house—he had to get up towards Rensselaer (Indiana, look at a map, see the boondocks where I’m from) before he’d be able to use the thing.

‘Course, back then everyone called them “car phones.” It stayed in the car, and at one point was even hardwired to the vehicle so that if he wasn’t in the truck at the time, it would honk the horn if someone rang. That seems almost mind-blowing to me now. Once the whole 3 watt thing started to go away, things got smaller, and they started to come in little nylon bags (“bag phones”) that were more self-contained, and at least had little rubber duck antennas directly attached to them. It was on a BNC connector, so you could plug in the roof antenna if you didn’t like having what was probably still a whole watt radio broadcasting from the general vicinity of your right knee.

Anyway, in the mid-90s, I was assigned a Motorola DPC 650 by my parents when I started to drive (found out that Tammy had a 650 at about the same time I did). That thing was still a beast, and pretty much never left the truck. That phone began my love affair with Motorola flip phones served by our old friend GTE Communications. Through high school I went through a number of StarTAC 7860 and -68s (wore the phone, so broke a lot of antennas and other bad things happened). To this day, I sometimes wish for the simplicity of a StarTAC.

In summer of 2000, while I was in college, I got my own account and number with GTE. That brought the horrid piece of crap that was the T720. I don’t even want to talk about it. Still have the phone though; it’s pretty funny to fire it up and look at it now. It had a one-line display on the outside so you could see who was calling (OOOOO!). I also seem to recall you could install 3rd party applications on it, but I don’t remember much about that. Somewhere in here GTE became Verizon, of course. I can still hear James Earl Jones say, “Welcome to Verizon Wirelesssss…”

Things looked up after that: Motorola V710. This thing is honestly probably my favorite phone of all time. It was solid, its radio was great, and had ridiculously awesome call quality. It had a big display on the inside, a decent-sized one on the outside, and really good battery life. My first one of those met an untimely end when I wound up running over it with my pickup. While it was open. Face down. Did I mention that it was just at the right place that the front right tire of the pickup sat ON the phone for a few hours? The fun thing about that is that the phone still worked (!), and I used it like that for a few weeks before it decided it was done. Yay phone insurance!

The Smartphone Years

Because I was addicted to my job at the time, and liked the idea of getting mail in my pocket, I bought a Samsung i760 in December 2007. That is a Windows Mobile 6.1 device with a slide-out keyboard. I bought that phone because I wanted my phone to be just another Outlook client. I didn’t want to have to fight with some third-party Exchange connector and Blackberries were right out from the get-go. I have a giant rant about RIM and how I cannot comprehend how so many companies rely on their system for mail, but that’s a different story.

This was a big departure for me for a couple reasons: It was the first non-Motorola flip phone that I had ever owned, as was it my first smartphone. I was pretty worried about this at the time. As it turned out, though, everything was great. OK, except for the call quality. This was definitely a smartphone first. The radio wasn’t all that good, either. However, within about 10 minutes of having it home, I was scrolling (with its stylus, hahahaha) through my Exchange mailbox, which I thought was the coolest thing since sliced bread at the time. You know what else I liked about the phone? Windows Mobile 6.1. There, I said it.

Three years later, I jump off a cliff…

After ten years with the same Verizon account, phone number, and I’m quite certain, voicemail message… I switched to the company that carries Ma Bell’s bastardized name and a completely unproven phone OS on a device that doesn’t have a hardware keyboard. This could go terribly.

Yes, I bought a Samsung Focus with Windows Phone 7. Tammy and I both got one (ATT BOGO Black Friday deal). This was brought on by our house (The Osburn Hideaway) being in a bizzaro black hole where there is no Verizon coverage, Tammy and I wanting to combine to a family plan, and, well, ATT having WP7. Because at the end of the day, I’m just that big of a fanboi.

Everyone on Twitter knows that BrentO just loves Windows Phone 7. I do agree with just about everything that he says about it. We’re taking huge gambles that someone at Microsoft has this dev team’s throttle on the floor and all of the shortcomings the device has will be taken care of soon (I’m not going to talk about anything specific, because I have the same grumps that everyone else has, and all of those people are better writers than I am). The rumored sales numbers don’t look all that good so far, but I don’t know that anyone is all that surprised by that. If, after the first major update comes out and possibly another round of hardware, those numbers don’t start to go up…then I’m going to be pretty worried. I don’t expect iPhone-like numbers, as that device changed the freaking world, and it’s the likely-not-to-be-beaten incumbent.

So, that’s my cell phone story. I don’t expect to write about this much more, because like I said above, I’m really not smart enough to come up with anything new here on my own, so you’ll be able to see what I’m going through by everyone else writing about the OS and poo-pooing on its poor adoption rate. I mean, unless something really bad happens and I bail early.

…and if this doesn’t work out and that happens, I’ll… <deep breath>…probably get an iPhone.

Maintenance Plans: They don’t like it when DBAs leave

Discovered some…interesting…information about Maintenance Plans a few weeks ago when we had a DBA leave. I don’t think it warrants being called a “bug”, but it is behavior that I don’t think is ideal, and it definitely can cause issues for us and others. The good news here is that I did get a workaround nailed down (thanks to the help of Random Internet Guy), although we haven’t decided whether or not we’re going to use it.

How this all started

Recently, we had one of our DBAs leave. (Incidentally, this is the first time ever that I’ve had this happen to me and also the first time one has left this company.) After he left at the end of his last day, I pulled all of his access to the Production systems. We knew that any Agent jobs that he happened to own would start failing, and sure enough, there were a few failures that evening and over the course of the weekend.

One of these failures turned into a little more of an involved situation: The Maintenance Plan on the only production SQL 2008 box he set up failed. With us being used to SQL 2000, this was expected to be an easy fix and life would go on. Except, it wasn’t.

Toto, I’ve a feeling this isn’t SQL 2000 any more

First attempt to fix this was to change the owner of the SQL Agent Job to ‘sa’. He’s our standard job owner for consistency and because that makes the job impervious to people leaving. The job still failed, although the error was different this time. Obviously the Agent Job owner had some role in all of this, but it wasn’t the whole story, as things weren’t fixed.

After changing the Job owner, this is the error that occurred:

Executed as user: <SQL Service acct>, [blah blah blah]  Started:  11:25:38 AM  Error: 2010-10-06 11:25:39.76     Code: 0xC002F210     Source: <not sure if this GUID is secure or not> Execute SQL Task     Description: Executing the query “DECLARE @Guid UNIQUEIDENTIFIER      EXECUTE msdb..sp…” failed with the following error: “The EXECUTE permission was denied on the object ‘sp_maintplan_open_logentry’, database ‘msdb’, schema ‘dbo’.“. Possible failure reasons: Problems with the query, “ResultSet” property not set correctly, parameters not set correctly, or connection not established correctly.  End Error  Warning: 2010-10-06 11:25:39.79     Code: 0x80019002     Source: OnPreExecute      Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED.  [blah blah blah] Error: 2010-10-06 11:25:40.58     Code: 0xC0024104     Source: Check Database Integrity Task      Description: The Execute method on the task returned error code 0x80131501 (An exception occurred while executing a Transact-SQL statement or batch.). The Execute method must succeed, and indicate the result using an “out” parameter.  End Error  Error: 2010-10-06 11:25:40.64     Code: 0xC0024104     Source: <another GUID>      Description: The Execute method on the task returned error code 0x80131501 (An exception occurred while executing a Transact-SQL statement or batch.). The Execute method must succeed, and indicate the result using an “out” parameter.  End Error  Warning: 2010-10-06 11:25:40.64     Code: 0x80019002     Source: OnPostExecute      Description: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. [blah blah blah] End Warning  DTExec: The package execution returned DTSER_FAILURE (1).  Started:  11:25:38 AM  Finished: 11:25:40 AM  Elapsed:  2.64 seconds.  The package execution failed.  The step failed.

A Permissions error in MSDB? Huh?

I dug a little, and the short of it is that the Maintenance Plan’s connection string was still set to the departed Admin’s sysadmin account (a SQL login). All of the work that the MP was trying to do, it was still attempting as the other DBA. This, of course, isn’t going to get very far.

Maintenance Plans have connection strings?

This seems like a dumb question to ask, but it’s one of those things that I had glossed over in my head and not actually thought about. The answer: Sure they do! MPs are basically SSIS packages these days, and of course those have connection strings/settings. MPs are no different. Seems like a “DURRR” now, but hindsight, yadda, yadda.

This connection string is controlled by the “Manage Connections” button in Maintenance Plan toolbar. If you never mess with this when you create an MP, it uses an auto-generated connection to the Local Server, using whatever name you used to connect to it (there’s another topic right there, but someday I’ll talk about some quirks with Aliases I’ve run into, and I’ll address that then). This auto-gen’d connection also uses the same auth credentials you are currently using. If you only use Windows Auth, you are stuck with that; if in Mixed Mode, a SQL auth account is an option, and you can set it to another account. Of course, you would need to know that account’s password.

To fix our MP’s connection string, I could change it to my own credentials, but that of course doesn’t actually fix anything, and will leave an even bigger mess for whoever’s left if/when I leave or responsibilities change. I can’t set it to Windows Auth, because when you select that, it validates the auth right then and there. Since we don’t have Windows Auth SA accounts, that won’t work for us: the AD account that I’m logged on with doesn’t have enough rights. Something else needed to be done.

At this point I realized that this is a fair-sized problem and we can’t be the first shop to run into this; I mean, SQL 2005/2008 have been out for a while, and I’m sure lots of DBAs have left shops that heavily utilize MPs. Time to ask the Senior DBA.

Going nowhere fast

For as common as I expected this to be, I wasn’t finding much helpful information. There were some references to this error message, and even someone in the same situation I was in, but most of what I found were dead-end threads.

At one point, I thought for sure that @SQLSarg had it. This blog post contains the revelation that MPs have Owners! His main point of changing the MP owner is to change the owner that the associated Agent Job(s) are set to when you save the MP. Even though I had already changed the job owner to a valid account, I hoped that the MP’s owner had something to do with the credentials that were used in the Plan’s connection string.

Of course, it didn’t. No behavior change.

Aside: Even though it didn’t help with what I was trying to fix at the time, I believe that changing an MP’s owner to a shared or central service account is a good idea. Every time an MP gets modified, the Agent Job’s owner is changed to the same account as the Plan’s Owner. This might not even be you, and when that person leaves the company, you are guaranteed to have execution problems. There’s a Connect item open asking to make it easier to change the owner, so if you like Maintenance Plans and don’t hate yourself, it’s probably a good idea to log in over there and mash the green arrowhead.

The more I dug, the more dead-end threads I read, the more almost-but-not-quite blog posts I ran across, the more I became reserved to one simple fact:

It is impossible to set the Connection String Credentials in a Maintenance Plan to an account that you don’t know the password for.

As that started to sink in, I asked #SQLHelp if that were, in fact, true. I got a couple responses that yes, that was the case (I would be specific, but this was a number of weeks ago, and since Twitter sorta sucks for finding things like that and I can’t remember who it was, we’re out of luck). Although not the news I was looking for, I was glad to know that the deal was pretty much sealed—we were going to need to figure out something else to do.

One last gasp

Since I don’t feel like I’m doing everything I can while troubleshooting a problem if I don’t grab for at least one straw, I decided to investigate something that was described in one of the dead-end threads that I had found while searching.

This thread on eggheadcafe is a conversation between “howardwnospamnospa”* and Aaron Bertrand, who we all know and love. This one “howardw” was in a pretty similar boat to what I was in, and was really close to getting it to work. He was only thwarted because of an apparent quirk when an Instance is in Windows Auth-only mode.

In the last post in the thread, he reports that when he assigns the Local System account (“NT AUTHORITY\SYSTEM”) as owner of the MP, and sets the connection properties of the plan to use Windows Auth, the plan runs. To attempt to duplicate this, I started with an existing MP on a Dev instance that I had set up (therefore my SQL auth account was the MP owner and the user listed in the Connection Properties) and proceeded with the following steps:

  1. Using the statements in SQLSarg’s post, I changed the owner of the MP to Local System
  2. I added my Domain account to the Instance with SA rights
  3. I opened the MP in question with SSMS, switched its connection to use Windows Auth (while logged in with the Domain account used in Step 2), and saved it.
  4. Checked the MP’s auto-gen’d Agent Job to confirm that its owner was NT AUTHORITY\SYSTEM.
  5. Using a connection with my Domain account, I pulled my SQL account’s SA rights. This would previously make the MP sub-plan fail.

I ran the job and, holy cow, it worked! 😀

That’s good, right?

Ehhhh, maybe. Setting the owner to Local System seems pretty dirty. I haven’t thought of a way that this would be a security problem, but it still feels like a stupid hoop to jump through. For us, it has the extra hoop of us having to add our Domain accounts to the instances every time we want to set up or change connection info on an MP (once the Windows Auth is set, it doesn’t re-validate it when you save the plan, so you only have to do it that one time), then take it out when we’re done.

Those bits aside, this workaround (if you will let me get by without calling this a “hack”) does allow setting up Maintenance Plans that will live through the DBAs that set them up leaving the company. By technicality, I achieved my goal.

We’ve talked about this setup very briefly at the office, but haven’t made a final decision on whether or not to go down this road or to abandon MPs for DB maintenance altogether. Hopefully we will get that decision made before someone else leaves.

But at what cost?

The main cost here is a lot of destruction to my affinity for Maintenance Plans.

Using MPs in general, love them or hate them, is pretty much a religious debate. Usually I fall on the proponent side of those arguments. I think they’re easy to set up, they’re pretty flexible, they can dynamically pick up new DBs that get added to the system (not always an advantage), you can cook up some really crazy dependencies within them, and I like pictures & flowchart-y things :-).

After this whole ordeal? Pretty sure I’ve changed my mind. This whole default behavior of how they will stop working when the account that set them up initially is no longer an SA is way too much of a price to pay. What happens when you’re a single DBA shop and you get honked and bail one day? Near as I can tell, if your account gets restricted as it should, backups will stop working, but there won’t be anyone left to notice!

In lieu of MPs, there’s at least one good option out there that has most all of the desired functionality. While this was going on, @TheSQLGuru posted a link to this set of SPs. I haven’t had time yet to really dive into these, but on the surface, they look pretty good.

Final Comment(s)

This entire thing bugs me. I’m almost certain I’m missing something. This should be a huge problem for lots of people. I feel like I’m missing a detail or it’s because we use SQL logins for the most part or some other thing. But… in all of the testing that I’ve done, bad things happen when the initial creating user gets SA pulled. Period. I can’t get past the feeling that I’m doing something wrong, but I haven’t been able to figure it out yet.

In the meantime… I think Maintenance Plans are evil after all…

* Apparently not to be confused with howardwSOLODRAKBANSOLODRAKBANSO[LODRA] (Eve-Online thing; it’s honestly better if you don’t ask)

Restart! [<BOOM>]

I agonized for a while with what to do for the first sentence for my attempt to re-enter the blogging community and finally deciding to wimp out and write about the opening sentence instead. Goal #1: Completed!

As I said, this is technically the second blog that I’ve run. The first one I started a number of years ago, with no firm goal in mind—I was just going to mumble about whatever was going on or whatever I felt like talking about. I didn’t do too bad getting things written, but since it was somewhat lacking a purpose in life, I got a little bored with it. This time around, I’m going to focus on writing technical content, as it should help me to learn more about the things that I do every day. Of course, there’s also the part about how all of the cool kids in #sqlhelp are doing it.

I had been kicking around the thought of doing this for a while, and actually it was Mike Reigler’s first post about SQL Cruise that finally pushed me over the edge. Reading his awesome post about the cruise and how he decided it was worthwhile to go for whatever reason finally made me decide that I wanted to do it, too. I’m in no way, shape, or form a good writer, so I don’t know how well this will go over, but I’m going to subject you guys to it, anyway (at least until you decide this is lame).

My goal here at the beginning is to get a post up once a week. It’s not very much, but since I don’t do a lot of crazy stuff at work, I will be doing good if I can pull that off. Who knows, maybe I’ll start to get a bunch of cool stuff going on and I’ll be able to pound stuff out all the time. I hope that I can keep it technical and useful enough to live up to the high standards of SQL bloggers 🙂

The <BOOM> Thing…

This is going to be one of those long crazy stories that I have that will only make sense to me and nobody will read all of. It’s OK, I understand.

The Boom… I used to eat, breathe, and sleep aerospace-related stuff. Until I was in middle school, I wanted to be an Astronaut. This means that I would watch every space- and flying-related show that PBS would put on (<insert shout out to WTTW here>).

Somewhere along the line there was an episode of something that covered the X-15. One of the things included was the time that one of the test airframes blew up on Scott Crossfield during a static engine test. The way I remember it going down, was the engine flamed out, someone said “restart” over the radio, and immediately after that, it exploded. For as long as I can remember, whenever someone says “restart”, my brain fills in an explosion right afterwards. Sometimes this is funny, other times, not so much.

That’s where the <BOOM> is from.

While thinking about putting this story down on paper, I realized that due to the awesomeness of the Internet, there might be some tape of this incident on YouTube. Guess what? There is! Awesome right? Run it up to the one minute mark for the juicy bit:

(link if embed fails: http://www.youtube.com/watch?v=WXpEPZ6ZZIs)

Turns out, it’s a little bit not-so-awesome, because it appears to have been a planned shutdown of the motor, AND, as you can see/hear, the voice on the radio didn’t say restart, they said reset. Obviously this destroys like 20 years of nostalgia, and now I’m lost in life because my “restart <boom>” thing is all wrong.

The Internet giveth, the Internet taketh away (apologies to God).