“The Tuesday Night Fire Code Violation”

It was July 19, 2005. At least, I’m pretty sure it was.

Based on IndyPASS’s meeting history, that second meeting way down at the bottom (use your keyboard’s End key; that’s what it’s there for) was basically a “here’s what’s new/awesome in SQL Server 2005” presentation. I’ve long since lost most of my email from that time, but that meeting makes sense in the timeline of 2005’s release.

During the dark, dark days of 2005, just about everyone was desperate for an upgrade to SQL 2000. I was, and I hadn’t even been here that long. The fledgling Indianapolis PASS chapter met in a good-sized conference room on the ground floor of a Duke-owned office building off Meridian St (“twelve o’clock on the I-465 dial”) on the north side of town. That night, there were probably half-again as many people in that room as it could comfortably hold. People standing, sitting on the floor, you name it. Tom Pizzato, the speaker, was introduced; he walked up to the podium and the first thing he said was, “Welcome to the Tuesday night fire code violation.” That is still the best one-liner to open a technical presentation I’ve ever seen, and ever since, it has been cemented to SQL Server 2005 itself in my brain.

That was a long time ago–It’ll be eleven years here in a couple months. Eleven years is an appreciable percentage of an eternity in the tech world. As a result, earlier this week, Extended Support for SQL 2005 ended. This means that you, if you are still running it anywhere, will get no help from Microsoft were something to go wrong. Perhaps more importantly, there will be no more security patches made available for it. Don’t expect if something big happens, there will be a replay of what Microsoft did for XP.

This is a pretty big deal. If you have any kind of problem that you can’t fix, and you call Microsoft Support about it, you won’t get any help for your in-place system. You will have to upgrade to a supported version before you’ll be able to get any assistance, and in the middle of a problem bad enough to call PSS probably is not the time you want to be doing a Cowboy Upgrade™ of your production database system.

I understand that there are plenty of industries and even some specific companies that are either forced to, or elect to continue to run out-of-support RDBMSes on their mission-critical systems. I supported SQL 2000 for far longer than I would like to admit, and it was a risky proposition. After I transitioned out of that role, there was a restoration problem (fortunately on a non-production system) that it sure would have been nice to be able to call Microsoft about, but that wasn’t an option.

Don’t put yourself in that situation. There are plenty of points that can be made to convince the powers that be to upgrade. The fact that any new security vulnerability will not be addressed/patched should be a pretty good one for most companies. If you have an in-house network security staff, loop them in on the situation; I bet they will be happy to help you make your case.

One final note: If you are still running 2005 and are looking to upgrade, don’t just hop up to 2008 or 2012–go all the way to 2014 (or, once it goes Gold, 2016). SQL Server 2008 and 2008 R2 are scheduled to go off Extended Support on July 9, 2019. Three years seems like a long way off now, but that’ll sneak up on you…just like April 12, 2016 might have.

High Availability in SQL Server Standard Edition (or Semi-Lack Thereof)

SQ Server 2012 brought about some major changes to the various High Availability schemes supported by the product. The most major of these is the introduction of AlwaysOn Availability Groups. As described early in that MSDN article, these can be over-simplified summed up as “enterprise-level database mirroring.” This is not quite the same thing as the existing Failover Clustering (which is still available), although AGs do require and run on a cluster.

From a Business Intelligence perspective, it’s a somewhat different situation: Analysis Services is cluster-aware, so it can be used in a Failover Clustering situation. SSRS has scale-out capabilities, which, if architected with it in mind, can provide some form of redundancy. SSIS has nothing built-in for high availability, which one could expect for an ETL solution (I could go on for a while about why HA ETL is dicey, but that’s not what we’re here for). AlwaysOn AGs don’t exist for any of these products, possibly because what the feature is/does doesn’t make sense for anything except, I would argue, SSAS. I’m mostly not here today to talk about BI HA, but I will come back to it briefly.

2012 ChangeS, Plural

With the introduction of AGs as “beefy mirroring”, it didn’t make sense to continue to support multiple, awfully similar, features. The result is Database Mirroring, introduced in SQL Server 2005, is deprecated as of SQL Server 2012. It’s not in the “Next Version” list, since this is the first time it has appeared, so there are at least two major version releases before it will go away entirely. (With SQL 2014 announced last week at TechEd North America, stay tuned for its documentation release to see if Mirroring has moved closer to death.)

The point is, it will be going away. What to do? Logic would suggest the intended migration path for DB Mirroring users would be to move to AlwaysOn AGs. Sounds like a good enough idea. I mean, since as mentioned, Microsoft themselves describes it as enterprise-grade mirroring, Standard does do two-node clustering, so let’s do that!

When They Said “Enterprise”, They Really Meant It

There is a potential problem with that logic. Specifically if one has been using (or would like to start using) the synchronous-only flavor of DB Mirroring available in the Standard edition of SQL Server, the available options have gotten realllly thin. See, AlwaysOn AGs aren’t available in the Standard Edition of SQL Server; at least not in 2012. This means if a company is running a few mission-critical DBs in a mirroring setup with Standard edition all-around, that setup’s upgrade path is very limited: in order to keep it, they wouldn’t be able to upgrade past whatever future version is the last one that includes Mirroring. For any other company who would like to deploy such a setup in the future, there will be a point in time when they won’t be able to—the feature won’t exist in their desired Edition of SQL Server.

Unless, of course, they want to upgrade to Enterprise. That’s…well…expensive. It always has been, but for most modern hardware, it’s a bigger jump from Standard to Enterprise than it used to be. There are plenty of other reasons worth spending the extra money to upgrade to Enterprise, but just because a system or DB is nosebleed-mission-critical doesn’t mean it’s huge, requiring table partitioning or something to run well. Especially at a small-to-midsize company, HA might be the only Enterprise Edition features needed. Is it worth the money? Wouldn’t it be nice if things stayed closer to how they are now?

What Should it Look Like?

This is the whole point of why I’m here: What do I want the HA situation to look like in Standard Edition?

I do not believe that High Availability options not named “Log Shipping” should be Enterprise-only. At least not entirely. I’m not saying Microsoft should make all four secondaries (eight in 2014) available in Standard. Nor am I 100% convinced that they should be readable in Standard like they are in Enterprise. I think that a single secondary, living on a second instance on the other node of that 2-node cluster allowed in Standard, usable for failover purposes only, would do the trick.

This starts to look similar to the mirroring setup currently available in Standard, and that’s exactly what I’m trying to do. I don’t think we should get everything without having to pay for it—ie, all of the nice fancy stuff in Standard. There are features that 100% should be only available in Enterprise. Full-on readable secondaries, with SSRS reports or SSIS load jobs pointed at them, is one of those things that should require a fatter check to MSFT.

Semi-Related BI Commentary

Since I’m filling out the SQL Server section of my Christmas List, I was going to say it would be nice to have AlwaysOn AGs for SSAS, too. After thinking about that for 15 more seconds, I realized that was dumb, since, due to the nature of SSAS, it would be pretty pointless—we would get the same thing out of some kind of scale-out architecture.

Such an architecture already exists, but I think it is terribly kludgey and almost has to be fragile in practice. So, why not make a “real” scale-out system based on the AG architecture? SSAS is cluster-aware already; just need some kind of thing to automate copying of the freshly-processed data from the Primary (“Data Processing Server” in that article) to the Secondaries (“Data Access Servers”). Add some awareness of this process to the existing AG listener process/service, and boom! I’ve never had to deal with quite that big of an SSAS environment, so this might be a terrible idea, but it sounds good in my head!

Except…I would expect this to be Enterprise Edition-only functionally. Sooo…nevermind.

T-SQL Tuesday #24: Prox ‘n’ Funx

Procedures and Functions. Well, this could be interesting. Everyone else’s posts, that is.

T-SQL Tuesday

#24, brought to us by Brad Schulz

OK, T-SQL Tuesday Twenty-Four: Two-year anniversary edition! Pretty sweet of Adam Machanic (blog | @AdamMachanic) to come up with this whole thing two years ago. A good guy, he is. This month’s topic is Prox ‘n’ Funx, brought to all of us by Brad Schulz (blog). I kind of feel bad here—I don’t really know much about Brad, but he’s an MVP, and flipping around his blog brings up some pretty cool posts. I really like this one, for example.

I actually have what I think is a decent little thing to talk about for this topic. It could have been a stretch topic for T-SQL Tuesday #11 about misconceptions, even if [I hope] not exactly a widespread one. This topic revolves around transaction control and error handling within stored procedures, which can be a crazy topic in and of itself. Books could be written on the topic. Specifically, the conversation that I found myself involved in one day was about what happens to any open explicit transactions when a procedure runs into an error.

Let’s Talk About Procs Dying

Once upon a time, someone said to me that if an error occurs within a procedure which contains an explicit transaction, that transaction will remain open, potentially blocking other sessions on the system. In short, that’s not true, and it’s fairly easy to prove. We’ll work through a quick little script to do this, and will include modern TRY…CATCH error handling, because that will come in when we get to the main point (I say “modern” there like it’s a new thing, but coming from a guy who has worked with SQL 2000 more than anything else, that distinction feels necessary). It actually doesn’t matter if there is any error handling during the first part of this exercise, as the results will be the same. This is a pretty contrived setup, but that’s pretty much what I’m going for.

First, create a table & put a couple of rows into it. The two columns we’re most worried about are “FakeNumberColumn” and “ActualNumberColumn” (note their data types), so named because the error which will be triggered in a little bit will be a type conversion error.
-- Create Table
CREATE TABLE dbo.TransactionTest2008
      
(     TransactionTestID       INT                     NOT NULL    IDENTITY(1,1),
            
RowDescription          VARCHAR(50)             NOT NULL,
            
FakeNumberColumn        VARCHAR(10)             NOT NULL,
            
ActualNumberColumn      INT                     NULL
      )
;

-- Populate it with a couple of rows
INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
      
SELECT 'Actually a Number 1', 10
            
UNION ALL
      
SELECT 'Actually a Number 2', 100
;

Now for the really contrived part: a Stored Procedure that will throw an error if FakeNumberColumn contains something that won’t implicitly convert to a numeric:

CREATE PROCEDURE TransactionTester2008
AS

BEGIN try
  
  
BEGIN TRAN
     UPDATE
dbo.TransactionTest2008
          
SET ActualNumberColumn = FakeNumberColumn * 2
     
--   select *
-- from dbo.TransactionTest2008

     -- Wait for 10 seconds so we have a chance to look at DBCC OPENTRAN (unless of course it dies first)
     WAITFOR delay '00:00:10'

   COMMIT TRAN
END
try

BEGIN catch
-- Some kind of error has occured
  
PRINT 'Welcome to Catchville'

   IF @@TRANCOUNT > 0
      
ROLLBACK

   -- Raise an error with the details of the exception
  
DECLARE @ErrMsg NVARCHAR(4000), @ErrSeverity INT
   SELECT
@ErrMsg = ERROR_MESSAGE(),
      
@ErrSeverity = ERROR_SEVERITY()

   RAISERROR(@ErrMsg, @ErrSeverity, 1)
END CATCH


(The commented-out SELECT statement can be un-commented if you like, to see the state of the table at that point of execution.)

As it is now, the proc will run successfully. The 10 second WAITFOR in it gives you time to run DBCC OPENTRAN in another query window to see the proc’s open transaction.

EXEC TransactionTester2008

Open TransactionNow we’ll make things somewhat interesting. Insert another row into our table to put a letter into FakeNumberColumn, then run the proc again.

INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
     
SELECT 'Not really a Number', 'F'

EXEC TransactionTester2008

Things won’t go so well this time…

Bombed Proc RunWe get the PRINT message about being in Catchville, so we know that our exception was caught and execution finished in the CATCH block. At this point, go run DBCC OPENTRAN again, and you will see that there isn’t a transaction open. This would be the expected behavior. No transactions are left open; the in-process activities are rolled back.

I should also note that a less-severe error, such as a constraint violation on an INSERT, will only cause an error in that particular statement. The Engine will skip over that statement & continue processing normally. That behavior has led to some near-brown pants moments while running a huge pile of INSERTs, putting in some provided business data, but that’s what explicit transactions are for!

Now, About Timeouts…

OK, that section wound up pretty long. Here’s where I’m actually getting to what I want to talk about…

We’ve established that errors in Stored Procedures will not lead to transactions being left open under normal circumstances. There is a situation where things don’t go so well: when a client/application connection times out for one reason or another. If this happens, the client side will close its end of the connection, and after the in-progress query SQL Server is running completes, nothing else really happens. This can leave open transactions, which, of course, are bad, bad, bad.

Starting with where we left off above, we can simulate an application timeout by cancelling the running SP in Management Studio.

First, delete the error-producing row from dbo.TransactionTest2008:

DELETE FROM dbo.TransactionTest2008
  
WHERE RowDescription = 'Not really a Number'

Execute TransactionTester2008 again, and this time, while in the 10-second WAITFOR, cancel the query in Management Studio. Even with the TRY…CATCH block in place, the explicit transaction is left open (check with DBCC OPENTRAN). What this means is that whatever application (or DBA!) running the statement(s) is responsible for closing an open transaction if a session times out or is cancelled. In my experience, if one is in the habit of wrapping everything you do in explicit BEGIN/COMMIT/ROLLBACK TRAN, they’ll be less likely to cancel a script they’re running and then sit there blocking half of the rest of the world. Not that I’ve been there, or anything…

There is a safety net here: XACT_ABORT. I first learned about XACT_ABORT while doing some work with Linked Servers a few years ago. What XACT_ABORT does when set to ON is to force SQL Server to terminate and roll back the entire batch if any error occurs. Here’s the Books On Line page for XACT_ABORT.

In our case, flipping that setting to ON within our test SP will change what happens when a query timeout (or cancel) happens. Add “SET XACT_ABORT ON” at the begging of the above SP and re-create it thusly (either drop and recreate or add the line and change it to ALTER PROCEDURE):

CREATE PROCEDURE TransactionTester2008
AS

SET XACT_ABORT ON
[…]

Run the SP as before, and again, while in the 10-second WAITFOR, cancel the query. Now if checking for an open transaction, there won’t be one—it was rolled back by the engine when the timeout (cancel) occurred, because of XACT_ABORT. No locks are still being held, no other sessions will be blocked by the timed-out session. Smooth sailing 🙂

As Usual, Be Careful

Before hauling off and adding this to a bunch of SPs, beware the Law of Unintended Consequences. Test extensively, because no matter how self-contained a fix may seem, who knows what else may be affected. This is no where this is as true as it is in old, overly-complicated, poorly-understood legacy systems. I know DBAs like to fix problems and prevent problems from happening in the first place, but please make sure that no new issues are introduced while trying to “fix” things.

Oh; don’t forget to clean up after yourself!

DROP TABLE dbo.TransactionTest2008
DROP PROCEDURE TransactionTester2008

 

A Note About cliconfg.exe on x64 Machines

If you don’t know (and that’s a good thing in this instance), cliconfg.exe is one of those older-than-dirt parts of Windows that is still around because it serves a specific purpose. In this case, it is used to configure SQL Server connection aliases.

Aliases are basically an abstraction between a SQL Server Instance’s actual connection string information (DNS name, Instance name, port, etc) and the name used by a client to connect to it. In this case, “client” can be either code/an application or a user. I know aliases are mostly evil, however some people do use them extensively. I don’t know that they’re all bad, but that’s a different post for a different time.

x86 cliconfg.exe vs x64 cliconfg.exe

Sit down at any remotely-modern Windows machine (it’s been there since 2000) and you can bring up a Run box, type “cliconfg”, press ENTER, and the utility will launch. If you do this on a 64-bit build of Windows, the 64-bit version of the utility will launch, as you might expect. This sounds good at first, but it might not be.

The first problem this presents is there is absolutely zero way to tell from within the utility itself which version of it is running. Therefore, if you’ve just moved from a 32 to 64-bit machine, there isn’t anything that might get your attention and suggest that anything unexpected is going on. This has probably happened a long time ago for some, but if you’re a corporate monkey like I am, you might have only recently been able to make the jump to a 64-bit OS.

The ultimate issue that crops up stems from why there are two versions of cliconfg.exe in the first place. In short, Windows keeps separate lists of aliases, one used by 32-bit applications and the other by 64-bit. These lists are maintained by their respective versions of cliconfg.exe. These lists are completely independent, and are only accessible by applications compiled with the same bit width.

In order to launch the 32-bit version of cliconfg, it has to be run from the folder it’s in. The path to this file is %SystemDrive%\Windows\sysWOW64\cliconfg.exe. As mentioned, it looks (and acts) just like the 64-bit version. I’d think that the folder should be named “sysWOW32” instead, but what do I know?

Which one to use?

Setting up aliases needs to be slightly planned out, although the same aliases can just be set up on both “sides”, and call it done. If one chooses not to do that, it’s a matter of thinking about what apps are going to
be utilizing any given alias, and then setting that alias up on the same bit width as the app. For example, to use an alias from within SQL Server Management Studio, the alias needs to be set up on the 32-bit side, as SSMS is still a 32-bit application.

Plan B (or Plan A?)

Here’s a different idea: Don’t use cliconfg.exe!

It’s old. I have a feeling it will go away some day. It isn’t very clear on what you are doing (32 vs 64-bit). All problems. Instead, consider using the SQL Server Configuration Manager. It’s new(er). It’s part of SQL Server, so it will probably be around for at least a bit. It is also very clear about the difference between 32 and 64-bit aliases.

For the current topic of conversation, that last point is the most important one. On a 64-bit machine, the Config Manager will have two Native Client items in the tree view in the left pane, one for 64-bit and the other for 32. Aliases configured in one do not show up in the other, making it perfectly clear what is going on.

I know that aliases are bad; using cliconfg.exe to manage them is arguably worse. However, if the situation exists that they need to be used, care needs to be taken in 64-bit client environments. The best way to deal with this situation is to use Configuration Manager.

Meme Monday: “I got 99 problems but a disk ain’t one”

Due to Internet Spaceships (aka, Spreadsheets in Space), whenever I hear the “I got 99 problems” line, I always think of a station that we had in a solar system whose name started with “9-9.” The alliance that I flew under had a naming standard of the first 3 characters or so of the station name matched the system name to cut down on confusion and help you confirm that you knew where you were. That station was named “99 problems but <…> ain’t one.” I don’t remember what was in the gap anymore; there’s a decent chance that it wasn’t family friendly anyway.

Anyway, I’m a huge nerd.

On topic…

This time around for what is becoming Tom’s monthly writing exercise, the topic is to make a list of things that can go wrong with SQL Server that are not disk issues. This should be fairly easy for me, since we have more I/O than God, fortunately, but we’ll see how much of a trainwreck I can turn this into.

  • Cowboy Sysadmins. I know this is asking for it. I know that there are plenty of good sysadmins out there who are just as pragmatic as good DBAs are. The problem is that they’re not always the only ones working on a project or active on the support rotation. Things can get changed that shouldn’t be when they shouldn’t be. It happens. How do I know this? Well, see, I used to be a Cowboy Sysadmin. I used to be a pretty strong opponent of ITIL and things like Change Management. Then I got a clue and things were better.
  • Cowboy Developers. Self-explanatory.
  • Linked Servers. I know Tom listed this one in his original post, but it really is a disaster waiting to happen. Long story short, I wound up implementing an LS over a WAN link, and it’s a miracle that worked. There are a lot of moving parts involved in LSes, and when a cluster is involved, it’s even worse. PLUS, when name resolution isn’t working through other means, hostnames need to be added to LMHOST (LMHOSTS, not HOSTS (!)). That says to me there’s some ancient piece of code in MSDTC in use and that scares the crap out of me.
  • Crap code/Design. And by “Crap,” I mean, “Legacy.” Why is old code always the worst? Were the people here before really that dumb, or did they just not know any better? I mean… the storage engine has to do the exact same thing with 9 indexes on a table now that it had to do a decade ago… (no, I’m not kidding).
  • Letting me design your DB. I’m better at this than I was two years ago, but you should still have someone who knows what they’re doing look at it before you do anything else with it.
  • Reporting out of your OLTP system. This may be OK…but it may be very, very bad. I’ve seen some doozies, but sometimes it isn’t completely avoidable. Limit it as much as you can. You don’t even need to go full-blown data warehouse or data mart for this, either; log-shipped or a backup restored on another instance may get the scary queries off your back.
  • Flakey alerting/monitoring system. This doesn’t directly affect your company’s DBs, but if a backup job failed last night and you didn’t get an email alert about it… well, things may not be OK, would they?

Tom was hoping to get nine out of us, but I’m tired, this is due tomorrow, and the President is apparently going to drop a bomb on us in five minutes, so this is what you get.

Tom, this whole thing is kind of fun and it really does give me/us easy fodder to write about, so I appreciate it.