T-SQL Tuesday #24: Prox ‘n’ Funx

Procedures and Functions. Well, this could be interesting. Everyone else’s posts, that is.

T-SQL Tuesday #24, brought to us by Brad Schulz

OK, T-SQL Tuesday Twenty-Four: Two-year anniversary edition! Pretty sweet of Adam Machanic (blog | @AdamMachanic) to come up with this whole thing two years ago. A good guy, he is. This month’s topic is Prox ‘n’ Funx, brought to all of us by Brad Schulz (blog). I kind of feel bad here—I don’t really know much about Brad, but he’s an MVP, and flipping around his blog brings up some pretty cool posts. I really like this one, for example.

I actually have what I think is a decent little thing to talk about for this topic. It could have been a stretch topic for T-SQL Tuesday #11 about misconceptions, even if [I hope] not exactly a widespread one. It revolves around transaction control and error handling within stored procedures, which is a crazy topic in and of itself; books could be written about it. Specifically, the conversation I found myself involved in one day was about what happens to any open explicit transactions when a procedure runs into an error.

Let’s Talk About Procs Dying

Once upon a time, someone said to me that if an error occurs within a procedure which contains an explicit transaction, that transaction will remain open, potentially blocking other sessions on the system. In short, that’s not true, and it’s fairly easy to prove. We’ll work through a quick little script to do this, and will include modern TRY…CATCH error handling, because that will come in when we get to the main point (I say “modern” there like it’s a new thing, but coming from a guy who has worked with SQL 2000 more than anything else, that distinction feels necessary). It actually doesn’t matter if there is any error handling during the first part of this exercise, as the results will be the same. This is a pretty contrived setup, but that’s pretty much what I’m going for.

First, create a table & put a couple of rows into it. The two columns we’re most worried about are “FakeNumberColumn” and “ActualNumberColumn” (note their data types), so named because the error which will be triggered in a little bit will be a type conversion error.
-- Create Table
CREATE TABLE dbo.TransactionTest2008
(
      TransactionTestID       INT                     NOT NULL    IDENTITY(1,1),
      RowDescription          VARCHAR(50)             NOT NULL,
      FakeNumberColumn        VARCHAR(10)             NOT NULL,
      ActualNumberColumn      INT                     NULL
)
;

-- Populate it with a couple of rows
INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
      SELECT 'Actually a Number 1', 10
      UNION ALL
      SELECT 'Actually a Number 2', 100
;

Now for the really contrived part: a Stored Procedure that will throw an error if FakeNumberColumn contains something that won’t implicitly convert to a numeric:

CREATE PROCEDURE TransactionTester2008
AS

BEGIN TRY

   BEGIN TRAN
      UPDATE dbo.TransactionTest2008
         SET ActualNumberColumn = FakeNumberColumn * 2

      -- select *
      -- from dbo.TransactionTest2008

      -- Wait for 10 seconds so we have a chance to look at DBCC OPENTRAN (unless of course it dies first)
      WAITFOR DELAY '00:00:10'

   COMMIT TRAN
END TRY

BEGIN CATCH
   -- Some kind of error has occurred
   PRINT 'Welcome to Catchville'

   IF @@TRANCOUNT > 0
      ROLLBACK

   -- Raise an error with the details of the exception
   DECLARE @ErrMsg NVARCHAR(4000), @ErrSeverity INT
   SELECT @ErrMsg = ERROR_MESSAGE(),
          @ErrSeverity = ERROR_SEVERITY()

   RAISERROR(@ErrMsg, @ErrSeverity, 1)
END CATCH


(The commented-out SELECT statement can be un-commented if you like, to see the state of the table at that point of execution.)

As it is now, the proc will run successfully. The 10 second WAITFOR in it gives you time to run DBCC OPENTRAN in another query window to see the proc’s open transaction.

EXEC TransactionTester2008
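While the proc sits in that WAITFOR, this is all it takes in a second query window (against the same database) to see the open transaction:

DBCC OPENTRAN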

Open Transaction

Now we'll make things somewhat interesting. Insert another row into our table to put a letter into FakeNumberColumn, then run the proc again.

INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
      SELECT 'Not really a Number', 'F'

EXEC TransactionTester2008

Things won’t go so well this time…

Bombed Proc Run

We get the PRINT message about being in Catchville, so we know that our exception was caught and execution finished in the CATCH block. At this point, go run DBCC OPENTRAN again, and you will see that there isn't a transaction open. This is the expected behavior: no transactions are left open, and the in-process work is rolled back.

I should also note that a less-severe error, such as a constraint violation on an INSERT, will only cause an error in that particular statement. The Engine will skip over that statement and continue processing the rest of the batch normally. That behavior has led to some near-brown-pants moments while running a huge pile of INSERTs to load some provided business data, but that's what explicit transactions are for!
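Here's a minimal, contrived sketch of that statement-level behavior (the table and constraint below are made up purely for illustration): the first INSERT fails with a constraint violation, but the second one in the same batch still goes in.

CREATE TABLE dbo.ConstraintTest2008
(
      ConstraintTestID        INT                     NOT NULL    IDENTITY(1,1),
      SomeNumber              INT                     NOT NULL    CONSTRAINT CK_SomeNumber CHECK (SomeNumber > 0)
)
;

INSERT INTO dbo.ConstraintTest2008 (SomeNumber) SELECT -1  -- fails: CHECK constraint violation
INSERT INTO dbo.ConstraintTest2008 (SomeNumber) SELECT 1   -- still runs

SELECT * FROM dbo.ConstraintTest2008  -- one row made it in

DROP TABLE dbo.ConstraintTest2008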

Now, About Timeouts…

OK, that section wound up pretty long. Here’s where I’m actually getting to what I want to talk about…

We've established that errors in Stored Procedures will not lead to transactions being left open under normal circumstances. There is, however, a situation where things don't go so well: when a client/application connection times out for one reason or another. When this happens, the client closes its end of the connection, and once the statement SQL Server is currently running completes, nothing else happens: no COMMIT, no ROLLBACK. This can leave transactions open, which, of course, is bad, bad, bad.

Starting with where we left off above, we can simulate an application timeout by cancelling the running SP in Management Studio.

First, delete the error-producing row from dbo.TransactionTest2008:

DELETE FROM dbo.TransactionTest2008
WHERE RowDescription = 'Not really a Number'

Execute TransactionTester2008 again, and this time, while in the 10-second WAITFOR, cancel the query in Management Studio. Even with the TRY…CATCH block in place, the explicit transaction is left open (check with DBCC OPENTRAN). What this means is that whatever application (or DBA!) is running the statement(s) is responsible for cleaning up an open transaction if a session times out or is cancelled. In my experience, if you're in the habit of wrapping everything you do in explicit BEGIN/COMMIT/ROLLBACK TRAN, you'll be less likely to cancel a script you're running and then sit there blocking half of the rest of the world. Not that I've been there, or anything…
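If you try this, the cleanup is simple, but somebody (or something) has to actually do it. From another session, DBCC OPENTRAN will show the orphaned transaction; from the window that was cancelled, something along these lines closes it out:

-- In the window where the query was cancelled
SELECT @@TRANCOUNT  -- still 1

IF @@TRANCOUNT > 0
   ROLLBACK TRAN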

There is a safety net here: XACT_ABORT. I first learned about XACT_ABORT while doing some work with Linked Servers a few years ago. What SET XACT_ABORT ON does is force SQL Server to terminate and roll back the entire batch if a run-time error occurs. Here's the Books Online page for XACT_ABORT.
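To see that batch-level behavior on its own, outside of the timeout scenario, here's another contrived sketch along the lines of the constraint-violation example from earlier (table and constraint names made up): with XACT_ABORT ON, an error that would normally kill only one statement terminates the whole batch and rolls back the open transaction, so the second INSERT never runs and nothing is left open.

SET XACT_ABORT ON

CREATE TABLE dbo.XactAbortTest2008
(
      SomeNumber              INT                     NOT NULL    CONSTRAINT CK_XactAbortTest CHECK (SomeNumber > 0)
)
GO

BEGIN TRAN
   INSERT INTO dbo.XactAbortTest2008 (SomeNumber) SELECT -1  -- error: the batch terminates here
   INSERT INTO dbo.XactAbortTest2008 (SomeNumber) SELECT 1   -- never runs
COMMIT TRAN
GO

SELECT @@TRANCOUNT                   -- 0: the transaction was rolled back for us
SELECT * FROM dbo.XactAbortTest2008  -- no rows

DROP TABLE dbo.XactAbortTest2008
SET XACT_ABORT OFF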

In our case, flipping that setting to ON within our test SP will change what happens when a query timeout (or cancel) happens. Add "SET XACT_ABORT ON" at the beginning of the above SP and re-create it (either drop and recreate it, or add the line and change CREATE to ALTER PROCEDURE):

CREATE PROCEDURE TransactionTester2008
AS

SET XACT_ABORT ON
[…]

Run the SP as before, and again, while in the 10-second WAITFOR, cancel the query. Check for an open transaction now, and there won't be one—it was rolled back by the engine when the timeout (cancel) occurred, because of XACT_ABORT. No locks are still held, and no other sessions will be blocked by the timed-out session. Smooth sailing 🙂

As Usual, Be Careful

Before hauling off and adding this to a bunch of SPs, beware the Law of Unintended Consequences. Test extensively, because no matter how self-contained a fix may seem, who knows what else may be affected. Nowhere is this as true as in old, overly-complicated, poorly-understood legacy systems. I know DBAs like to fix problems and prevent problems from happening in the first place, but please make sure that no new issues are introduced while trying to "fix" things.

Oh; don’t forget to clean up after yourself!

DROP TABLE dbo.TransactionTest2008
DROP PROCEDURE TransactionTester2008

 

A Note About cliconfg.exe on x64 Machines

If you don’t know (and that’s a good thing in this instance), cliconfg.exe is one of those older-than-dirt parts of Windows that is still around because it serves a specific purpose. In this case, it is used to configure SQL Server connection aliases.

Aliases are basically an abstraction between a SQL Server Instance's actual connection string information (DNS name, Instance name, port, etc.) and the name used by a client to connect to it. In this case, "client" can be either code/an application or a user. I know aliases are mostly considered evil; however, some people do use them extensively. I don't know that they're all bad, but that's a different post for a different time.

x86 cliconfg.exe vs x64 cliconfg.exe

Sit down at any remotely-modern Windows machine (it’s been there since 2000) and you can bring up a Run box, type “cliconfg”, press ENTER, and the utility will launch. If you do this on a 64-bit build of Windows, the 64-bit version of the utility will launch, as you might expect. This sounds good at first, but it might not be.

The first problem this presents is that there is absolutely no way to tell, from within the utility itself, which version of it is running. Therefore, if you've just moved from a 32-bit to a 64-bit machine, there isn't anything that might get your attention and suggest that something unexpected is going on. For some, this probably happened a long time ago, but if you're a corporate monkey like I am, you might have only recently been able to make the jump to a 64-bit OS.

The ultimate issue that crops up stems from why there are two versions of cliconfg.exe in the first place. In short, Windows keeps separate lists of aliases, one used by 32-bit applications and the other by 64-bit. These lists are maintained by their respective versions of cliconfg.exe. These lists are completely independent, and are only accessible by applications compiled with the same bit width.

In order to launch the 32-bit version of cliconfg.exe, it has to be run explicitly from the folder it lives in. The full path is %windir%\SysWOW64\cliconfg.exe. As mentioned, it looks (and acts) just like the 64-bit version. I'd think that the folder should be named "sysWOW32" instead, but what do I know?

Which one to use?

Setting up aliases takes a little planning, although you could just set up the same aliases on both "sides" and call it done. If you choose not to do that, it's a matter of thinking about which apps are going to be using any given alias, and then setting that alias up on the same bit width as the app. For example, to use an alias from within SQL Server Management Studio, the alias needs to be set up on the 32-bit side, as SSMS is still a 32-bit application.
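Regardless of which side an alias lives on, a quick sanity check is to connect through the alias (in SSMS, sqlcmd, whatever) and ask the instance who it actually is. If the alias points somewhere unexpected, this makes it obvious right away:

SELECT @@SERVERNAME                    AS ServerName,
       SERVERPROPERTY('MachineName')   AS MachineName,
       SERVERPROPERTY('InstanceName')  AS InstanceName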

Plan B (or Plan A?)

Here’s a different idea: Don’t use cliconfg.exe!

It’s old. I have a feeling it will go away some day. It isn’t very clear on what you are doing (32 vs 64-bit). All problems. Instead, consider using the SQL Server Configuration Manager. It’s new(er). It’s part of SQL Server, so it will probably be around for at least a bit. It is also very clear about the difference between 32 and 64-bit aliases.

For the current topic of conversation, that last point is the most important one. On a 64-bit machine, the Config Manager will have two Native Client items in the tree view in the left pane, one for 64-bit and the other for 32. Aliases configured in one do not show up in the other, making it perfectly clear what is going on.

I know that aliases are bad; using cliconfg.exe to manage them is arguably worse. However, if the situation exists that they need to be used, care needs to be taken in 64-bit client environments. The best way to deal with this situation is to use Configuration Manager.

Meme Monday: “I got 99 problems but a disk ain’t one”

Due to Internet Spaceships (aka, Spreadsheets in Space), whenever I hear the “I got 99 problems” line, I always think of a station that we had in a solar system whose name started with “9-9.” The alliance that I flew under had a naming standard of the first 3 characters or so of the station name matched the system name to cut down on confusion and help you confirm that you knew where you were. That station was named “99 problems but <…> ain’t one.” I don’t remember what was in the gap anymore; there’s a decent chance that it wasn’t family friendly anyway.

Anyway, I’m a huge nerd.

On topic…

This time around for what is becoming Tom’s monthly writing exercise, the topic is to make a list of things that can go wrong with SQL Server that are not disk issues. This should be fairly easy for me, since we have more I/O than God, fortunately, but we’ll see how much of a trainwreck I can turn this into.

  • Cowboy Sysadmins. I know this is asking for it. I know that there are plenty of good sysadmins out there who are just as pragmatic as good DBAs are. The problem is that they’re not always the only ones working on a project or active on the support rotation. Things can get changed that shouldn’t be when they shouldn’t be. It happens. How do I know this? Well, see, I used to be a Cowboy Sysadmin. I used to be a pretty strong opponent of ITIL and things like Change Management. Then I got a clue and things were better.
  • Cowboy Developers. Self-explanatory.
  • Linked Servers. I know Tom listed this one in his original post, but it really is a disaster waiting to happen. Long story short, I wound up implementing an LS over a WAN link, and it's a miracle that it worked. There are a lot of moving parts involved in LSes, and when a cluster is involved, it's even worse. PLUS, when name resolution isn't working through other means, hostnames need to be added to LMHOSTS (LMHOSTS, not HOSTS (!)). That says to me there's some ancient piece of code in MSDTC in use, and that scares the crap out of me.
  • Crap code/Design. And by “Crap,” I mean, “Legacy.” Why is old code always the worst? Were the people here before really that dumb, or did they just not know any better? I mean… the storage engine has to do the exact same thing with 9 indexes on a table now that it had to do a decade ago… (no, I’m not kidding).
  • Letting me design your DB. I’m better at this than I was two years ago, but you should still have someone who knows what they’re doing look at it before you do anything else with it.
  • Reporting out of your OLTP system. This may be OK…but it may be very, very bad. I’ve seen some doozies, but sometimes it isn’t completely avoidable. Limit it as much as you can. You don’t even need to go full-blown data warehouse or data mart for this, either; log-shipped or a backup restored on another instance may get the scary queries off your back.
  • Flakey alerting/monitoring system. This doesn't directly affect your company's DBs, but if a backup job failed last night and you didn't get an email alert about it… well, things wouldn't be OK, would they?

Tom was hoping to get nine out of us, but I’m tired, this is due tomorrow, and the President is apparently going to drop a bomb on us in five minutes, so this is what you get.

Tom, this whole thing is kind of fun and it really does give me/us easy fodder to write about, so I appreciate it.

PowerShell While Not an Admin: Well, That Wasn’t Any Fun!

Long story short (this backstory is the only thing that’s short about this post), I’m ridiculously behind in my RSS reader, but I’m working through it, as I refuse to Mark as Read en masse. One of the articles that I decided to carefully read through and follow along with is Thomas LaRock’s (blog | @SQLRockstar) SQL University post from Jan 19 (SQL University info).

Since one of my “goals” for this year is to learn PowerShell (if I stay a DBA), I followed the steps to get a feel for things (I’ve used PoSh before, but pretty much only in a copy-and-paste sense), and although the first couple commands worked as advertised, they were accompanied by some errors:

PowerShell Warnings

At first, it seemed a little odd to be getting Access Denied errors while all of the commands I was running were working. The "SQL Server Service" part and the mention of WMI made it sound like the warnings were about a Windows-level function. The commands I was running only dealt with SQL Server itself, which could explain why they were working.

Alright, let's back up and go over the environment a little bit.

Principle of Least Privilege

The SQL Server I was connecting to is on a different machine on the network, not a local instance. Although my Windows account has SA rights on SQL, it doesn't have any rights on the box itself (i.e., not a local admin). Even though this is just equipment in the house, I have as much as I can set up in accordance with the Principle of Least Privilege. I think this is both a good idea and a way to get a more realistic environment for testing and experimentation.

(This whole exercise is actually a good example of why I have things set up this way: If I had gone to do anything with PoSh on, for example, our Accounting server at work, I would have run into this same situation, as the DBAs don’t have Local Admin on those servers. Instead of being surprised by this and having to run down what is going on and how to fix it, I already found it, figured out the problem & now would know what needs to be requested at the office.)

Since my account doesn’t have Admin access to the Server itself, it makes sense that I didn’t have rights for the system’s WMI service, especially remotely. Unfortunately, at the time, I hadn’t thought through the different parts of the system that PoSh was trying to touch, so I generally went in search based on the warning message.

What do WMI & DCOM have to do with Powershell?

WMI (Windows Management Instrumentation) is, put pretty simply, a way to get to administrative-type information about a machine. There are lots of ways to interact with it, and at the end of the day, it can be summed up as a way to programmatically interact with a Windows system. An example of something it can do is, say, return the status of a Windows Service…

DCOM (I had to look this chunk of stuff up) is a communication protocol that Microsoft technologies use to talk between components on different machines. That's a pretty stripped-down definition, too, but that's the general idea.

WMI uses DCOM to handle the calls it makes to a remote machine. When it's started from SSMS, PowerShell seems interested in checking the status of the SQL Server service (I tried to find documentation of this, but couldn't find anything), so it makes a remote WMI call to do that. Out of the box, Windows Server (2008, at least) doesn't grant that right to non-Administrators, which was causing the warning message I was getting.

Granting Remote WMI/DCOM Access

Interestingly, while searching for a resolution to this, I didn’t find anything that related to Powershell at all, let alone SQL Server specifically. For some reason I got distracted by this and wound up flopping around on a bunch of unhelpful sites and coming around the long way to a solution. However, I think this did help me notice something that I didn’t find mentioned anywhere. I’ll get to that in a second.

Here's a rundown of what security settings I changed to get this straightened out. I'm not sure how manageable this is, but I'm going to work on that next. I'm fairly certain that everything I do here can be set via GPO, so it should be possible to distribute these changes to multiple servers without completely wanting to stab yourself in the face (and talking your Sysadmin into doing it).

One of the first pages I found with useful information was this page from i-programmer.info. Reading that and this MSDN article at basically the same time got things rolling in the right direction.

Step 1: Adjust DCOM security settings

Fire up Component Services; easiest way to do this is Start | Run (Flag-R), and type in/run DCOMCNFG. Navigate to this path: Component Services\Computers\My Computer\DCOM Config\. Find Windows Management and Instrumentation, right-click on it, and choose Properties. On the Security tab, set the first two, “Launch and Activation Permissions” and “Access Permissions” to “Customize.” Click the Edit button in both of these sections and add the AD group that contains the users you want to grant access to (always grant to groups & not individual users). I checked all options in both of these places, as I want to allow remote access to DCOM.

Step 2: Add users to the DCOM Users group

The MSDN article mentioned a couple paragraphs up references granting users rights for various remote DCOM functions. While looking to set this up, I noticed something that no other article I read had mentioned (this is that bit I mentioned earlier), but I think is an important thing to cover, as it is a better way to deal with some of the security settings needed.

To see this, go back to Component Services, scroll back up to “My Computer” at the top and go to Properties again. On the “COM Security” tab, you will find two different sets of permissions that can be controlled. Click on one of the “Edit Limits” buttons (it doesn’t matter which one). The dialog will look something like this:

DCOM Users Group?

Distributed COM Users? ORLY?

A built-in security group for DCOM users? What is this? Well, turns out, it’s exactly what it sounds like it is—it’s a built-in group that already is configured to grant its members the rights needed for remote DCOM access.

That group makes this part of the setup really easy: Simply add the group containing, say, the DBAs, to this built-in group, and they magically have the needed remote DCOM rights. There’s no need to mess with the permissions on the COM Security tab talked about in this section.

Once I got to this point, I found that when I ran PS commands, I was getting a different message:

Different Error after DCOM Security set

The same root problem exists ("Could not obtain SQL Server Service information"), but the Access Denied part is gone, now replaced by an invalid WMI namespace message. That was much easier to troubleshoot, as it was only a WMI problem at this point. My assumption was that it was security-related, and that was the case, but since I hadn't dealt with the details of WMI security before, there's a little quirk that wound up making me burn a lot more time on this than I needed to.

Step 3: Adjust WMI security

The WMI Security settings are easier to get to than the DCOM stuff (at least it seems that way to me). In Server Manager, under Configuration, right below Services, is “WMI Control.” Right-click | Properties, Security Tab, and there ya go.

Now, here's where some decisions need to be made. Basically, there are two options:

  • Grant access to everything.
  • Only grant access to the SQL Server stuff.

Of course, the easy way is to grant everything, but if these settings are being changed in the first place because a DBA doesn't have Local Admin on the server, then that probably won't line up with the security policies in use.

Instead, the best thing is probably to set security on the SQL Server-specific namespace. The path to this namespace is Root\Microsoft\SqlServer. Click on that folder, then hit the Security button. A standard Windows security settings dialog will come up. Add the desired user/group (again, probably the DBA group) here and grant Execute Methods and Remote Enable on top of the defaults, as that should be the minimum needed. Don't hit OK yet, however, as this is where the aforementioned quirk comes in.

By default, when you add a user & set rights in this WMI security dialog, it adds only for the current namespace—no subnamespaces are included. I got horribly burned by this because I’m used to NTFS permissions defaulting to “this folder & subfolders” when you add new ACLs.

WMI Security Advanced

Advanced WMI Security dialog

To fix this, click Advanced, then select the group that was just added, and click the Edit button. That will open the dialog shown above, at which point the listbox can be changed to “This namespace and subnamespaces.”

With all of that set, SQL Server PowerShell commands should run successfully and not report any warnings.

Firewall rules & other notes

I only addressed security-related settings here. In order for all of this to work, some firewall rules may also need to be adjusted to allow DCOM traffic, in addition to whatever is needed for PoSh access.

Putting my sysadmin hat on for a little bit, if a DBA came to me asking to change these settings, my response could be, “well that’s fine and all, but you really want me to make these changes on all dozen of our DB servers??” This is where Group Policies come in. I haven’t worked through the GPO settings needed to deploy these settings, but as I mentioned before, I’m pretty sure all of this can be set through them. I will work through that process soon & likewise document it as I go. Stay tuned for that post.

A slightly funny bit about this entire thing: due to the amount of time it took me to get this post together, I haven't actually done anything related to PowerShell! Whoops.

Good luck with Powershell everyone… I’ll get caught up eventually.

File System Rights Needed to Attach a Downgrade DB

This falls into the category of things I probably should have already known, but for some number of reasons (not all my fault!), I didn’t.

Last Thursday night was an honorary Friday night at The Hideaway, since we took last Friday off (I got screwed at work on a day off for New Year's this year), so I was trying to catch up on some blog posts in my reader. One of them included some code I wanted to run, but I found out that I didn't have AdventureWorks attached to SQL after the last rebuild of the main server at home. The first step was going to be fixing that, so I moved the files from their old location to my new standard folder and set off to Attach…

Before this story gets too long…

The DB wouldn’t attach to the SQL 2008 instance. I was using my old SQL 2005 version of AdventureWorks, because it was already on the machine and at least for now, I want to keep it around. It was complaining about the files or the DB being set to Read-Only, and it can’t do an upgrade on a Read-Only DB. I needed to fix the DB or the file system permissions.
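For reference, the T-SQL version of the attach I was attempting looks roughly like this (the file paths here are made up; point them at wherever the MDF/LDF actually live):

CREATE DATABASE AdventureWorks
   ON (FILENAME = 'D:\SQLData\AdventureWorks_Data.mdf'),
      (FILENAME = 'D:\SQLData\AdventureWorks_Log.ldf')
   FOR ATTACH
;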

The DB wasn’t read-only when I killed the server before the rebuild, so that wasn’t it. Had to be NTFS permissions.

Checked those, and the group the SQL Server service account is in had Modify permissions on the folder where the files were. Checked this a few times to make sure I wasn't crazy. I tried to search out some help on this, but I was having a hard time getting the search terms right to find anything useful. I finally wound up breaking out Process Monitor to help figure out what was going on.

Turns out, SQL Server was using (impersonating?) the Windows account I had connected to the instance with when it was reading the files. That account doesn’t have write access to the folder(s) in question, so any writes it was trying to do were failing. That’ll do it.

I don’t understand why it does this. Nothing is mentioned in BOL about this, and some real quick Googling didn’t bring up anything, either. Since I pretty much work only with SQL 2000 and SQL auth all day right now, I don’t know if this is new in 2008. At my last job where I had 2005, whenever I would do an operation like this, my Windows account would have been Local Admin on the server, so I wouldn’t have run into it.

I did some quick testing, and although SQL Server of course still impersonates the Windows Auth user when you attach a DB that doesn't need upgrading, as long as the account you're logged on with has read rights, you'll still be in business.

Workaround(s)

Obviously giving write rights to the Windows account you’re logging on to SQL with will fix this, but that doesn’t strike me as a good idea if you don’t have those rights in the first place. I mean, you don’t have those rights for a reason, right (principle of least privilege, etc)? But, if it’s a Windows Auth-only situation, that’s the only way to do it with the account in question.

Another way around it is to use SQL Auth. This is what I wound up doing, mainly because I wanted to test whether that works. As mentioned, if the Instance isn't in Mixed Mode Auth, this isn't an option. Also, if for whatever reason creating a new account for this purpose isn't allowed, then this option doesn't help you, either.
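A rough sketch of that workaround, with a made-up login name and password (the login needs enough rights to attach a database, so dbcreator at a minimum): create the login as an existing sysadmin, then connect to the instance as that login and run the attach from there. Since there's no Windows identity to impersonate on a SQL Auth connection, the file access happens under the SQL Server service account, which does have rights to the folder.

-- Run as an existing sysadmin
CREATE LOGIN AttachTemp WITH PASSWORD = N'SomeStrongPassword!1'
EXEC sp_addsrvrolemember 'AttachTemp', 'dbcreator'

-- Then connect as AttachTemp and run the CREATE DATABASE ... FOR ATTACH shown above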

I’m sure this is old news to most everyone else, but it caught me by surprise. Lesson learned!