T-SQL Tuesday #24: Prox ‘n’ Funx

Procedures and Functions. Well, this could be interesting. Everyone else’s posts, that is.

T-SQL Tuesday

#24, brought to us by Brad Schulz

OK, T-SQL Tuesday Twenty-Four: Two-year anniversary edition! Pretty sweet of Adam Machanic (blog | @AdamMachanic) to come up with this whole thing two years ago. A good guy, he is. This month’s topic is Prox ‘n’ Funx, brought to all of us by Brad Schulz (blog). I kind of feel bad here—I don’t really know much about Brad, but he’s an MVP, and flipping around his blog brings up some pretty cool posts. I really like this one, for example.

I actually have what I think is a decent little thing to talk about for this topic. It could have been a stretch topic for T-SQL Tuesday #11 about misconceptions, even if [I hope] not exactly a widespread one. It revolves around transaction control and error handling within stored procedures, which can be a crazy subject in and of itself; books could be written on it. Specifically, the conversation I found myself involved in one day was about what happens to any open explicit transactions when a procedure runs into an error.

Let’s Talk About Procs Dying

Once upon a time, someone said to me that if an error occurs within a procedure which contains an explicit transaction, that transaction will remain open, potentially blocking other sessions on the system. In short, that’s not true, and it’s fairly easy to prove. We’ll work through a quick little script to do this, and will include modern TRY…CATCH error handling, because that will come in when we get to the main point (I say “modern” there like it’s a new thing, but coming from a guy who has worked with SQL 2000 more than anything else, that distinction feels necessary). It actually doesn’t matter if there is any error handling during the first part of this exercise, as the results will be the same. This is a pretty contrived setup, but that’s pretty much what I’m going for.

First, create a table & put a couple of rows into it. The two columns we’re most worried about are “FakeNumberColumn” and “ActualNumberColumn” (note their data types), so named because the error which will be triggered in a little bit will be a type conversion error.
-- Create Table
CREATE TABLE dbo.TransactionTest2008
(     TransactionTestID       INT                     NOT NULL    IDENTITY(1,1),
      RowDescription          VARCHAR(50)             NOT NULL,
      FakeNumberColumn        VARCHAR(10)             NOT NULL,
      ActualNumberColumn      INT                     NULL
)
;

-- Populate it with a couple of rows
INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
      SELECT 'Actually a Number 1', 10
      UNION ALL
      SELECT 'Actually a Number 2', 100
;

Now for the really contrived part: a Stored Procedure that will throw an error if FakeNumberColumn contains something that won’t implicitly convert to a numeric:

CREATE PROCEDURE TransactionTester2008
AS

BEGIN TRY
   BEGIN TRAN

   UPDATE dbo.TransactionTest2008
   SET ActualNumberColumn = FakeNumberColumn * 2

   -- select *
   -- from dbo.TransactionTest2008

   -- Wait for 10 seconds so we have a chance to look at DBCC OPENTRAN (unless of course it dies first)
   WAITFOR DELAY '00:00:10'

   COMMIT TRAN
END TRY

BEGIN CATCH
   -- Some kind of error has occurred
   PRINT 'Welcome to Catchville'

   IF @@TRANCOUNT > 0
      ROLLBACK

   -- Raise an error with the details of the exception
   DECLARE @ErrMsg NVARCHAR(4000), @ErrSeverity INT
   SELECT @ErrMsg = ERROR_MESSAGE(),
          @ErrSeverity = ERROR_SEVERITY()

   RAISERROR(@ErrMsg, @ErrSeverity, 1)
END CATCH


(The commented-out SELECT statement can be un-commented if you like, to see the state of the table at that point of execution.)

As it is now, the proc will run successfully. The 10-second WAITFOR gives you time to run DBCC OPENTRAN in another query window and see the proc’s open transaction.

EXEC TransactionTester2008

Open Transaction

Now we’ll make things somewhat interesting. Insert another row into our table to put a letter into FakeNumberColumn, then run the proc again.

INSERT INTO dbo.TransactionTest2008 (RowDescription, FakeNumberColumn)
      SELECT 'Not really a Number', 'F'

EXEC TransactionTester2008

Things won’t go so well this time…

Bombed Proc Run

We get the PRINT message about being in Catchville, so we know that our exception was caught and execution finished in the CATCH block. At this point, go run DBCC OPENTRAN again, and you will see that there isn’t a transaction open. This would be the expected behavior. No transactions are left open; the in-process activities are rolled back.

I should also note that a less-severe error, such as a constraint violation on an INSERT, will only cause that particular statement to fail; the Engine will skip over it and continue processing the rest of the batch normally. That behavior has led to some near-brown-pants moments while running a huge pile of INSERTs of provided business data, but that’s what explicit transactions are for!
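
If you want to see that statement-level behavior for yourself, here’s a minimal sketch. The dbo.ConstraintDemo table is an invented stand-in, not part of the setup above: the duplicate-key INSERT in the middle fails, but the statements around it still run and the transaction commits.

-- Invented demo table; not part of this post's setup
CREATE TABLE dbo.ConstraintDemo
(     ConstraintDemoID        INT                     NOT NULL    PRIMARY KEY,
      SomeValue               VARCHAR(10)             NOT NULL
)
;

BEGIN TRAN
   INSERT INTO dbo.ConstraintDemo VALUES (1, 'First')
   INSERT INTO dbo.ConstraintDemo VALUES (1, 'Dupe')    -- PK violation: this statement fails...
   INSERT INTO dbo.ConstraintDemo VALUES (2, 'Second')  -- ...but this one still runs
COMMIT TRAN

SELECT * FROM dbo.ConstraintDemo   -- rows 1 and 2 both made it in

DROP TABLE dbo.ConstraintDemo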

Now, About Timeouts…

OK, that section wound up pretty long. Here’s where I’m actually getting to what I want to talk about…

We’ve established that errors in Stored Procedures will not lead to transactions being left open under normal circumstances. There is a situation where things don’t go so well: when a client/application connection times out for one reason or another. When this happens, the client closes its end of the connection, and once the statement SQL Server is currently running completes, nothing else happens on the server’s side. This can leave transactions open, which, of course, are bad, bad, bad.

Starting with where we left off above, we can simulate an application timeout by cancelling the running SP in Management Studio.

First, delete the error-producing row from dbo.TransactionTest2008:

DELETE FROM dbo.TransactionTest2008
WHERE RowDescription = 'Not really a Number'

Execute TransactionTester2008 again, and this time, while in the 10-second WAITFOR, cancel the query in Management Studio. Even with the TRY…CATCH block in place, the explicit transaction is left open (check with DBCC OPENTRAN). What this means is that whatever application (or DBA!) is running the statement(s) is responsible for closing an open transaction if a session times out or is cancelled. In my experience, if you’re in the habit of wrapping everything you do in explicit BEGIN/COMMIT/ROLLBACK TRAN, you’re less likely to cancel a script you’re running and then sit there blocking half of the rest of the world. Not that I’ve been there, or anything…
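
If you ever need to hunt down a session that has been left holding a transaction open, a quick check like the one below does the trick. This is just a sketch using DBCC OPENTRAN plus the transaction DMVs (SQL 2005 and up); it isn’t part of the demo itself.

-- Oldest active transaction in the current database
DBCC OPENTRAN

-- Sessions that currently have a transaction open (SQL 2005+)
SELECT s.session_id,
       s.login_name,
       s.host_name,
       s.last_request_end_time,
       t.transaction_id
FROM sys.dm_tran_session_transactions AS t
JOIN sys.dm_exec_sessions AS s
   ON s.session_id = t.session_id

-- Back in the offending session itself, clean up with:
-- IF @@TRANCOUNT > 0 ROLLBACK TRAN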

There is a safety net here: XACT_ABORT. I first learned about XACT_ABORT while doing some work with Linked Servers a few years ago. When set to ON, XACT_ABORT forces SQL Server to terminate the batch and roll back the entire transaction if a run-time error occurs. Here’s the Books Online page for XACT_ABORT.
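
To illustrate the difference the setting makes, here’s the same invented dbo.ConstraintDemo sketch from earlier, this time with XACT_ABORT turned on: the duplicate-key error now kills the whole batch and rolls the transaction back, so nothing gets committed.

-- Invented demo table again; not part of this post's setup
CREATE TABLE dbo.ConstraintDemo
(     ConstraintDemoID        INT                     NOT NULL    PRIMARY KEY,
      SomeValue               VARCHAR(10)             NOT NULL
)
;

SET XACT_ABORT ON

BEGIN TRAN
   INSERT INTO dbo.ConstraintDemo VALUES (1, 'First')
   INSERT INTO dbo.ConstraintDemo VALUES (1, 'Dupe')    -- error: the batch stops here...
   INSERT INTO dbo.ConstraintDemo VALUES (2, 'Second')  -- ...so this never runs
COMMIT TRAN
GO

SELECT @@TRANCOUNT AS OpenTranCount   -- 0: the transaction was rolled back for us
SELECT * FROM dbo.ConstraintDemo      -- no rows at all this time

SET XACT_ABORT OFF
DROP TABLE dbo.ConstraintDemo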

In our case, flipping that setting to ON within our test SP will change what happens when a query timeout (or cancel) occurs. Add “SET XACT_ABORT ON” at the beginning of the above SP and re-create it (either drop and recreate it, or add the line and change CREATE to ALTER PROCEDURE):

CREATE PROCEDURE TransactionTester2008
AS

SET XACT_ABORT ON
[…]

Run the SP as before and, again, while in the 10-second WAITFOR, cancel the query. Now when you check for an open transaction, there won’t be one—it was rolled back by the Engine when the timeout (cancel) occurred, because of XACT_ABORT. No locks are still held, and no other sessions will be blocked by the timed-out session. Smooth sailing 🙂

As Usual, Be Careful

Before hauling off and adding this to a bunch of SPs, beware the Law of Unintended Consequences. Test extensively, because no matter how self-contained a fix may seem, who knows what else may be affected. Nowhere is this as true as it is in old, overly-complicated, poorly-understood legacy systems. I know DBAs like to fix problems and prevent problems from happening in the first place, but please make sure that no new issues are introduced while trying to “fix” things.

Oh; don’t forget to clean up after yourself!

DROP TABLE dbo.TransactionTest2008
DROP PROCEDURE TransactionTester2008


T-SQL Tuesday #22: Data Presentation

TSQL Tuesday Logo

Robert Pearl hosts No.22

It seems like it hasn’t been that long since last month’s T-SQL Tuesday post; I suppose time flies when you’re having fun and trying to finish up the same ETL project you’ve been working on since March.

This month’s SQL blog party is being hosted by Robert Pearl (blog | @PearlKnows), on the topic of “Data Presentation.” This is a good topic for me at this point, as I’ve all but finished my transition from DBA to BI Monkey (that’s something else I need to write about…). I think Robert is looking for specific examples of ways to present data, but since, as usual, I don’t have anything specific that I can actually publish, I’m left to speak generally about the topic.

Data Presentation: Just as Important as the Data Itself

In a previous life, I was responsible for almost everything data-related for the systems that we ran. As a result, I would get a lot of requests for data. One of my favorite requests would come in the form of, “can you give me some numbers for <X system>?” I would try to keep my response at least marginally non-snarky, but it would generally include two questions:

  1. What exact “numbers” do you want? (this is especially where I would have snark problems)
  2. What do you want the data to look like?

Of course, the first one is an important question—if the requestor cannot articulate what it is they actually want (or even what question they’re trying to answer), little else is going to matter. I’ll not dwell on this particular item too much, but suffice to say, sometimes getting a good answer to this seemingly easy question is anything but. I’ve basically come to the conclusion that this is normal.

Once over that hurdle, the conversation can move on to the presentation of whatever data/“numbers” the requestor wants. There are almost as many options for presenting data as there are ways to write the T-SQL to retrieve it. Just like writing the SQL in a way that is performance- and resource-conscious, care should be taken when working on the presentation design. It is imperative for the data to be presented in a way that is understandable and digestible by its intended audience.

Notice I didn’t say “digestible by the party asking for it.” Don’t forget that the request originator may not be the party who is ultimately going to be parsing the provided data. If the audience is not clear in the original request, add a third question to the two that I have listed above: “Who is going to be acting on this data?”

Options for What Happens Next

When the “What do you want it to look like” question is asked, chances are decent that you’ve an idea about what the answer is going to be. If this is a one-off, ad-hoc request, Excel is a popular option. Alternatively, if a robust reporting system is in place, or this request will be a recurring one, developing a report to present the data might be a stronger choice. There are of course other options: the data could be destined for a statistical analysis application, where a CSV file would be more suitable. I would consider this an outlier, though—most of the time, data is prepared for direct human consumption.

Excel is such a popular option that you could almost call it Data’s Universal Distribution Engine (DUDE). Sending data over in Excel is less about the “make it pretty” side of good presentation as it is the “make it useful” side. I’ve found that Excel is a choice a lot of the time because the requestor wants to do more manipulation of the data once they get it. I’ll leave whether or not that is a good thing to the side; the truth is, such activity happens all the time. As a result, when preparing data for an Excel sheet, I like to have an idea of what the user is going to do with it. This sometimes helps to determine what data the user is looking for (if they don’t have a clear idea) but also can help with some formatting or “extras” to include. These “extras” could take the form of running subtotals, percent changes for Year over Year situations, or anything else that is easier to add via SQL instead of someone having to putz around in Excel.
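
For instance, those “extras” are often just a window function or two away. Here’s a sketch against an invented dbo.MonthlySales table (SalesMonth, SalesAmount); the running total and year-over-year change ride along with the data, so nobody has to build them by hand in Excel. Note that LAG and the windowed running total need SQL Server 2012 or later; on older versions, a self-join gets you the same result.

-- Invented table: dbo.MonthlySales (SalesMonth DATE, SalesAmount DECIMAL(12,2))
SELECT
   SalesMonth,
   SalesAmount,
   -- Running subtotal down the sheet
   SUM(SalesAmount) OVER (ORDER BY SalesMonth ROWS UNBOUNDED PRECEDING) AS RunningTotal,
   -- Percent change vs. the same month last year
   (SalesAmount - LAG(SalesAmount, 12) OVER (ORDER BY SalesMonth)) * 100.0
      / NULLIF(LAG(SalesAmount, 12) OVER (ORDER BY SalesMonth), 0)      AS YoYPctChange
FROM dbo.MonthlySales
ORDER BY SalesMonth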

Writing a report to present data has a different set of opportunities than pasting data into Excel. One of the things that I like to see in a solid reporting environment is a set of standards that apply to the reports themselves. Things like common header contents (report name, date/time stamp, name of the data source/DB the data is from, etc), standard text formatting, a common set of descriptors, etc, etc. In addition to making individual reports easier to read and feel more familiar, it can make it easier to compare data between reports the hard way (one on each monitor), if one has to.

It's only worth 1,000 words if the first ones that come to mind are work safe

One thing each of these two tools gives you is the ability to present data in the form of pretty pictures. There’s a time and a place for everything, but the old cliché, “a picture is worth a thousand words” can/does apply. Sometimes it’s just flat-out hard to beat a good trendline. I have a much easier time seeing even the simplest of trends when data’s plotted out in a histogram. Conversely, one of my coworkers can look at a pile of numbers, not even sorted chronologically, and tell you what is going on in about three seconds.

Knowing where to put your effort goes back to knowing who your intended audience is. Likewise, knowing when to say “no” to visualization is a terribly useful skill. Every data element on the chart should be discernable, or else it doesn’t convey the information it is supposed to, and now the visualization is working against itself. The pie chart to the right? Don’t do that.

Summary

That’s about all I’ve got. In short: Presentation is important. Unfortunately, it can also be complicated. It’s important to ask questions early on in the process and to know your audience. Standardize if you can; help out a little with the complicated work if it can be done in SQL. Also, add visual representations without going overboard. I’ve always found turning “data” into “information” for people to be fun; if it can make someone else’s job easier/more fun, too, then all for the better.

T-SQL Tuesday #21: “This Ugly Hack is Only Temporary”

…Unless “nothing is temporary around here” also applies to your shop. Then…well…good luck with that (I feel your pain).

Wednesday is better than Tuesday anyway; everyone knows that

It’s been a while since I’ve participated in a T-SQL Tuesday; haven’t been writing at all, really, as you can see. Summer apparently does that to me. If you’re reading this, then it means I got it shoehorned in.

T-SQL Tuesday is the brainchild of Adam Machanic (blog | @AdamMachanic) that started, apparently, 21 months ago. Its purpose is for bloggers to get together and cover a particular topic every month. This is good for readers and also is an easy way to get a blog topic, if one has a hard time coming up with them (like me). This month’s topic is hosted by Adam, and has been billed as “revealing your crap to the world.” There will be lots of good stuff to read this month, calling it now.

RI Enforced by a Trigger? Check.

I’m going to preface this one by saying that it never actually saw the light of day. Sometime between Design & Development, the project that it was part of got canned, so the Development instance was the only place this was ever deployed to. But, it was still crap, and it was crap that I came up with, at that, so here we go.

There once was a big ugly “Location” table. This table was way wider than it should have been, mainly due to the fact that “normalization” was a term missing from the historical evolution of the system it was a part of. As you can imagine, this table contained a lot of character columns of varying widths (up to and including TEXT). To make matters worse, in lots of cases, the UI allows free text entry, which makes much of the contents suspect, at best.

A project came along that basically would require normalizing out a couple of columns related to a certain location attribute. For the sake of discussion, this could be just about anything—we’ll use “Preferred Shipping Carrier” as an example.

At the beginning of design, this “PreferredShipCarrier” column is full of shady entries. There are some “UPS”es, “FedEx”es, and “None”s, but there’s also a lot of NULLs (which shouldn’t really happen), “N/A”s, and “UPPS”es, all of which are incorrect for one reason or another. Obviously, as part of normalizing this column out, there would need to be some corresponding data cleanup, but that’s beside the point.

Where things go south is how this was to be implemented. Since the column in the table is VARCHAR, a real Foreign Key relationship can’t be set up to the INT Surrogate Key in the new “PreferredShipper” table. This is the first sign of trouble. Second sign is that we can’t change the UI—it has to stay in its free-text entry state. This leaves the database exposed to bad data, as there can’t be RI to enforce valid entries.

Enter a Trigger. Upon an attempted row insert into the Location table, it looks up the value entered for Location.PreferredShipCarrier and throws an error if it doesn’t find a matching row in the new lookup table. I hated it, but it got the job done.
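
A sketch of the idea is below. The table and column names are invented stand-ins for the real system’s, and the error text is made up, but the shape is the same: check every inserted row against the lookup table and roll back if anything doesn’t match.

-- Invented names; the real Location table was far wider than this
CREATE TRIGGER trg_Location_CheckShipCarrier
ON dbo.Location
AFTER INSERT
AS
BEGIN
   SET NOCOUNT ON

   -- Fail the statement if any inserted row names a carrier that
   -- doesn't exist in the lookup table
   IF EXISTS (
         SELECT 1
         FROM inserted AS i
         WHERE i.PreferredShipCarrier IS NOT NULL
           AND NOT EXISTS (SELECT 1
                           FROM dbo.PreferredShipper AS ps
                           WHERE ps.CarrierName = i.PreferredShipCarrier)
      )
   BEGIN
      RAISERROR('Invalid Preferred Shipping Carrier.', 16, 1)
      ROLLBACK TRAN
   END
END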

Auditing with Triggers, too? What is up with you and Triggers?

SQL 2000 was terrible for auditing—it didn’t have all of the new fancy stuff that we have now for this sort of thing. That means, when you’re dealing with 2000, and the business wants an audit trail of who changed what when, your options are/were pretty limited.

At one point, a requirement came along to implement audit trails on some tables. This means I needed to duplicate tables, adding some metadata columns to them (audit timestamp, who dunnit, what the action was, etc), and then put a trigger on each of the source tables. Some of these particular tables were busy, so the triggers & log tables got a lot of action and I hated having to do this to get what was needed.

If the trigger was fired for a DELETE, it would log the row that was deleted. If an UPDATE was happening, it would first check to see if any column contents were actually changing, and if so, log the previous state of the row to the log table. These triggers grow relative to the width of the table (because they contain three explicit lists of the table’s columns), so if the tables being audited are wide, the trigger gets pretty long. Additionally, since triggers in SQL Server are set-based, firing once per statement for all rows being DELETEd or UPDATEd, special care needs to be taken so they operate correctly when more than one row is affected. This can make them a bit ticklish to write.
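
Here’s a sketch of the pattern on a made-up dbo.Widget table. The real tables, audit columns, and naming were different, but the DELETE/UPDATE handling looked roughly like this:

-- Invented source table
CREATE TABLE dbo.Widget
(     WidgetID                INT                     NOT NULL    PRIMARY KEY,
      WidgetName              VARCHAR(50)             NULL,
      WidgetPrice             DECIMAL(10,2)           NULL
)

-- Matching log table, plus the audit metadata columns
CREATE TABLE dbo.Widget_Audit
(     AuditID                 INT IDENTITY(1,1)       NOT NULL,
      AuditAction             CHAR(1)                 NOT NULL,   -- 'U' or 'D'
      AuditDateTime           DATETIME                NOT NULL    DEFAULT GETDATE(),
      AuditUser               SYSNAME                 NOT NULL    DEFAULT SUSER_SNAME(),
      WidgetID                INT                     NOT NULL,
      WidgetName              VARCHAR(50)             NULL,
      WidgetPrice             DECIMAL(10,2)           NULL
)
GO

CREATE TRIGGER trg_Widget_Audit
ON dbo.Widget
AFTER UPDATE, DELETE
AS
BEGIN
   SET NOCOUNT ON

   -- DELETE: log every row that was removed (deleted rows with no matching inserted row)
   INSERT INTO dbo.Widget_Audit (AuditAction, WidgetID, WidgetName, WidgetPrice)
   SELECT 'D', d.WidgetID, d.WidgetName, d.WidgetPrice
   FROM deleted AS d
   WHERE NOT EXISTS (SELECT 1 FROM inserted AS i WHERE i.WidgetID = d.WidgetID)

   -- UPDATE: log the prior state, but only for rows where something actually changed
   -- (ISNULL sentinels are a crude-but-typical way to make the comparison NULL-safe)
   INSERT INTO dbo.Widget_Audit (AuditAction, WidgetID, WidgetName, WidgetPrice)
   SELECT 'U', d.WidgetID, d.WidgetName, d.WidgetPrice
   FROM deleted AS d
   JOIN inserted AS i
      ON i.WidgetID = d.WidgetID
   WHERE ISNULL(i.WidgetName, '')  <> ISNULL(d.WidgetName, '')
      OR ISNULL(i.WidgetPrice, -1) <> ISNULL(d.WidgetPrice, -1)
END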

I know this isn’t necessarily crap, as when you’ve got to audit, you’ve got to audit. I don’t like it, if for no other reason than the extra clutter it puts into the DB and just the general idea of triggers all over the place. All manner of things can happen in triggers, and if you are operating in an unfamiliar DB or troubleshooting a goofy problem, things could be going on behind the curtain that make the problem harder to track down.

In short: Friends don’t let friends mash F5 on CREATE TRIGGER.

T-SQL Tuesday #15: Automation in SQL Server

Automation: every lazy DBA’s best friend; in some situations, a ticket to sanity.

T-SQL Tuesday #15

T-SQL Tuesday #15: Automation is the way to a DBA's heart

This month’s T-SQL Tuesday is brought to you by Pat Wright (blog | @SqlAsylum). The 15th topic for the monthly blog party is, as has been mentioned, Automation in SQL Server. I’m pretty excited to read this month’s posts to see what kind of crazy things everyone does. Most of these posts are probably going to be big on example scripts and code samples, as one would expect for such a topic. This one–for better or worse–won’t.

I have to admit that I haven’t done a lot of from-scratch automation in my day. Lots of things on the list at work right now, but implementation is still pending. As a result, I was afraid that I wouldn’t have anything to talk about this month, but I thought of a goofy direction that I can take this in.

Like many of us who do it for real now, I started my path to DBA-ness (uh… that’s unfortunate) as an Accidental DBA. Since I didn’t know any better at the time, I did a lot (OK, pretty much all) of administrative DBA tasks with the UI. Need to back up a database? Right-click | Tasks | Backup! Want to create an index? Fire up DTA! Use the GUI to pick the columns. Need to do that for more than one DB? Guess you’re going to be there for a while.

This technique obviously gets the job done (for the most part), but there is a lot of room for improvement.

“Automation” doesn’t have to be fancy

If you’re using the GUI for a lot of tasks, there’s an easy & cheap way to “automate” a lot of what you’re doing. Simply: Script stuff out. One doesn’t need to be a master of T-SQL syntax to start doing this, either. With SQL 2005 and above, making the transition from GUI to scripted tasks is pretty easy.

SSMS Backup Dialog's Script menu

The Backup Database dialog's "Script" menu

Just about every dialog box in SSMS has a “Script” button at the top. This control will script out whatever changes have been made in the dialog box. For example, if you bring up the Backup Database dialog, fill out the options & destination file location as desired, and then use the Script button to output that to a new Query window, you will wind up with a complete, functional BACKUP DATABASE command with all of the same settings that were selected in the GUI window. Mash F5 on that puppy and you’ll have your backup, just like you wanted it.
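
The output looks something like the following. This is only an example of the sort of command the Script button typically produces for a full backup; the database name and file path here are made up, and yours will reflect whatever you picked in the dialog.

BACKUP DATABASE [AdventureWorks]
TO DISK = N'C:\Backups\AdventureWorks_Full.bak'
WITH NOFORMAT, NOINIT,
     NAME = N'AdventureWorks-Full Database Backup',
     SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO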

How does this classify as automation?

Spirit of the law, folks, spirit of the law 😉

Alright, I admit I might be stretching it a little bit here. I also know that just because two techniques solve the same set of problems doesn’t mean they can be classified the same way.

That said, consider some of the reasons that you automate big tasks:

  • Ease of consistent repeatability
  • Removal of the human element
  • Speed
  • Autonomy if you’re out of the office and someone is filling in

These same things make running T-SQL scripts instead of using the GUI for tasks a better idea:

  • As long as you don’t change the script before you re-run it, the same thing will happen repeatedly (unless of course the script does something like add a particular column to a table a second time). This is especially important when doing things such as migrating a new table through Dev, Test, Stage, and Production over the lifecycle of a project.
  • Setting options in a GUI window is prone to mis-clicks or flat-out forgetting to change a setting from the default.
  • The script is ready to go—running the action is as fast as opening the script file, checking it to make sure it is the one you’re expecting it to be, and mashing F5. This makes implementing the change a fast process, instead of having to click a bunch of radio buttons/checkboxes/whathaveyou, then verifying all of the settings before hitting OK.
  • If you’re out of the office, but something still needs to be deployed, it’s easy for the fill-in DBA (the boss?) to grab the scripts that have been prepared and run them. This is easier than walking them through a list of checkboxes to tick on a UI screen, or hoping they can reconstruct everything from memory if the correct settings haven’t been written down.

Considering how I grew into a DBA, making this leap from pointing and clicking for just about everything to typing out ALTER TABLE instead took some work. In the end, scripting everything is better in pretty much every conceivable way, even if it is hard at first.

If you’re a little GUI-heavy still and like the idea of automating the work that you do, letting go of the UI and embracing the big, blank T-SQL canvas is Step 1. The effort will be worth it, and you’ll feel like you’ve really automated tasks.

T-SQL Tuesday #14: Resolutions

T-SQL Tuesday 14

T-SQL Tuesday 14, hosted by Jen McCown

Resolutions… For a long time, I thought 1600 x 1200 to be the ideal, but with the proliferation of widescreen—oh…wait…not that kind of Resolution? Sorry. My bad.

It’s that time again: the monthly blog party known as T-SQL Tuesday. This month is #14, hosted by the deservedly newly-minted Microsoft MVP Jen McCown (blog | @MidnightDBA). The topic this time around is “Resolutions,” originally phrased as “what techie resolutions have you been pondering, and why.”

In my 2011 “Goals” post, I talked about one of the things I need to do this year: something about my career direction. I’ll take this opportunity to elaborate a little bit on what that means for me this year. The part about needing to pick one of two directions still applies, but the work that I need to do for each of those is pretty different.

If I stay a DBA…

If I stay a DBA, what I need to work on the most can be summed up fairly simply: Catch up.

Since I work with SQL 2000 almost exclusively these days, there are a lot of current bits of technology that I need to spend time working on and learning:

  • DMVs
  • Policy-Based Management
  • T-SQL advancements (Actually using Try/Catch, TVPs, etc)
  • PowerShell

That’s just a few things that I can pull off the top of my head; I know there are more that I should know and be able to use at this point. Not having the opportunity to use these features day-to-day doesn’t exactly help me learn and keep these skills sharp 😉

If I want to be one of the cool kid DBAs on Twitter, 2011 is a year in which I have to do a lot of work on these topics.

Less BI-Curious and more BI-…uhh…Pro?

The other choice I have is to do awesome Business Intelligence work. I’m still turned on by a lot of these technologies and the power that they can put into the hands of business users at all levels of an organization. A change in direction at my current job to a more BI-focused role has yet to fully materialize, and as for right now I’m still somewhat impatiently watching that carrot out there.

Whether I do it in my current position or choose to take something new, changing direction to full-time BI work comes with its own set of topics that will need lots of attention from me:

  • Improve dimensional modeling skills (I do some of this already)
  • Actually learn how to do work with SSAS
  • PowerPivot
  • Get better at talking to non-geeks!

There are lots more topics than the few listed here, too. However, BI in and of itself is a pretty wide field, so the specific topics that would need attention would be dictated by the flavor of BI work I was doing and my involvement on such a team.

Actually making this stuff happen

Like all resolutions, this is, of course, the hard part. Making this happen isn’t going to be easy for me, either, but the rewards should be very well worth the effort required. Unfortunately, since knowing is half the battle, and I don’t have that taken care of yet, the near-term has the potential to be pretty rough.

At the absolute bare minimum, I resolve to do my best to take care of “knowing” in the first quarter of the year.

(Also, I’m sorry about that screen resolution thing at the beginning; couldn’t help myself)