Which is the performance killer: SELECT INTO or INSERT INTO?

There are many ways to kill performance in a script or stored procedure.  However, not many think about the possibility that adding columns to a temporary table can kill performance.  Here, I’m going to show you how a simple design decision – using INSERT vs ALTER TABLE – can have a huge impact on your procedure performance.

This comes from a procedure we wrote for Minion Reindex to greatly improve your index maintenance by allowing you to have some impressive dynamic routines without any extra jobs.

INSERT vs ALTER TABLE – SELECT INTO and ALTER TABLE

We recently wrote a helper procedure to allow you to limit the rows for your reindex runs with Minion Reindex.  We got it working just like we wanted it, and it returned in just a couple seconds.  The problem is that sometimes it would take a couple seconds, and others it would take a couple minutes.  Then it started always taking two minutes, and we weren’t able to get the performance back down at all.

We added indexes, we ran with recompile, we limited our result sets, etc.  We did everything you usually do to fix the performance of a stored procedure.  Then I noticed that it was loading data with a SELECT INTO instead of INSERT/SELECT.  What the code actually did was, it created a table with some rows and then added a couple columns and then updated those rows.

Here’s a look at the actual code:

So the synopsis is: Select Into, add two columns, update those two columns.  That’s it.  And most of the time was spent on the alter table statements.

I’ll go ahead and say that each of these tables only has about 13,000 rows in them, so it’s not like we’re dealing with millions of rows here.  This is the part of the procedure that was taking so long and what needed to be corrected.

INSERT vs ALTER TABLE – INSERT

Now that we’ve seen the original poor-performing code, let’s take a look at the fixed code.

Here we’ve replaced the SELECT INTO with an explicit temp table creation and an INSERT.  This allows us to name the columns we want to insert, saving the two columns at the end to be updated afterward.  See, we’re creating the table with those two columns that were added in the previous code so we can bypass that ALTER TABLE step from above.

 

INSERT vs ALTER TABLE – Bulk Inserts

You may be wondering why a full-blown bulk insert in the form of a SELECT INTO isn’t faster than a logged INSERT.  Well the answer is two-fold.

First, it’s not really the SELECT INTO that was slow, but the ALTER TABLE statements, so that part remains solid.

Second, I am getting my bulk insert from my INSERT statement because not only is the database in simple mode, but you’ll notice I’ve also turned on trace flag 610 (gives me the ability to do bulk inserts when there’s a clustered index).  Now, there’s not currently a clustered index on that temp table, but I don’t want any mistakes in the future.

You may find it hard to believe, but you actually can’t control bulk insert behavior anyway.  I’ll write another blog on that, with a nice discussion on bulk inserts.

 

Download our FREE maintenance modules:

  • Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
  • Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
  • Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.

I’m the new DBA, and I’m locking down development servers

As the new DBA, I’m here to fix things for you. I’m going to take care of the things that keep causing fires and outages and slowdowns….and I’m locking down your dev environment. Here’s why.

 

Many times, I’ve started with a company as one of their first, if not the first DBA. They’ve acquired enough servers now, with enough data moving around, that they just can’t continue as they are.

There’s no shortage of developers to can write stored procedures or SSIS packages, or to write a new page for the website, but things just seem to be going further and further downhill. Pages take longer to load, SSIS packages take longer to run, production issues spring up all over the place, and so on. Above all, they want the environment locked down. Way too many people have way too many permissions to the databases… and that’s one of the main reasons for the problems.

So, they decide to bring in a DBA to help them sort it all out.

The DBA enters the scene

Ok, here I am. I’m the new DBA and things are exactly what they said they were. I start with the first task: limiting permissions. I mean, if you’re bleeding to death, you want to prevent new cuts, right? It’s a good place to start.

Start by asking all the teams what permissions they need to do their jobs. What a surprise! Everyone who response, reports that they need sysadmin! Weird, right?  There’s also a whole bunch of people that don’t even reply.

Next, I go to each one of them and ask what they actually do in the database. I get varied answers, but most of them amount to 1, running some kind of query to look up customer data, or 2, to run a report for something.  I manage to hold back my rant about the huge disparity between sysadmin and read permissions, quietly decide on read permissions for them, and move on.

Then I come to a development system. I find the main developer on that box and ask what it is that he does. He currently has sysadmin, and he doesn’t need that. The dev starts to get a little irate. He tells me that it’s his box and he needs sysadmin because who else would do all the stuff he does on it? Well, I will. That’s what DBA means. “But it’s a dev box,” he says. “I can’t see any reason why I can’t have sysadmin on a dev box. I understand why you took it away from me in production, but in dev? Give me one good reason!”

The road to giving up sysadmin

Ahhh, one good reason. Ok, here we go then. Now is my opportunity to take away his excuses to have sysadmin. So, I start with a short series of questions.

What happens to the database when you back it up?

Nothing, it gets backed up.

I notice you’re backing up to the same drive the DB is on. What happens if that disk goes down?

We’ll restore from backup.

How? The backup disk is down.

Oh yeah.

Now, even if the disk isn’t down, what happens if the backups fill up the disk and you can’t insert data you need for a unit test?

I have to free the space.

And what happens if someone deletes one of your tables your developing against? Do you have a local copy so you can easily re-create it on the server?

Sometimes, but mostly I create them directly on the server.

So what kind of pace do you keep here? Are you allowed to take as long as you want to dev something, or is it pretty fast-paced?

Oh no, we’re slammed all the time with bug fixes and new features. It’s constant.

And if someone runs a big query in the middle of you writing an SP, and the performance is so bad you have to stop and reboot the server? What about that? Have you planned for that in your daily work cycle?

No, that happens sometimes and I usually lose work. I don’t have time for that.

I’m your sysadmin now, because…

Right, so what you’re looking at here is a production server. This box, though you use it for development, is a production development server. Sure, it’s not customer-facing, but that doesn’t mean it’s not production. If any of those things happen, your current dev effort will be screwed. You may have taken two days to get something just right, only to have to start over. You don’t have time for those mistakes. Nobody does!

So no, you can’t have sysadmin on your own dev box, because it’s actually a production box.

I’m also taking over the management of the backups, so we can actually have something that’ll restore when we need it.

I’m going to put you on a schedule so everyone knows what can and can’t be restored.

I’m going to setup reindexing routines so your queries will hopefully keep going fast. There’s this fabulous set of tools called Minion that I’m going to use. They make this kind of thing a breeze.

So anyway, leave everything to me. You’ll be able to create objects in the database, but you won’t be able to drop anything. Let’s see how that goes, and then we’ll look at whether we’ll have to put you into resource groups to keep from maxing out the server and causing people to lose work.

Denouement

So you see, the moral of this story is that just because you’ve had sysadmin since you got there, and just because the box isn’t customer-facing, doesn’t mean it isn’t production. If you’re on a tight dev cycle and you can’t afford to lose work, the more you need permissions locked down…even on your own dev box.

 

Avoid this sp_executesql issue

If you don’t use this feature very often, you can come across some unexpected sp_executeSQL issues. This time we’ll look at an sp_excuteSQL output issue.

If you don’t use this feature very often, you can come across some unexpected sp_executeSQL issues.  In fact, I’ll probably use this blog the most because I don’t use it that often and I have to look up the specifics every time.

Let’s start with a code sample so we can see what happens when we run it.

Setting up the sp_executeSQL issue

Above you’ll see some really simple sp_executeSQL syntax.  All it does is to parse out the YEAR from the current date.  And you’ll see that I’m taking that result and putting it into a variable called @DynamicResult.  Finally in line 6 I select the variable to see my value.  But the results aren’t what I expect.  Take a look at the results and you’ll see what I mean:

sp_executesql issueThe problem here is clear; I’m getting two resultsets! The first one has the value I’m after, and the other one is NULL.  And looking at it, the NULL value is the one that’s supposed to be the variable.  So how do I get the value into my variable, and get this down to a single result set?  The answer is that you have to treat it like it’s not inside the sp_execute at all.

Understanding the sp_executeSQL issue

Here’s what I mean.  Let’s assume you’re not running this command in sp_executeSQL.  Let’s assume you just want to run it as a regular query like this:

Now this is something you’re a bit more familiar with.  You declare the variable, then you set it equal to the result of your function call.  Then you select it.  We have no problem filling our variable when we do it this way.  And that’s the way you have to do it inside your dynamic code to avoid this sp_executesql issue.  And just for completeness, here’s the result:
spexecuteresults3

Fixing the sp_executeSQL issue

It’s easy now to see what needs to be done.  All we have to do is make sure that we use that same syntax inside the procedure call and we’ll avoid this sp_executesql issue.  Take a look at the fix and you’ll see that it’s exactly like the code above where we just ran the straight statement.

If you didn’t notice, the difference is in the dynamic query itself.

Okay, that’s it.  All you have to do is make sure that you output that exact variable and you’re golden.  You can see that the variable I’m setting inside the dynamic SQL string is the same one I’ve got marked as OUTPUT.

These types of features in SQL are very handy but they all have pitfalls and nuances.  And learning how and when to use them can be tricky because the solution in one case won’t be the solution in another case.

Download our FREE maintenance modules below:

  • Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
  • Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
  • Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.

 

Using SQL Server OUTPUT Parameters

Let’s talk about using SQL Server OUTPUT parameters!

Last time we talked about where OUTPUT parameters are useful.  This time, let’s talk about using SQL Server OUTPUT parameters.  So let’s get started.

OUTPUT parameter syntax

For starters, a lot of people hate the OUTPUT parameter syntax.  They think it’s bulky and too much trouble.  Honestly, I used to think that too, only now I’ve got enough experience with them that it’s not that bad.  Or maybe I’ve been beat down by it so much that I don’t even notice anymore.  Well, whichever it is, here’s the syntax for using SQL Server OUTPUT parameters.

In the above code I’m creating a stored procedure named SP1.  That procedure has a single parameter that’s declared as an OUTPUT parameter.  This much you know from the last post.  Now, starting on line 16, I declare the variable that’s going to pick up the value.  I’ve given it a name that lets you know exactly what it’s going to be doing.

Here’s a screenshot of the code if you’d prefer:
OUTPUT parameters

OUTPUT Parameters have Backwards Syntax

And here’s where everybody’s hate for this topic kicks in.  I call the procedure in line 22 and I assign the variable to the OUTPUT parameter.  However, let’s take a step back and see why this is so strange to everybody.  When you normally assign a value to a variable, it usually goes left to right like this:

So you assign variables to values like in the code above… from left to right.  However, in the original code block, it’s backwards.  The value for @NameBackupFromSP comes from the left.  And this is essential to understanding how to use OUTPUT parameters.  When you assign them, the assignment is backwards from what you do in every other language.  This is a by-product of mixing methods and here, the stored procedure syntax had to win.  Here’s what I mean.  When you pass in values to stored procedures, you always assign the values from right to left, like you would expect.

So the above is what you would expect to see when doing regular assignment to a stored procedure.  And that’s what I mean when I said above that the stored procedure syntax had to win.  No matter what, when calling a stored procedure with parameters, you have to use this syntax (assuming you’re naming your variables, that is).  So when you go to use OUTPUT parameters, the same stored procedure syntax holds true.  Only now you’re signifying that you also want to get that value back out as well.

So we assign the @Name parameter the same way we do if it were a regular assignment, only now we use the OUTPUT keyword to tell SQL that we want to also get that value back out.  And of course, nothing is going to happen visibly unless we select the variable so we can see it.

That’s the basics of how to call a stored procedure with OUTPUT parameters.  Next time we’ll get into more details of how to use this feature.

Article 1: Learn SQL Server OUTPUT Parameters

Download our FREE maintenance modules:

  • Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
  • Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
  • Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.

Learn SQL Server OUTPUT Parameters

There are plenty of sites out there that show the syntax of using SQL Server OUTPUT parameters in your SPs, but not many take the time to explain the syntax in detail, or even why you should care. Here I’m going to show you why OUTPUT parameters are a great way to achieve your programming goals.

There are plenty of sites out there that show the syntax of using SQL Server OUTPUT parameters in your SPs, but not many take the time to explain the syntax in detail, or even why you should care.  Here I’m going to show you why OUTPUT parameters are a great way to achieve your programming goals.  This is going to be a multi-part series.

SQL Server OUTPUT Parameters: The Why

There are a couple reasons why you’d want to use OUTPUT parameters:

  1. You need to get a couple solid values out of the procedure.  This is a common requirement and something that a lot of people use functions for.  And let’s face it, functions are very handy for giving you scalar values.  However, functions also have some pretty strict guidelines; you can’t do anything that generates log activity, for example.  So in these cases, where you need scalar data, but a function won’t do, you can switch to a stored procedure with OUTPUT parameters.
  2. You’re trying to pass data from one procedure to another, but you’re running into the ‘nested INSERT…EXEC’ issue.  This issue happens when you have an INSERT…EXEC in say SP1, and then you call SP2 from SP1, but SP2 also contains an INSERT…EXEC.  SQL Server won’t allow this and believe me when I say it’s a huge pain because there are some things you simply can’t reasonable do without INSERT…EXEC.  So, so avoid this nesting of INSERT…EXEC, you can use OUTPUT parameters.

SQL Server OUTPUT Parameters: The How

Creating SQL Server OUTPUT parameters is easy, but the syntax isn’t intuitive at all.  So I’m going to talk you through the syntax of creating a procedure with an OUTPUT parameter, and then in the next post we’ll talk about how to call that procedure and get the values we need.

So again, SQL Server OUTPUT parameters are basically used in lieu of a function when either the function restrictions are too great, or when you just need better performance.

Therefore, to create a stored procedure with an OUTPUT parameter, you simply need to define it as OUTPUT in the parameter section like this:

CREATE PROCEDURE SP1
(
Name varchar(20),
Zip varchar(10),
Phone varchar(20) OUTPUT
)

AS

You can see that we have 3 INPUT parameters, and even though the OUTPUT is the only one that’s actually marked, the others are definitely INPUT parameters.  The default parameter direction is INPUT, so it doesn’t get marked.  Therefore, only the OUTPUT gets marked.  When we call this procedure we’ll pass in the Name and Zip we’re interested in, and we’ll get back the Phone.  So if I wanted more than one piece of information, then I would specify more than one OUTPUT parameter like this:

CREATE PROCEDURE SP1
(
Name varchar(20),
Zip varchar(10),
Phone varchar(20) OUTPUT,
CustomerID int OUTPUT
)

AS

SQL Server OUTPUT Parameters: The Practice

Here’s a screenshot from one of our procedures in our FREE maintenance modules to show you how this will look IRL.

SQL Server Output Parameters

Ok, that’s really all there is to creating a stored procedure with an OUTPUT parameter.  And next time we’ll talk about how to get the parameter value into your calling procedure.

Article 2: Using OUTPUT Parameters

You can download our FREE maintenance modules below:
Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.