Which is the performance killer: SELECT INTO or INSERT INTO?

There are many ways to kill performance in a script or stored procedure.  However, not many think about the possibility that adding columns to a temporary table can kill performance.  Here, I’m going to show you how a simple design decision – using INSERT vs ALTER TABLE – can have a huge impact on your procedure performance.

This comes from a procedure we wrote for Minion Reindex to greatly improve your index maintenance by allowing you to have some impressive dynamic routines without any extra jobs.

INSERT vs ALTER TABLE – SELECT INTO and ALTER TABLE

We recently wrote a helper procedure to allow you to limit the rows for your reindex runs with Minion Reindex.  We got it working just like we wanted it, and it returned in just a couple seconds.  The problem is that sometimes it would take a couple seconds, and others it would take a couple minutes.  Then it started always taking two minutes, and we weren’t able to get the performance back down at all.

We added indexes, we ran with recompile, we limited our result sets, etc.  We did everything you usually do to fix the performance of a stored procedure.  Then I noticed that it was loading data with a SELECT INTO instead of INSERT/SELECT.  What the code actually did was, it created a table with some rows and then added a couple columns and then updated those rows.

Here’s a look at the actual code:

            SELECT  
                    F.DBID
                  ,   F.TableID AS   objectID
                  ,   F.IndexID
                  ,   F.DBName
                  ,   F.SchemaName
                  ,   F.TableName
                  ,   F.IndexName
            INTO    #IndexTableFrag
            FROM    Minion.IndexTableFrag  AS   F
            INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS   ExecutionDateTime
                                  ,   DBName
                          FROM      Minion.IndexTableFrag
                         GROUP BY   DBName
                        )  AS   M
                     ON F.ExecutionDateTime   = M.ExecutionDateTime
                        AND F.DBName   = M.DBName
            WHERE   F.DBName   LIKE @DBName;
	   
	   /*	Minion.IndexTableFrag has a separate ExecutionDateTime for each database. 
			So, this INNER JOIN gets the most recent date for each DB.
		*/

----------------------------------
-- 2. Get the most recent page count and row count information from Minion.IndexPhysicalStats.

            ALTER TABLE #IndexTableFrag ADD page_count BIGINT NULL;
            ALTER TABLE #IndexTableFrag ADD record_count BIGINT NULL;

            CREATE NONCLUSTERED INDEX ix14587684759 ON #IndexTableFrag (DBID, ObjectID, IndexID);

            UPDATE  I
            SET     I.page_count =   S.page_count
                  ,   I.record_count =   S.record_count
            FROM    #IndexTableFrag  AS   I
            INNER JOIN  Minion.IndexPhysicalStats  AS   S
                     ON I.DBID   = S.database_id
                        AND I.objectID  = S.object_id
                        AND I.IndexID   = S.index_id
            INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS   ExecutionDateTime
                                  ,   database_id
                          FROM      Minion.IndexPhysicalStats
                         GROUP BY   database_id
                        )  AS   M
                     ON S.ExecutionDateTime   = M.ExecutionDateTime
                        AND S.database_id   = M.database_id;

So the synopsis is: Select Into, add two columns, update those two columns.  That’s it.  And most of the time was spent on the alter table statements.

I’ll go ahead and say that each of these tables only has about 13,000 rows in them, so it’s not like we’re dealing with millions of rows here.  This is the part of the procedure that was taking so long and what needed to be corrected.

INSERT vs ALTER TABLE – INSERT

Now that we’ve seen the original poor-performing code, let’s take a look at the fixed code.

CREATE TABLE #IndexTableFrag(
	[objectID] [bigint] NOT NULL,
	[DBName] [varchar](400) NOT NULL,
	[SchemaName] [varchar](400) NOT NULL,
	[TableName] [varchar](400) NOT NULL,
	[IndexID] [bigint] NOT NULL,
	[IndexName] [varchar](400) NOT NULL,
	page_count BIGINT NULL,
	record_count BIGINT NULL
) 
----------------------------------
-- 1. Get the most recent index fragmentation information from Minion.IndexTableFrag.
DBCC TRACEON(610);
			INSERT INTO  #IndexTableFrag
			(ObjectID, DBName, SchemaName, TableName, IndexID, IndexName)
            SELECT  -- F.ExecutionDateTime
                    F.TableID AS    objectID
                  ,    F.DBName
                  ,      F.SchemaName
                  ,      F.TableName
                  ,      F.IndexID
                  ,      F.IndexName
            
            FROM    Minion.IndexTableFrag  AS      F
            INNER JOIN  ( SELECT    MAX(ExecutionDateTime) AS      ExecutionDateTime
                                  ,      DBName
                          FROM      Minion.IndexTableFrag
             GROUP BY               DBName
                        )  AS      M
                     ON F.ExecutionDateTime      = M.ExecutionDateTime
                        AND F.DBName      = M.DBName
            WHERE   F.DBName LIKE @DBName;
	   
	   /*	Minion.IndexTableFrag has a separate ExecutionDateTime for each database. 
			So, this INNER JOIN gets the most recent date for each DB.
		*/

----------------------------------
-- 2. Get the most recent page count and row count information from Minion.IndexPhysicalStats.

			CREATE NONCLUSTERED INDEX ix14587684759 ON #IndexTableFrag (DBName,SchemaName,TableName,IndexName);

            UPDATE  I
            SET     I.page_count =      S.page_count
                  ,      I.record_count =      S.record_count
            FROM    #IndexTableFrag  AS      I
            INNER JOIN  Minion.IndexPhysicalStats  AS S
                     ON I.DBName = S.DBName
                        AND I.SchemaName = S.SchemaName
                        AND I.TableName = S.TableName
                        AND I.IndexName = S.IndexName
            INNER JOIN  ( SELECT MAX(ExecutionDateTime) AS ExecutionDateTime
                                  , DBName
                          FROM      Minion.IndexPhysicalStats
             GROUP BY               DBName
                        )  AS      M
                     ON S.ExecutionDateTime      = M.ExecutionDateTime
                        AND S.DBName      = M.DBName;

Here we’ve replaced the SELECT INTO with an explicit temp table creation and an INSERT.  This allows us to name the columns we want to insert, saving the two columns at the end to be updated afterward.  See, we’re creating the table with those two columns that were added in the previous code so we can bypass that ALTER TABLE step from above.

 

INSERT vs ALTER TABLE – Bulk Inserts

You may be wondering why a full-blown bulk insert in the form of a SELECT INTO isn’t faster than a logged INSERT.  Well the answer is two-fold.

First, it’s not really the SELECT INTO that was slow, but the ALTER TABLE statements, so that part remains solid.

Second, I am getting my bulk insert from my INSERT statement because not only is the database in simple mode, but you’ll notice I’ve also turned on trace flag 610 (gives me the ability to do bulk inserts when there’s a clustered index).  Now, there’s not currently a clustered index on that temp table, but I don’t want any mistakes in the future.

You may find it hard to believe, but you actually can’t control bulk insert behavior anyway.  I’ll write another blog on that, with a nice discussion on bulk inserts.

 

Download our FREE maintenance modules:

  • Minion Backup: Free SQL Server Backup utility with world-class enterprise features and full lifecycle management.
  • Minion Reindex: Free SQL Server Reindex utility with unmatched configurability.
  • Minion CheckDB: Free SQL Server CHECKDB that solves all of your biggest CHECKDB problems.