Monday, October 3, 2011

SQL Server Shrinking Datafiles and Reindexing, Breaking the Cycle

This post is about shrinking data files in SQL Server, not the log file. They are two completely different beasts: shrinking your log file has little to no impact on your db, while shrinking data files has the potential to put you in a world of hurt. The pain you feel isn't the actual data shrinking, that's relatively harmless to the db, it's the after effects. When you shrink a data file it moves all the filled data pages to the front of the file, pushes the empty unused pages to the end, and truncates the empty space as its last act.
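For reference, the shrink itself is just a DBCC SHRINKFILE call; a minimal sketch, assuming a database named MyDB with a logical data file named MyDB_Data (both names are placeholders):

USE MyDB;
-- Shrink the data file down to a 10 GB target (the target size is in MB)
DBCC SHRINKFILE (MyDB_Data, 10240);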

What happens here is fragmentation of most of your indexes, particularly your larger, more dynamic indexes. When your data is fragmented, bad query plans occur, because SQL determines it's less expensive to do full scans rather than seeks, or in particular partial scans, on fragmented indexes. To solve this problem you need to rebuild your fragmented indexes. However, to rebuild an index SQL needs to carve out a significant amount of space in the data file for the new rebuilt copy. The result is that all the space you got back in the shrinking process gets used in the index rebuild, and often you're back where you started or even worse off.
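To see how badly a shrink has fragmented things, the sys.dm_db_index_physical_stats DMV gives per-index fragmentation; a minimal sketch for the current database (the 30%/1000-page thresholds are common rules of thumb, not from this post):

-- Indexes over 30% fragmented, skipping small ones where it doesn't matter
SELECT o.name AS table_name,
       i.name AS index_name,
       s.avg_fragmentation_in_percent,
       s.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') s
JOIN sys.indexes i ON i.object_id = s.object_id AND i.index_id = s.index_id
JOIN sys.objects o ON o.object_id = s.object_id
WHERE s.avg_fragmentation_in_percent > 30
  AND s.page_count > 1000
ORDER BY s.avg_fragmentation_in_percent DESC;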

This is the primary reason you hear all over the net: don't shrink your data files unless you need to. So the question is, when is shrinking appropriate? Deletion of data is the primary reason you would want to shrink a data file. A great example is some type of journaling or audit table that might have been taking inserts and/or updates for a year or multiple years. After the exploding growth it's determined that it can be trimmed, and moving forward a regular trim process is put in place. After the initial trim, however, a significant amount of space in the data file is unallocated and will never be used. If, for example, 50% of your data file is unused, it makes sense to shrink. Unfortunately, even in this situation you'll end up with the same shrink-then-reindex cycle as above, without gaining much ground.
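To put a number on how much of a data file is actually unused before deciding to shrink, FILEPROPERTY does the math; a minimal sketch, run inside the database in question:

-- Size vs. used space per data file, in MB (size and SpaceUsed are in 8 KB pages)
SELECT name,
       size / 128 AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS free_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';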

The solution here is to recreate the indexes of the large trimmed tables in another data file, using a separate filegroup. When you recreate an index this way, it leaves the entire footprint of the old index empty in the old data file. This gives you enough room to shrink and reindex, so you can gain back a significant amount of space. Before going into the procedures I need to add a warning: this should be done during a maintenance window. Recreating an index will lock access to the table, for both reads and writes, including dirty reads. Even if you're using Enterprise Edition, which allows for online rebuilds, this is not a rebuild; it's actually dropping and recreating the index.

Procedures
  1. Identify large tables (a query sketch for this and step 2 follows the list).
  2. Create a filegroup for each table from step 1. I prefer creating 2 files per filegroup, for future flexibility.
  3. For each of the clustered indexes on the tables identified, use CREATE INDEX with the DROP_EXISTING clause. In the example below I'm recreating a PK, so specifying "unique clustered" not only maintains the unique property but also maintains any relationships that reference the index.
Example
-- drop_existing = on drops and recreates the index in a single statement,
-- sort_in_tempdb = on keeps the sort work out of the data files,
-- and "on indexfg" places the rebuilt index in the new filegroup
create unique clustered index PK_T1_Index on
t1(f1)
with (drop_existing = on, sort_in_tempdb = on) on indexfg
  4. If you have other large indexes besides the clustered index, you might want to move them to the new filegroups as well. The more space you free up in the original file, the better off you are.
  5. Once the indexes are moved, shrink the old data file. Here you should be able to recapture plenty of space.
  6. After shrinking, reindex any fragmented indexes left in the original data file. Since these are smaller, any space added back by the reindexing will be minimal.
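A sketch of steps 1, 2, and 6, with placeholder names throughout (MyDB, journalfg, the file paths, and the index/table names are all assumptions):

-- Step 1: find the largest tables by space reserved
SELECT o.name AS table_name,
       SUM(p.reserved_page_count) / 128 AS reserved_mb
FROM sys.dm_db_partition_stats p
JOIN sys.objects o ON o.object_id = p.object_id
WHERE o.is_ms_shipped = 0
GROUP BY o.name
ORDER BY reserved_mb DESC;

-- Step 2: a new filegroup with two files to receive the relocated indexes
ALTER DATABASE MyDB ADD FILEGROUP journalfg;
ALTER DATABASE MyDB ADD FILE
    (NAME = journalfg_1, FILENAME = 'D:\Data\MyDB_journalfg_1.ndf', SIZE = 1GB),
    (NAME = journalfg_2, FILENAME = 'D:\Data\MyDB_journalfg_2.ndf', SIZE = 1GB)
TO FILEGROUP journalfg;

-- Step 6: rebuild an index that stays behind in the original data file
ALTER INDEX IX_SomeIndex ON dbo.SomeTable REBUILD WITH (SORT_IN_TEMPDB = ON);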
