Tuesday, February 24, 2009

SQL Server Memory Usage With Covered Indexes

In my last blog I talked about clearing memory caches to free up memory. Most of the time this is a bit of a hack. The true solution lies in shrinking how much data actually goes into memory, which falls upon the size of your data structures. In particular how much data can you fit into a SQL Server data page. SQL Server maintains a one to one relation between it’s data pages on disk, and the data pages brought into memory. It’s also an all or nothing deal, meaning if you need one column of 4 bytes from one row, you need to bring the entire 8KB data page up into memory. In this blog I’ll talk about how to use covered indexes to maximize these data pages, so you’re not using memory on unneeded datum.

Before getting going I want to provide a data setup script as well as memory consumption query that will be used through out this post. The memory consumption query is a knock off of the one found in BOL under sys.dm_os_buffer_descriptors, with a couple of tweaks.

--Data Setup

Create table Comments
(pkcomment int identity(1,1),
CommentType int,
CommentDate datetime,
CommentText char(2000),
CONSTRAINT Pk_intT1 PRIMARY KEY CLUSTERED (PkComment))


Declare @int int
set @int = 1
While @int <>
Begin
insert into Comments(CommentType, CommentDate, CommentText) select datepart(ms, getdate()), Getdate(), replicate('A', 2000)
set @int = @int + 1
end

--Memory Consumption Script

SELECT count(*)AS cached_pages_count,
(count(*) * 8)/1024 as SizeInMb, name ,index_id
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
WHERE database_id = db_id()
GROUP BY name, index_id
ORDER BY cached_pages_count DESC


Now that we’re ready to go, let me explain, what we have. The newly created comments table is a fairly common structure for holding user comments. The bulk of the table is the comment itself with some metadata to describe it. CommentType might map to; Question, Comment, Authors reply, Etc, while commentDate is the Date of the comment. In this current setup the root pages of the clustered index is capable of holding only 3 rows per data page, because of the large size of the comments field. Before starting let’s flush the memory.

checkpoint
dbcc Dropcleanbuffers


Assume that in the application one of the calls aggregates the count of this table, by type and date. Notice in this query that text is never referenced.


select count(*), CommentType, CommentDate from Comments
group by CommentType, CommentDate

Once complete run the memory consumption query. You’ll notice 23279 data pages or 181MB’s of memory are being consumed here. The ironic part is most of the data allocated is not needed. This is where covering your index comes in. By covering your index you’re creating a b-tree data structure with data only required to fullfill the data requirements of one or more specific queries. Let me preference the last statement in that your index will contain the values of your primary key for reference. Below is the statement to create the index.

create index coveredtest on Comments(CommentType, CommentDate)

Once Complete rerun the following Queries, by clearing the memory, running the aggregate statement, then running the memory consumption script. The consumption script is left out below to save space.


checkpoint
dbcc Dropcleanbuffers

select count(*), CommentType, CommentDate from Comments
group by CommentType, CommentDate

Not only did the query return considerably faster but the data page usage is down to 211 or 2 MB. Over the entire server this will become a significant savings. Before ending this, I need to add fair warning, as know your application before implementing. If modifications are occuring regularly, or your comments are being queried in such a way that the commentText column is being read regularly, then it does belong in memory, and your savings will not be that significant, and might actually hurt your memory pool. On that my last word in the real world you’d use a varchar over a char, but this was easiest to make a point.

No comments: