|
sql
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Covering indexes question
and DateSold columns, and I do a Select query that filters by the DateSold column only, will that index be used? Or do I need a separate index for only the DateSold column? Covering indexes are confusing, even though I have read Kalen and other people on the topic. At one time, I thought that if 5 columns of a table are searched with various queries, then, instead of creating 5 separate indexes, I should just create ONE index that uses all 5 columns... and whichever column was used in any query, the covering index would be enough. But somehow I don't think that's the right answer. Thanks for any clarification. David Walker You need to differentiate between seek and a scan. consider a
telephone directory (being the index) and the real world (being the data). You can seek for the last name. You can scan for a first name. As long as you are only interested in info which is in the directory, then it covers, but if you need eye colour, you need to visit data-pages. -- Show quoteHide quoteTibor Karaszi, SQL Server MVP http://www.karaszi.com/sqlserver/default.asp http://sqlblog.com/blogs/tibor_karaszi "David Walker" <n***@none.com> wrote in message news:Xns9B5B695DDE288DavidWalker@207.46.248.16... > If I have a nonclustered index in a Sales table that uses the > CustomerId > and DateSold columns, and I do a Select query that filters by the > DateSold > column only, will that index be used? > > Or do I need a separate index for only the DateSold column? > > Covering indexes are confusing, even though I have read Kalen and > other > people on the topic. > > At one time, I thought that if 5 columns of a table are searched > with > various queries, then, instead of creating 5 separate indexes, I > should > just create ONE index that uses all 5 columns... and whichever > column was > used in any query, the covering index would be enough. > > But somehow I don't think that's the right answer. > > Thanks for any clarification. > > > David Walker That index would be used, but the entire index would be scanned. For
example, suppose the table had 1,000,000 rows averaging 80 bytes each. Suppose also that the table has 20 years of data in it so that there are about 50,000 rows for each year. And your query is to get the count of sales for each customer which had at least one sale in 2005: Select CustomerID, Count(*) From Sales Where DateSold >= '20050101' And DateSold < '20060101' Then if you don't have a index that can be used, SQL will have to scan the whole table which is about 80MB (1,000,000 rows * 80 bytes per row). If you have a nonclustered index on CustomerID, DateSold, that index covers the query and can be used, however SQL will need to scan the entire index. But each leaf level entry in the index will take about 4 bytes for CustomerID (assuming it is an int) plus 8 bytes for the DateSold (assuming it is a datetime) plus the length of any clustered index key columns other than CustomerID and DateSold (let's assume that's another 4 bytes). So the total is 16 bytes for each leaf level entry. So the total size of the index will be on the order of 16MB (1,000,000 rows * 16 bytes per row) (actually more depending on how much empty space is in the index, but you get the idea). So this index will run this query much faster than if you didn't have the index. But if you have a nonclustered index on DateSold, then each leaf level entry takes 12 bytes (8 for DateSold and 4 for the clustered index key). And since all of the 2005 rows are together in the index, only the 50,000 rows for the year 2005 need to be scanned, so that is about 600,000 bytes (plus overhead). So this index is much better for this query than a nonclustered index on CustomerID, DateSold. Tom Show quoteHide quote "David Walker" <n***@none.com> wrote in message news:Xns9B5B695DDE288DavidWalker@207.46.248.16... > If I have a nonclustered index in a Sales table that uses the CustomerId > and DateSold columns, and I do a Select query that filters by the DateSold > column only, will that index be used? > > Or do I need a separate index for only the DateSold column? > > Covering indexes are confusing, even though I have read Kalen and other > people on the topic. > > At one time, I thought that if 5 columns of a table are searched with > various queries, then, instead of creating 5 separate indexes, I should > just create ONE index that uses all 5 columns... and whichever column was > used in any query, the covering index would be enough. > > But somehow I don't think that's the right answer. > > Thanks for any clarification. > > > David Walker Tom
Another option could be having a CI on DateSold and NCI on CustomerID..... to cover the query Show quoteHide quote "Tom Cooper" <tomcooper@comcast.no.spam.please.net> wrote in message news:uQYwv1oSJHA.3584@TK2MSFTNGP06.phx.gbl... > That index would be used, but the entire index would be scanned. For > example, suppose the table had 1,000,000 rows averaging 80 bytes each. > Suppose also that the table has 20 years of data in it so that there are > about 50,000 rows for each year. And your query is to get the count of > sales for each customer which had at least one sale in 2005: > > Select CustomerID, Count(*) > From Sales Where DateSold >= '20050101' And DateSold < '20060101' > > Then if you don't have a index that can be used, SQL will have to scan the > whole table which is about 80MB (1,000,000 rows * 80 bytes per row). > > If you have a nonclustered index on CustomerID, DateSold, that index > covers the query and can be used, however SQL will need to scan the entire > index. But each leaf level entry in the index will take about 4 bytes for > CustomerID (assuming it is an int) plus 8 bytes for the DateSold (assuming > it is a datetime) plus the length of any clustered index key columns other > than CustomerID and DateSold (let's assume that's another 4 bytes). So > the total is 16 bytes for each leaf level entry. So the total size of the > index will be on the order of 16MB (1,000,000 rows * 16 bytes per row) > (actually more depending on how much empty space is in the index, but you > get the idea). So this index will run this query much faster than if you > didn't have the index. > > But if you have a nonclustered index on DateSold, then each leaf level > entry takes 12 bytes (8 for DateSold and 4 for the clustered index key). > And since all of the 2005 rows are together in the index, only the 50,000 > rows for the year 2005 need to be scanned, so that is about 600,000 bytes > (plus overhead). So this index is much better for this query than a > nonclustered index on CustomerID, DateSold. > > Tom > > "David Walker" <n***@none.com> wrote in message > news:Xns9B5B695DDE288DavidWalker@207.46.248.16... >> If I have a nonclustered index in a Sales table that uses the CustomerId >> and DateSold columns, and I do a Select query that filters by the >> DateSold >> column only, will that index be used? >> >> Or do I need a separate index for only the DateSold column? >> >> Covering indexes are confusing, even though I have read Kalen and other >> people on the topic. >> >> At one time, I thought that if 5 columns of a table are searched with >> various queries, then, instead of creating 5 separate indexes, I should >> just create ONE index that uses all 5 columns... and whichever column was >> used in any query, the covering index would be enough. >> >> But somehow I don't think that's the right answer. >> >> Thanks for any clarification. >> >> >> David Walker > > David,
"Covering Indexes" are about increasing the performance of very specific and fequently used queries. The performance comes from allowing the server to fulfill the query entirely from the non-clustered index without having to process from the full body of data in the clustered index (or heap). The columns of the WHERE clause should be the first columns of the index and the columns of the SELECT clause should be the last columns of the index. If you need to frequently look up customer name by customer id, then you would want this index: CREATE INDEX CustIDName on XXX (CustID, LName, FName); If you need to frequently look up user names from e-mail addresses, then you would use this: CREATE INDEX EMailUName on XXX (EMail, UserID); As always, every index speeds SELECTs and slows INSERTs, so add indexes judiciously. Show quoteHide quote "David Walker" <n***@none.com> wrote in message news:Xns9B5B695DDE288DavidWalker@207.46.248.16... > If I have a nonclustered index in a Sales table that uses the CustomerId > and DateSold columns, and I do a Select query that filters by the DateSold > column only, will that index be used? > > Or do I need a separate index for only the DateSold column? > > Covering indexes are confusing, even though I have read Kalen and other > people on the topic. > > At one time, I thought that if 5 columns of a table are searched with > various queries, then, instead of creating 5 separate indexes, I should > just create ONE index that uses all 5 columns... and whichever column was > used in any query, the covering index would be enough. > > But somehow I don't think that's the right answer. > > Thanks for any clarification. > > > David Walker
Other interesting topics
Indexes - Clusteredand Non-clustered
Is it possible to empty and drop the orginal mdf file of the Primary filegroup? Query help with month\year selection Alerts on Error Numbers Can't install AdventureWorks - Full Text Search "missing" Data from remote tables: filtering? SQL 2008 Export to CSV. Numeric Fields quoted. WHY? Returning result set rows before query is completed? Removing Identity from column How to determin the last time a SQL 2000 database was written to |
|||||||||||||||||||||||