Thursday, July 9, 2009

Indexing: Clustered vs. Non-clustered

Here’s what I’ve learned, but take it all with a grain of salt . . . In attending conferences & through discussions, it sounds like indexes are still a highly debated topic, but here’s what I’ve got so far:

* Every table can have only one clustered index and up to 200+ nonclustered indexes

* Every table should have a clustered index

* A clustered index is typically a fairly unique reference to a single row in the dataset, but it doesn’t have to be unique – it can be created as “Create Clustered Index Idx1 on t1(c)”

* Clustered indexes are usually surrogate keys or the columns on which tables join (e.g. PersonNum) – when doing a index seek, the server looks for the clustered index

* Clustered indexes are a sorting of the data

* Non-clustered indexes can be on anything useful :-)

* A non-clustered index works by keeping a “index” of which clustered index contains the non-clustered index information. When doing a index seek, the server looks at the non-clustered index which points it to which clustered index record(s) contain that information.


The typical analogy is that a clustered index is similar to page numbers (“page 3”) where as a nonclustered index is similar to a book index (e.g. ICD-9’s: “lumbosacral sprains is found on page 327”).

Issues currently being debated include: Should the clustered index be on the business key or the surrogate key? For incremental loading should the clustered index be on fields like loadTS? How many is too many indexes? . . . . There doesn’t seem to be any clear answers at the conferences & forums. So, we’re not alone in being unsure.

Here's a simple sample of indexing a temp table:

create table #ProductNumbers
(ProductID int,
ProductName varchar(50),
QuantityPerUnit varchar(50),
UnitPrice numeric 15,2)

create clustered index ProductIDIndex on #ProductNumbers(ProductID)
create nonclustered index ProductNameIndex on #ProductNumbers(ProductName)

insert into #ProductNumbers (ProductID, ProductName, QuantityPerUnit, UnitPrice)

Select
ProductID,
ProductName,
QuantityPerUnit,
UnitPrice
from Northwinds.dbo.Products
where Discontinued = 'True'

No comments:

Post a Comment