×

Loading...

Clustered index and non- clustered index

It seems that many people don’t fully understand what clustered index is and why use it.

For performance purpose, clustered index and non- clustered indexes can be used. A clustered index actually creates a linked data structure for a table (or physically sorting the table); so only one clustered index can be used in a table (You cannot physically sort data in two or more orders). If the clustered index is created for a table, looping up data will rely on it. Otherwise, it relies on IAM (Index Allocation Map) pages.

The non-clustered indexes can be created for tables no matter it’s clustered table or heap.

Although it is said that INSERT operation is quicker in heap than in clustered table, the clustered table has many benefits over the heap. In most cases, people should choice clustered table.

In addition, though performance can be improved by creating indexes, maintenance (defragment) is also important. After inserting/deleting/updating frequently, a table becomes fragmented. It causes poor performance. Defragment then can solve the problem.

Greg Robidoux's Clustered Tables vs Heap Tables explains these ideas very clearly.
Sign in and Reply
Modify
Report

Replies, comments and Discussions:

  • 工作学习 / 学科技术讨论 / Clustered index and non- clustered index
    • The article is very SQL Server specific. I actually disagree with the author wrt the importance of the choice between clustered table and heap table. A designer should care more about the relationship of data rather than physical arrangement.
      If you get your model right, database can arrange data appropriately, e.g. derive the correct clustered index. If you have to define clustered index explicitly, you are more likely creating new problems.

      It seems SQL Server is very bad at inferring correct storage order, and exposes the choice of clustered index directly to data model designer. Even there, if you get your primary key right, you get your "clustered index". Then, you get a very portable and upgrade-able design.

      You might argue primary key is the most common way to define clustered index in SQL Server. But, what's the point of this discussion then? A database has to make decision on how to store data. Some databases generate internal "_rowId" as internal "clustered index", if they cannot find suitable unique index. In those database, you always have "clustered index" no matter if you define any index or not.
      • I think that discussing witch database is better is totally out of topic. And I believe the actual question is should we use clustered data or heap data.