
Clustered index and non-clustered index

deep_blue(BLUE)

It seems that many people don't fully understand what a clustered index is and why to use one.

Both clustered and non-clustered indexes can be used to improve performance. A clustered index stores the table's rows themselves in index key order, on linked pages (in effect, it physically sorts the table), so a table can have only one clustered index: you cannot physically sort the same data in two or more orders. If a table has a clustered index, looking up data relies on it. Otherwise the table is a heap, and lookups rely on IAM (Index Allocation Map) pages.
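To make the difference concrete, here is a toy sketch in Python, not how SQL Server actually stores data: a real clustered index is a B-tree of pages, while this just keeps a list sorted and uses binary search. The row data and the function names are made up for illustration.

```python
import bisect

# Hypothetical rows (customer_id, name), in the order they were inserted.
rows = [(42, "Ann"), (7, "Bob"), (19, "Cid"), (3, "Dee")]

def heap_lookup(heap, key):
    """Heap: rows are in no useful order, so every row may have to be checked."""
    for row in heap:
        if row[0] == key:
            return row
    return None

# "Clustered" table: the rows themselves are kept sorted by the key.
clustered = sorted(rows)
clustered_keys = [row[0] for row in clustered]

def clustered_lookup(table, keys, key):
    """Binary search on the key order instead of scanning every row."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return table[i]
    return None

print(heap_lookup(rows, 19))
print(clustered_lookup(clustered, clustered_keys, 19))
```

Both calls find the same row; the point is only that the sorted layout lets the lookup skip most of the table, and that the rows can be kept in just one such order at a time.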

Non-clustered indexes can be created on a table whether it is a clustered table or a heap.
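A rough Python sketch of why this works in both cases: a non-clustered index is a separate sorted structure whose entries point back to the rows. In SQL Server the pointer is a row identifier for a heap and the clustering key for a clustered table; the list position and the dict used here are simplified stand-ins, and all names and data are hypothetical.

```python
import bisect

# Hypothetical rows (customer_id, name), stored as a heap in arrival order.
heap = [(42, "Ann"), (7, "Bob"), (19, "Cid"), (3, "Dee")]

# Non-clustered index on name over the heap: sorted (name, row position) pairs.
heap_name_index = sorted((name, pos) for pos, (_, name) in enumerate(heap))

# The same rows as a clustered table keyed on customer_id.
# A dict stands in for the key-ordered structure here.
clustered = dict(heap)

# Non-clustered index on name over the clustered table: (name, clustering key) pairs.
clustered_name_index = sorted((name, cid) for cid, name in clustered.items())

def seek(index, name):
    """Return the locator stored for name, or None if the name is not indexed."""
    i = bisect.bisect_left(index, (name,))
    if i < len(index) and index[i][0] == name:
        return index[i][1]
    return None

def find_in_heap(name):
    pos = seek(heap_name_index, name)
    return None if pos is None else heap[pos]

def find_in_clustered(name):
    cid = seek(clustered_name_index, name)
    return None if cid is None else (cid, clustered[cid])

print(find_in_heap("Cid"))
print(find_in_clustered("Cid"))
```

Either way the index is searched first and the row is fetched second; only the kind of pointer stored in the index differs.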

Although INSERT operations are said to be quicker on a heap than on a clustered table, a clustered table has many benefits over a heap. In most cases, people should choose a clustered table.

In addition, although creating indexes can improve performance, maintenance (defragmentation) is also important. After frequent inserts, deletes, and updates, a table becomes fragmented, which causes poor performance. Defragmenting it solves the problem.
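One way to picture fragmentation is with a toy page model in Python. This is only a sketch under simplifying assumptions: the page capacity of 4, the split-in-half rule, and the function names are invented, and it models only half-empty pages, not pages ending up out of physical order.

```python
PAGE_CAPACITY = 4  # hypothetical number of rows that fit on one page

def insert(pages, key):
    """Insert key into the page covering its range, splitting the page if it overflows."""
    idx = 0
    for i, page in enumerate(pages):
        if page and page[0] <= key:
            idx = i
    page = pages[idx]
    page.append(key)
    page.sort()
    if len(page) > PAGE_CAPACITY:
        mid = len(page) // 2
        pages[idx:idx + 1] = [page[:mid], page[mid:]]  # page split

def rebuild(pages):
    """Defragment: repack all keys into full pages, in key order."""
    keys = [key for page in pages for key in page]
    return [keys[i:i + PAGE_CAPACITY] for i in range(0, len(keys), PAGE_CAPACITY)]

def fill_ratio(pages):
    """Fraction of the available page space that actually holds rows."""
    return sum(len(page) for page in pages) / (len(pages) * PAGE_CAPACITY)

pages = [[10, 20, 30, 40], [50, 60, 70, 80]]  # two full pages
insert(pages, 15)  # splits the first page
insert(pages, 55)  # splits the last page
rebuilt = rebuild(pages)
print(len(pages), fill_ratio(pages))
print(len(rebuilt), fill_ratio(rebuilt))
```

After the two inserts the same ten keys are spread over four partly empty pages; after the rebuild they fit in three, so a scan has fewer pages to read.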

Greg Robidoux's Clustered Tables vs Heap Tables explains these ideas very clearly.

(#8065415@0)
Last Updated: 2013-2-20


Replies, comments and Discussions:


The article is very SQL Server specific. I actually disagree with the author about the importance of the choice between a clustered table and a heap table. A designer should care more about the relationships in the data than about its physical arrangement. -geekcode(吉克码工); 2013-2-21 {899} (#8065939@0)
If you get your model right, the database can arrange the data appropriately, e.g. derive the correct clustered index. If you have to define a clustered index explicitly, you are more likely to create new problems.

It seems SQL Server is very bad at inferring the correct storage order, and it exposes the choice of clustered index directly to the data model designer. Even there, if you get your primary key right, you get your "clustered index". Then you have a very portable and upgradable design.

You might argue that a primary key is the most common way to define a clustered index in SQL Server. But what's the point of this discussion then? A database has to decide how to store data. Some databases generate an internal "_rowId" as an internal "clustered index" if they cannot find a suitable unique index. In those databases, you always have a "clustered index" whether or not you define any index.

I think that discussing which database is better is completely off topic. I believe the actual question is whether we should use clustered tables or heap tables. -deep_blue(BLUE); 2013-2-21 (#8066105@0)
