In this article, I am
giving a quick overview about INDEX in SQL Server. After completing this
article you will understand:
What is meant by INDEX?
What are different types of INDEX in SQL Server?
How to create INDEX?
How to DROP INDEX?
How to Enable-Disable INDEX?
What is difference between clustered and non-clustered
indexes?
What are the disadvantages of indexes?
Please give your valuable
suggestions and feedback to improve this article.
What is meant by INDEX?
Index is a database
object which is used to improve query performance for data retrieval but it
also adds some burden on data modification queries such as INSERT, UPDATE &
DELETE. So it is advised to use correct Index type. If your table gets many
data modification request then it is advised not to have Clustered index on
such table, because it will slow down you DML operations. Index can be created
on one or more column (max 16 columns) on existing table or views. When we
create the index, it reads the column’s data and forms a relevant data
structure to minimize the number of data comparison. Indexes are automatically
maintained for a table or view whenever the table data is modified.
Let’s understand Index
with some real world examples.
Example 1
Suppose A table CUSTOMERS
has around 20 Million records. When we try to retrieve records for two or three
customers based on their customer id, all 20 Million records are taken and
comparison is made to get a match on the supplied customer id. Just think about
now how much time that will take if it is a web based application and there are
30-50 customers that want to access their data through internet.
Does the data server do
20 Million X 30 searches? The answer is NO because all modern database
use the concept of Index.
Example 2
Suppose you have a book
which has thousands of pages, Thousands of pages are divided into many
chapters. Each chapter has many pages and each page has related topic written,
if you want to read a particular topic then how do you find it in your book?
Answer is you just look into the Table of Content which is
always present in the beginning from Chapter 1.This Table of contents helps
you search your topics in all topics and then you jump to a particular page out
of those thousands pages. So this Table of Content is nothing
but Index for your book. You better know how helpful are those table of content
10-20 pages, if it was not there then it would have been very difficult to
search needed topic in those thousands of pages. Table of content will always
be created after completing the book.
Let’s relate this
understanding with Database, now you assume your Book as a
Database Table and Table of Content as Index. After creating
and populating a Table or view, you create index on it. Once you create an
index on a field, automatically another data structure will be created which
holds the field value and pointer to the record it relates to. This index
structure is then sort and stores the data rows in the database which allows
Binary search on it, which means these Indexes require additional space on the
disk, and get stored with a table using MYISAM engine.
Summary
1. Like an index in a
book, an index in a database lets you quickly find specific information in a
table or indexed view.
2. An Index contains
keys built from one or more columns in a Table, or view and pointers that map
to the storage location of the specified data.
3. These keys are
stored in structure (B-Tree) that enables SQL Server to find the rows
associated with the key values quickly and efficiently.
4. We can
significantly improve the performance of database queries and applications by
creating well designed indexes to support your queries.
5. Indexes can
reduce the amount of data that must be read to return the query result set.
6. Indexes can also
enforce uniqueness on the rows in a table, ensuring the data integrity of the
table data.
What are different types of INDEX in SQL Server?
1. Clustered Index
2. Non Clustered
Index
3. Unique
4. Filtered
5. Xml
6. Full Text
7. Spatial
8. Columnstore
9. Index with Included
Columns
10. Index on Computed
Columns
In this article we will
focus only on Clustered and Non Clustered Index.
How to create INDEX?
We can create Index on an
empty table or one that contains data, creating an index on an empty table has
no performance implications at the time of Index is creation, however
performance will be affected when data is added to the table.
Creating Index on large
table should be planned carefully so that database performance is not hindered.
The best way to create Index on large table is to first create Clustered Index
and then create any Non Clustered Index.
When a table has no
Clustered Index then its data will be stored in unsorted way and it will be
called as Heap. When you insert a new record in Heap, it will be inserted in
any random position in the table and position depends on the page free space.
If A Clustered index is created on a heap with several existing nonclustered
indexes, all the nonclustered indexes must be rebuilt so that they contain the
clustering key value instead of the row identifier (RID).
Syntax to Create Index
CREATE [UNIQUE] CLUSTERED
|NONCLUSTERED INDEX index_name ON table_name|view_name (Column ASC|DESC [,..n])
Uniqueness can be a
property of both clustered and nonclustered indexes. If UNIQUE and CLUSTERED
keywords are missing in CREATE INDEX command then it will create Non-Unique Non
Clustered Index.
Clustered Index
1. When Clustered
Index is created it first sorts the data rows and then stores the data rows in
the table or view based on their key values.
2. Whenever new
records come into the table, it gets fit to its position based on the key
values that is the reason INSERT, UPDATE and DELETE are slow on INDEXED table
because it will take some time to rearrange the data Pages.
3. When we create
Primary Key constraint in a Table and if Clustered Index is not present on the
table then it will automatically create Clustered Index on primary key column
with the same name as the Primary Key constraint has.
4. We can have only
1 Clustered index on a Table.
5. Reason for having
only 1 Clustered Index on a table is the data rows themselves can be sorted in
only one order either ASC or DESC.
6. When a table has
Clustered Index then it will be called as Clustered Table.
7. When creating the
Clustered Index, SQL Server reads the Primary Key column values and forms a
Binary Tree on it. This Binary Tree information is then stored separately in
the disc, you can find Index created on Primary Key in database Index Folder.
8. Clustered Index
can be created on Non - Primary Key columns. When we create Primary Key later
in that case it will not create Clustered index because it is already
available, it will create Unique Non Cluster Index.
9. Uniqueness is the
property of Clustered Index and Non Clustered Index.
10. When we drop the
Index, It leaves the data in sorted order and if new records come then it will
be inserted at the random position in the table.
11. Clustered Index
leaf nodes contains the Data Pages means row’s data.
Let’s go through with
some example to understand Clustered Index
Let’s create a table name
as EMP in our tutorial database TEACHMESQLSERVER
USE TEACHMESQLSERVER
IF OBJECT_ID('EMP','U') IS NOT NULL
DROP TABLE EMP
CREATE TABLE EMP(EMPID INT NOT NULL)
Let’s insert few records
in the EMP table.
INSERT INTO EMP VALUES(1),(3),(2),(0)
GO
SELECT * FROM EMP
As you can see EMPID was
not sorted and stored because it doesn’t have Clustered Index on EMP Table. I
have already discussed that we can also create Clustered Index on Non Primary
Key columns.
Let’s Create Clustered
Index on EMPID column in EMP table.
CREATE CLUSTERED INDEX EMPIDINDEX ON EMP (EMPID ASC)
As you can see now EMP table has one Clustered Index name
as EMPIDINDEX.
Now let’s retrieve all the records we had inserted last
time from EMP table.
GO
SELECT * FROM EMP
As you can see once after
creating Clustered Index we retrieved the data, it is coming in sorted form. It
is because at the time of Clustered Index creation we had given sorting Order
as ASC. Data can be sort either ASC or DESC. This is the main reason we can
have only one Clustered Index on a Table.
Now let’s try to create
one more Clustered Index. We already know the answer that SQL Server will not
allow to create another Clustered Index on EMP table.
CREATE CLUSTERED INDEX EMPIDINDEX1 ON EMP (EMPID ASC)
Here you go – got error
message
Msg 1902, Level 16, State
3, Line 1
Cannot create more than
one clustered index on table 'EMP'. Drop the existing clustered index
'EMPIDINDEX' before creating another.
Let’s try to insert new
records
INSERT INTO EMP VALUES (8)
INSERT INTO EMP VALUES(6)
GO
SELECT * FROM EMP
As you can see from the
above result output, records are coming in ASC order even after we inserted 8
before 6.
As I have already
discussed if our table already has Clustered Index and Primary key constraint
was not defined at the time of table creation, if we create Primary Key
constraints using Alter command on EMP table then it will never create a
Clustered Index, It will create a Unique Non Clustered Index.
Let’s create Primary Key
constraint on EMPID column.
ALTER TABLE EMP ADD CONSTRAINT PRIMARYKEY PRIMARY KEY (EMPID)
As you can see PRIMARYKEY
INDEX is a Unique Non Clustered Index.
You can find all Indexes
list in SQL Server from SYS.INDEXES table.
SELECT * FROM SYS.INDEXES
Dropping index will not
affect the sorted stored records. Let’s Drop the INDEX. I will discuss how to
DROP Index in this article, if DROP command looks strange here then don’t
worry.
DROP INDEX EMPIDINDEX ON EMP
ALTER TABLE EMP DROP CONSTRAINT PRIMARYKEY
GO
SELECT * FROM EMP
As you can see Result set
is same after dropping the Index. Now let’s insert new records in EMP table.
INSERT INTO EMP VALUES (7)
GO
SELECT * FROM EMP
As you can see from the result set new records were
inserted at the end not its key value position.
NonClustered Index
1. We can have many Nonclustered (max 999) index on a Table.
2. Uniqueness is
the property of Clustered Index and Non Clustered Index.
3. By default a
Nonclustered Index is created if Clustered or Unique options are not specified
in the Create Index command.
4. A Nonclustered
index never store data in sorted order if your table has more than 1 column,
ASC and DESC keywords will be ignored if used at the time of Nonclustered index
creation. If your table has only column then only it will store data in sorted
order.
5. A Nonclustered
Index is stored separately from the table, so additional space is required. The
amount of space required will depend on the size of table and the number and
types of columns used in index.
6. A Nonclustered
Index contains the nonclustered Index key values and each key value entry has a
pointer to the data row that contains the key value.
7. The pointer from
an index row in a nonclustered index to a data row is called a row locator.
8. The Structure of
the row locator depends on whether the data pages are stored in a heap or a
clustered table. For a heap, a Row locator is a pointer to the row. For a
clustered Index, the row locator is the clustered index key.
9. Whenever you
create constraint, nonclustered index will be created.
Let’s go through with
some example to understand Nonclustered Index
Let’s create a table name
as EMP in our tutorial database TEACHMESQLSERVER
USE TEACHMESQLSERVER
IF OBJECT_ID('EMP','U') IS NOT NULL
DROP TABLE EMP
CREATE TABLE EMP(EMPID INT NOT NULL, DEPID INT NOT NULL)
Let’s insert few records
in the EMP table.
INSERT INTO EMP VALUES(1,11),(3,5),(2,4),(0,7),(1,1)
GO
SELECT * FROM EMP
As you can see from the
above result set records are not stored in sorted order. It was stored in the
sequence it was written in query.
Let’s Create Nonclustered
Index on EMPID column in EMP table.
CREATE NONCLUSTERED INDEX EMPIDINDEX ON EMP (EMPID ASC)
GO
SELECT * FROM EMP
As you can see from the above image Non-Uique,
Non-Clustered index has been created.
Now let’s retrieve all the records we had inserted last
time from EMP table.
GO
SELECT * FROM EMP
As you can see from the
above result set, records were not stored in sorted ordered even after defining
the ordering type (ASC) at the time of EMPIDINDEX Index creation.
Now let’s create one more
Nonclustered Index on DEPID column in EMP table. We have already discussed that
we can create many nonclustered index on a table.
CREATE NONCLUSTERED INDEX DEPIDINDEX ON EMP (DEPID DESC)
GO
SELECT * FROM EMP
Now once again let’s
retrieve all the records we had inserted last time from EMP table.
GO
SELECT * FROM EMP
As you can see from the
above result set, records were not stored in sorted ordered even after defining
the ordering type (DESC) at the time of DEPIDINDEX Index creation. This clearly
shows that Nonclustered Index never stores the data in any order.
Let’s insert a new record
in EMP table
INSERT INTO EMP VALUES (-2,6)
GO
SELECT * FROM EMP
As you can see from the
above result set, new record was inserted at the end of all records.
What is difference
between clustered and non-clustered indexes?
1. A table
can have multiple non-clustered indexes. But, a table can have only
one clustered index.
2. A clustered
index can be a disadvantage because any time a change is made to a value of an
indexed column, the subsequent possibility of re-sorting rows to maintain order
is a definite performance hit.
3. A
Clustered Index is faster than a Nonclustered Index but in some condition Non-Clustered index would perform better. You can refer to this article.
4. If
Clustered Index is disabled then you can’t perform DML operation, but if
Nonclustered index is disabled then also you can perform DML operation.
5. Clustered
Index determines the Storage Order of rows in the table, hence doesn’t require
additional disk space but where as a Nonclustered Index is stored separately
from the table, so additional space is required.
6. A
clustered index determines the order in which the rows of the table will be
stored on disk (asc or desc) – and it actually stores row level data in the
leaf nodes of the index itself. A non-clustered index has no effect on which
the order of the rows will be stored.
7. Using a
clustered index is an advantage when groups of data that can be clustered are
frequently accessed by some queries. This speeds up retrieval because the data
lives close to each other on disk. Also, if data is accessed in the same order
as the clustered index, the retrieval will be much faster because the physical
data stored on disk is sorted in the same order as the index.
8. Non
clustered indexes store both a value and a pointer to the actual row that holds
that value. Clustered indexes don’t need to store a pointer to the actual row
because of the fact that the rows in the table are stored on disk in the same
exact order as the clustered index – and the clustered index actually stores
the row-level data in it’s leaf nodes.
How to DROP INDEX?
Implicitly created index
can’t be dropped using DROP INDEX command which means you can’t drop the
Indexes if Indexes were created automatically by any of Primary Key or Unique
constraint. To drop the index you have to drop the constraint it will automatically
delete the Index.
If a clustered index is
dropped on a table that has several nonclustered indexes, the nonclustered
indexes are all rebuilt as part of the DROP operation. This may take
significant time on large tables.
How to Enable-Disable
INDEX?
DROP INDEX index_name ON
table_name|view_name
Syntax to Disable Index
You can disable a
particular Index using below query syntax
ALTER INDEX Index_name ON
table_name|view_name DISABLE
You can disable all
available Index using below query syntax
ALTER INDEX ALL ON
table_name|view_name DISABLE
When you disable the
clustered index then you can’t perform DML (INSERT, UPDATE & DELETE)
operation. If you try to do then you will get below error message
Msg 8655, Level 16, State
1, Line 1
The query processor is
unable to produce a plan because the index ‘Index_name’on table or view
'table_name or View_name' is disabled.
You can perform DML
operation if your nonclustered index is disabled.
Syntax to Enable Index
You can enable a
particular Index using below query syntax
ALTER INDEX Index_name ON
table_name|view_name REBUILD
You can enable all
available Index using below query syntax
ALTER INDEX ALL ON
table_name|view_name REBUILD
What are the disadvantages
of indexes?
Additional Disk Space
– Clustered Index does not require any
additional storage. Every Non-Clustered index requires additional space as it
is stored separately from the table. The amount of space required will depend
on the size of the table, and the number and types of columns used in index.
Insert, Update and Delete Operation can
become Slow – When
DML statements modify data in a table, the data in all the indexes also needs
to be updated. Indexes can help, to search and locate the rows, that we want to
delete, but too many indexes to update can actually hurt the performance of
data modifications.
Click
here to know how to create
index on View.
I hope you enjoyed the article don't forget to share on Facebook. Please leave your comments below if there are any questions.
I hope you enjoyed the article don't forget to share on Facebook. Please leave your comments below if there are any questions.
3 comments:
Good effort but a few inaccuracies in the article -
Ordered return of data, even with a clustered index, is not guaranteed without an ORDER BY clause.
Please research more on these two statements from your article. The first one is not always true, NC can be faster than CI in many cases. The second one is incorrect because CI (or any physical construct) needs disk space -
3. A Clustered Index is faster than a Nonclustered Index.
5. Clustered Index determines the Storage Order of rows in the table, hence doesn’t require additional disk space
Thanks for your comment.
I will definitely look into your suggestions and I will also update the content if required.
Nyce thanks
Post a Comment