Labels

Friday, January 9, 2009

06 - Indexes (DDL)

Hi,

 

Here we'll cover below –

 

-          Table

-          Constraints

-          View

-          SP

-          Functions

-          Index

-          Triggers

 

 

Indexes

 

·          Indexes assist with query processing by speeding up data access against tables and views.

·          Indexes are defined by selecting an index key, which is made up of one or more columns from the table or view.

·          If indexes do not exist for a table, that table is called a Heap; meaning the data pages are stored in no particular order. A heap referenced in queries results in a table scan. A table scan occurs when all rows in the table are evaluated against the required end-result (instead of just a subset of the data).

 

·          Two types of indexes, Clustered And Nonclustered.

·          SQL Server uses a B-Tree data structure for each clustered and nonclustered index.

·          When an index is used, SQL Server navigates from the Btree structure's root and traverses down to the leaf nodes.

 

Clustered Indexes

 

·          Determine the physical order of data in a table.

·          You can only have one clustered index for each table, as the data pages can only be physically stored in one way

·          After a clustered index is created, data pages are physically contiguous, which can assist with the speed of queries.

·          For a table with a clustered index, the leaf nodes are the data pages themselves.

 

·          Good candidates  - Columns that are

o         Queried often in range queries (Between, <, >)

o         Order large result sets,

o         Used in aggregate functions, and

o         Whose contents are distinct (primary or unique keys)

o         Lastly, Choose a Smaller Clustered Index Key over a wider column, reducing the size of the index needed and improving I/O operations.

 

·          Bad candidates  - Columns that are

o         Frequently updated columns and

o         Non-unique columns

 

 

Non-Clustered Indexes

 

·          It stores index pages separately from the physical data, with pointers to the physical data located in the index pages and nodes. In a way it contains poitner to the Clustered Index leaf pages

·          Nonclustered index columns are stored in the order of the index key column values.

·          You can have up to 249 nonclustered indexes on each table or indexed view.

·          For nonclustered indexes, the leaf node level is the Index Key Coupled To A Bookmark.

·          The nonclustered index bookmark points to the B-Tree structure of the table's clustered index (if one exists). If the base table is a heap, the nonclustered index bookmark points to the table row's Row-ID.

 

 

·          Good candidates  - Columns that are

o         Used in Where clause

o         Order smaller result sets on unique data

o         involved in searchable arguments using operators such as =, <, >, IN, EXISTS, and LIKE.

 

 

 

 

 

To Index or not to Index?

 

·          Indexes take up space, and should not be added 'just in case'. View Tools for testing the effectiveness of your indexes.

·          Indexes can slow down data loads significantly, and may not make sense for tables that are used expressly for data Imports And Staging.

·          If your table will not be queried and is used as an intermediate staging ground, leaving indexes off the table may be a better choice.

·          The same goes for very small tables where, no matter what, the query optimizer always performs a table scan.

·          Keep in mind that heaps become naturally fragmented over time and, if data updates are frequent enough, even small tables can begin to take up more space than necessary. There are those who believe that a clustered index should exist for every table in the database, period.

·          One technique for reducing bulk load operations on a table is to remove the index during the load and add the index once the load is finished

 

 

CREATE [ UNIQUE ] [ CLUSTERED | NONCLUSTERED ] INDEX index_name

ON { table | view } ( column [ ASC | DESC ] [ ,...n ] )

[ WITH < index_option > [ ,...n] ]

[ ON filegroup ]

< index_option > :: =

{ PAD_INDEX | FILLFACTOR = fillfactor | IGNORE_DUP_KEY | DROP_EXISTING | STATISTICS_NORECOMPUTE | SORT_IN_TEMPDB }

 

            CREATE CLUSTERED INDEX idx_BookID ON Books (iBookId)

       CREATE NONCLUSTERED INDEX idx_Name_Author ON Books (vchBookName, iAuthorId)

       CREATE NONCLUSTERED INDEX idx_Name_Author ON Books (vchBookName, iAuthorId) WITH FILLFACTOR = 75, PAD_INDEX     -- Specify a fill factor and leaving space on the index page

CREATE NONCLUSTERED INDEX idx_Name_Author ON Books (vchBookName, iAuthorId) WITH SORT_IN_TEMPDB ON FG2           -- Placing the index on a different file group and sorting in tempdb

 

           

PAD_INDEX

 

-          PAD_INDEX is used if FILLFACTOR is specified.

-          PAD_INDEX specifies the space to leave open on each index page.

-          If not  selected, SQL Server makes sure the index page has enough empty space for one row.

 

FILLFACTOR = fillfactor

 

-          The FILLFACTOR percentage determines how full the leaf level of the index pages should be when the index is first created.

-          Tables with high update-to-read ratios should consider modifying the FILLFACTOR to leave space for new rows; otherwise, excessive page splitting could occur.

-          A page split operation occurs when an index page  becomes full and SQL Server must split the rows of the index page to a new index page to make room for new rows.

-          Specifying both 0 and 100% translates to a leaf page that is 100%full.

-          When a fill factor is too low, database queries can suffer, as more data pages must be read to fulfill the query request.

 

IGNORE_DUP_KEY

 

-          This option can be used for unique clustered and unique

-          nonclustered indexes.

-          When set, warnings are issued when a duplicate row is added to the table, but the transaction is allowed.

-          If not used, an error is raised and the transaction is rolled back.

 

DROP_EXISTING

 

-          When chosen, an existing index is dropped and rebuilt.

 

STATISTICS_NORECOMPUTE

 

-          When set, statistics are not automatically recomputed after the index is finished. An explicit UPDATE  STATISTICS must be executed.

 

SORT_IN_TEMPDB

 

-          This option sorts results for the index build in tempdb, instead of in the table's database, potentially reducing the creation time (depending on the physical location of tempdb), but increasing the amount of space needed in tempdb for the sorting operation.

 

 

 

 

Hope this helps.

 

 

Thanks & Regards,

Arun Manglick || Tech Lead

No comments:

Post a Comment