Introduction
As DBA's, we encounter long running queries that end-users compalin about. Indexing strategy is one area of database where developers usually tend to create indexes on the fly without much consideration for or analysis on how an index that helps one query marginally can slow down a bunch of other queries or stored procedures drastically. Some (not all) of the situations that we usually come across are following:
Details of the Stored procedure
The text of the store procedure is given below:
If exists (select 1 from sys.procedures
where USER_NAME(schema_id) = 'dbo'
and name = 'dbp_Find_Index_Properties_by_Column'
and type = 'P')
begin
drop procedure dbo.dbp_Find_Index_Properties_by_Column
end
go
create procedure dbo.dbp_Find_Index_Properties_by_Column
@i_Table_Name varchar(64) = null
,@i_Column_Name varchar(24) = null
,@l_Column_Position_in_Index tinyint = null
as
*******************************************************************************************************************************
Created by: Suresh K Maganti
Purpose : Find if a column is used in an index. View additional properties of the index.
For now, I have included only clustered and non-clustered indexes.
Input
Parameters: If provided, the result set is appropriately filtered. Else, the complete result set is returned.
********************************************************************************************************************************/
set nocount on;
with CTE_Base (Table_Name, Index_Name, Leftmost_Column_in_Index, Position_in_Index, Columns_in_the_Index, Index_Type, Is_Primary_Key, Is_The_Index_Unique, Fill_Factor_When_Created, File_Group_Name, Is_It_A_Filtered_Index, Filter_Definition, Is_An_Included_Column, Is_Compressed, Type_Of_Compression_Used)
as
( select C.Name Table_Name, b.name Index_Name
,col_name(b.[object_id], (select e.column_id from sys.index_columns e
where e.index_column_id = 1
and e.[object_id] = b.[object_id]
and e.index_id = b.index_id)) Leftmost_Column
,case
when @i_Column_Name is null then null
else a.key_ordinal
end Position_in_Index
,replace((select col_name(f.[object_id], f.column_id) + ' ###' 'data()'
from sys.index_columns f
where f.[object_id] = a.[object_id]
and f.index_id = a.index_id
order by f.key_ordinal asc
for XML Path ('')), '###', ',') Columns_in_the_Index
,case b.Type
when 0 then 'Heap'
when 1 then 'Clustered'
when 2 then 'Nonclustered'
when 3 then 'XML'
when 4 then 'Spatial'
end Index_Type
,case b.is_primary_key
when 1 then 'Yes'
else 'No'
end Is_Primary_Key
,case b.is_unique
when 1 then 'Yes'
else 'No'
end Is_The_Index_Unique
,b.fill_factor Fill_Factor_When_Created
,FILEGROUP_NAME (b.data_space_id) File_Group_Name
,case b.has_filter
when 1 then 'Yes'
else 'No'
end Is_It_A_Filtered_Index
,b.filter_definition
,case a.is_included_column
when 1 then 'Yes'
else 'No'
end Is_An_Included_Column
,case
when d.data_compression > 0 then 'Yes'
else 'No'
end Is_Compressed
,d.data_compression_desc Type_Of_Compression_Used
from sys.tables c left join sys.index_columns a
on c.[object_id] = a.[object_id]
inner join sys.indexes b
on a.[object_id] = b.[object_id]
and a.index_id = b.index_id
inner join sys.partitions d
on b.[object_id] = d.[object_id]
and b.index_id = d.index_id
where b.[type] in (1, 2)
and a.key_ordinal = isnull(@l_Column_Position_in_Index, a.key_ordinal)
and OBJECTPROPERTY (b.[object_id], 'IsSystemTable') = 0
and OBJECTPROPERTY (b.[object_id], 'IsTable') = 1
and col_name(b.[object_id], a.column_id) = ISNULL(@i_Column_Name, col_name(b.[object_id], a.column_id))
and c.name = ISNULL(@i_Table_Name, c.name))
select distinct Table_Name, Index_Name, Leftmost_Column_in_Index, Position_in_Index
,substring(Columns_in_the_Index, 1, LEN(Columns_in_the_Index) - 1) Columns_in_the_Index
,Index_Type, Is_The_Index_Unique, Fill_Factor_When_Created, File_Group_Name, Is_It_A_Filtered_Index, Filter_Definition, Is_An_Included_Column, Is_Compressed, Type_Of_Compression_Used
from CTE_Base
order by Table_Name
set nocount off;
go
For now, I have included only clustered and non-clustered index including filtered indexes.
This stored procedure basically uses a few catalogue views:
and a few system functions like:
The rest are all basic joins.
Now we will get to the usage of this stored procedure.
Usage
For simplicity, I have used Microsoft's Adventureworks database in the examples below.
Execution 1:
Please execute the following TSQL code without any input parameters:
execute dbo.dbp_Find_Index_Properties_by_Column
Figure 1
Results for all clustered and non-clustered indexes in the Adventureworks database are displayed.
Execution 2:
Please execute the following TSQL code. It will display the results for all tables and their indexes in the Adventureworks database the column AddressID is a part of:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 2
Execution 3:
Now execute the following TSQL statements to change the fillfactor on the index Purchasing.VendorAddress.IX_VendorAddress_AddressID from 80 to 50:
use Adventureworks
go
alter index IX_VendorAddress_AddressID on Purchasing.VendorAddress rebuild with (statistics_norecompute = on, fillfactor = 50, Online = ON, maxdop = 8, sort_in_tempdb = on, allow_page_locks = on)
go
Now execute the statement below to see if the change in fill factor for this index has been captured:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 3
As you have noticed in the screenshot above, the stored procedure displays 50 (Figure 3) intead of 80 (Figure 2) for the fill factor for this index.
Execution 4:
Now let us add a secondary filegroup named SecondaryFileGroup to the database Adventureworks, add a data file to this new filegroup, then drop the index Purchasing.VendorAddress.IX_VendorAddress_AddressID and finally recreate it on the new filegroup, SecondaryFileGroup using data compression and a filter on the AddressID column. The same is accomplished by executing the following code. Please note that you may have to change the path of the new data file in the code below as per you custom configured disc layout:
alter database Adventureworks add filegroup SecondaryFileGroup
go
alter database Adventureworks add file (Name = Adventureworks_2, filename = 'E:\Data3\MSSQL\Data\Adventureworks_2.ndf', size = 10MB, Maxsize = 100MB, filegrowth = 10%) to filegroup SecondaryFileGroup
go
drop index Purchasing.VendorAddress.IX_VendorAddress_AddressID
go
create index IX_VendorAddress_AddressID on Purchasing.VendorAddress (AddressID)
where AddressID is not null with
(statistics_norecompute = on, fillfactor = 50, data_compression = Page, Online = ON, maxdop = 8, sort_in_tempdb = on, allow_page_locks = on) on SecondaryFileGroup
go
Now execute the statement below to see if the changes for this index have been captured:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 4
The changes have been captured.
Execution 5:
Lets check to see on which tables the column, AddressID is being used in an index while occurring at the second position.
Please execute the statement below to find this out:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', 2
Figure 5
AddressID occurs as the second column in the three indexes on the three tables shown in Figure 5 above.
Summary:
The stored procedure dbo.dbp_Find_Index_Properties_by_Column can be used to find various aspects of how a column is involved within an index. This SP also provides details on various indexes which can then be used to refine existing indexing strategy based on various application stored procedures. I haven't delved deeper into performance optimization with respect to indexing strategy in this article. I have only attempted to provide an easier starting piont for a preliminary analysis.
As usual, please let me know of your thoughts and comments.
As DBA's, we encounter long running queries that end-users compalin about. Indexing strategy is one area of database where developers usually tend to create indexes on the fly without much consideration for or analysis on how an index that helps one query marginally can slow down a bunch of other queries or stored procedures drastically. Some (not all) of the situations that we usually come across are following:
- Dbo.Orders table is being joined to dbo.Order_Details table but both of them exist on the same filegroup and on the same set of underlying discs. This would cause disc queuing and the consequent slowness.
- Dbo.Orders table is being joined to dbo.Order_Details table on the common column Product_ID but Product_ID is the leftmost colmn of a non-clustered index on dbo.Orders but is the third column in another non-clustered index on dbo.Order_Details. So these indexes don't help this join much.
- Updates are taking place on a column that has a non-clustered index that is compressed. It would be good not to have an index at all on a column being updated. In case such an index is really required, not compressing it would be beneficial. Otherwise the update is much slower than when the index is uncompressed.
- A majority of values in a particular column used in the WHERE clause of a query have a value but the query is designed to fetch only such rows where the value is null. May be a filtered-index would have helped speed it up.
- Whether the two tables being joined are physically resident on different filegroups.
- The columns being joined are the leftmost columns in their respective indexes.
- The column being updated is not part of an index (first of all) that is compressed in any way (definitely not).
- In case, a query is looking for a small subset of values from a column, is the column a part of an index that has the appropriate filter defined on the index.
- Etc.
Details of the Stored procedure
The text of the store procedure is given below:
If exists (select 1 from sys.procedures
where USER_NAME(schema_id) = 'dbo'
and name = 'dbp_Find_Index_Properties_by_Column'
and type = 'P')
begin
drop procedure dbo.dbp_Find_Index_Properties_by_Column
end
go
create procedure dbo.dbp_Find_Index_Properties_by_Column
@i_Table_Name varchar(64) = null
,@i_Column_Name varchar(24) = null
,@l_Column_Position_in_Index tinyint = null
as
*******************************************************************************************************************************
Created by: Suresh K Maganti
Purpose : Find if a column is used in an index. View additional properties of the index.
For now, I have included only clustered and non-clustered indexes.
Input
Parameters: If provided, the result set is appropriately filtered. Else, the complete result set is returned.
********************************************************************************************************************************/
set nocount on;
with CTE_Base (Table_Name, Index_Name, Leftmost_Column_in_Index, Position_in_Index, Columns_in_the_Index, Index_Type, Is_Primary_Key, Is_The_Index_Unique, Fill_Factor_When_Created, File_Group_Name, Is_It_A_Filtered_Index, Filter_Definition, Is_An_Included_Column, Is_Compressed, Type_Of_Compression_Used)
as
( select C.Name Table_Name, b.name Index_Name
,col_name(b.[object_id], (select e.column_id from sys.index_columns e
where e.index_column_id = 1
and e.[object_id] = b.[object_id]
and e.index_id = b.index_id)) Leftmost_Column
,case
when @i_Column_Name is null then null
else a.key_ordinal
end Position_in_Index
,replace((select col_name(f.[object_id], f.column_id) + ' ###' 'data()'
from sys.index_columns f
where f.[object_id] = a.[object_id]
and f.index_id = a.index_id
order by f.key_ordinal asc
for XML Path ('')), '###', ',') Columns_in_the_Index
,case b.Type
when 0 then 'Heap'
when 1 then 'Clustered'
when 2 then 'Nonclustered'
when 3 then 'XML'
when 4 then 'Spatial'
end Index_Type
,case b.is_primary_key
when 1 then 'Yes'
else 'No'
end Is_Primary_Key
,case b.is_unique
when 1 then 'Yes'
else 'No'
end Is_The_Index_Unique
,b.fill_factor Fill_Factor_When_Created
,FILEGROUP_NAME (b.data_space_id) File_Group_Name
,case b.has_filter
when 1 then 'Yes'
else 'No'
end Is_It_A_Filtered_Index
,b.filter_definition
,case a.is_included_column
when 1 then 'Yes'
else 'No'
end Is_An_Included_Column
,case
when d.data_compression > 0 then 'Yes'
else 'No'
end Is_Compressed
,d.data_compression_desc Type_Of_Compression_Used
from sys.tables c left join sys.index_columns a
on c.[object_id] = a.[object_id]
inner join sys.indexes b
on a.[object_id] = b.[object_id]
and a.index_id = b.index_id
inner join sys.partitions d
on b.[object_id] = d.[object_id]
and b.index_id = d.index_id
where b.[type] in (1, 2)
and a.key_ordinal = isnull(@l_Column_Position_in_Index, a.key_ordinal)
and OBJECTPROPERTY (b.[object_id], 'IsSystemTable') = 0
and OBJECTPROPERTY (b.[object_id], 'IsTable') = 1
and col_name(b.[object_id], a.column_id) = ISNULL(@i_Column_Name, col_name(b.[object_id], a.column_id))
and c.name = ISNULL(@i_Table_Name, c.name))
select distinct Table_Name, Index_Name, Leftmost_Column_in_Index, Position_in_Index
,substring(Columns_in_the_Index, 1, LEN(Columns_in_the_Index) - 1) Columns_in_the_Index
,Index_Type, Is_The_Index_Unique, Fill_Factor_When_Created, File_Group_Name, Is_It_A_Filtered_Index, Filter_Definition, Is_An_Included_Column, Is_Compressed, Type_Of_Compression_Used
from CTE_Base
order by Table_Name
set nocount off;
go
For now, I have included only clustered and non-clustered index including filtered indexes.
This stored procedure basically uses a few catalogue views:
- sys.partitions
- sys.tables
- sys.index_columns
- sys.indexes
and a few system functions like:
- OBJECTPROPERTY
- FILEGROUP_NAME
- COL_NAME
The rest are all basic joins.
Now we will get to the usage of this stored procedure.
Usage
For simplicity, I have used Microsoft's Adventureworks database in the examples below.
Execution 1:
Please execute the following TSQL code without any input parameters:
execute dbo.dbp_Find_Index_Properties_by_Column
Figure 1
Results for all clustered and non-clustered indexes in the Adventureworks database are displayed.
Execution 2:
Please execute the following TSQL code. It will display the results for all tables and their indexes in the Adventureworks database the column AddressID is a part of:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 2
Execution 3:
Now execute the following TSQL statements to change the fillfactor on the index Purchasing.VendorAddress.IX_VendorAddress_AddressID from 80 to 50:
use Adventureworks
go
alter index IX_VendorAddress_AddressID on Purchasing.VendorAddress rebuild with (statistics_norecompute = on, fillfactor = 50, Online = ON, maxdop = 8, sort_in_tempdb = on, allow_page_locks = on)
go
Now execute the statement below to see if the change in fill factor for this index has been captured:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 3
As you have noticed in the screenshot above, the stored procedure displays 50 (Figure 3) intead of 80 (Figure 2) for the fill factor for this index.
Execution 4:
Now let us add a secondary filegroup named SecondaryFileGroup to the database Adventureworks, add a data file to this new filegroup, then drop the index Purchasing.VendorAddress.IX_VendorAddress_AddressID and finally recreate it on the new filegroup, SecondaryFileGroup using data compression and a filter on the AddressID column. The same is accomplished by executing the following code. Please note that you may have to change the path of the new data file in the code below as per you custom configured disc layout:
alter database Adventureworks add filegroup SecondaryFileGroup
go
alter database Adventureworks add file (Name = Adventureworks_2, filename = 'E:\Data3\MSSQL\Data\Adventureworks_2.ndf', size = 10MB, Maxsize = 100MB, filegrowth = 10%) to filegroup SecondaryFileGroup
go
drop index Purchasing.VendorAddress.IX_VendorAddress_AddressID
go
create index IX_VendorAddress_AddressID on Purchasing.VendorAddress (AddressID)
where AddressID is not null with
(statistics_norecompute = on, fillfactor = 50, data_compression = Page, Online = ON, maxdop = 8, sort_in_tempdb = on, allow_page_locks = on) on SecondaryFileGroup
go
Now execute the statement below to see if the changes for this index have been captured:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', null
Figure 4
The changes have been captured.
Execution 5:
Lets check to see on which tables the column, AddressID is being used in an index while occurring at the second position.
Please execute the statement below to find this out:
execute dbo.dbp_Find_Index_Properties_by_Column null, 'AddressID', 2
Figure 5
AddressID occurs as the second column in the three indexes on the three tables shown in Figure 5 above.
Summary:
The stored procedure dbo.dbp_Find_Index_Properties_by_Column can be used to find various aspects of how a column is involved within an index. This SP also provides details on various indexes which can then be used to refine existing indexing strategy based on various application stored procedures. I haven't delved deeper into performance optimization with respect to indexing strategy in this article. I have only attempted to provide an easier starting piont for a preliminary analysis.
As usual, please let me know of your thoughts and comments.