After running a fairly hefty query, the execution plan gave me a missing index suggestion which was of the form:
(Timestamp) INCLUDE (CustomerID, EventID, ID, EmployeeID)
Which seems to be a covering index (the INCLUDE column are all either primary keys (ID) or foreign keys). However, my querys WHERE clause is filtering by Timestamp, CustomerID, and EventID. I don't know why these weren't included in the main part of the index.
So my question is, is there any difference in using the suggested index above, or what I think is a better alternative;
(Timestamp, CustomerID, EventID) INCLUDE (ID, EmployeeID)
My understanding is that this will still allow Timestamp-only index seeking, but will also further assist my query by having the customer and event IDs (which are filtered) in the main part.
I think this was something to do with the width of the 'main' part - FYI, Timestamp is a datetime2(0), CustomerID is an int, and EventID is a byte.
I am testing this myself at the moment, but this is a HUGE table - over 1,000,000,000 rows - and it is taking time to compare the indexes. That, and I'd like to learn more about this.
Timestamp is very selective, possible even unique. As such there is no point in adding other fields to the key, they will only increase the size of the key without contributing to the selectivity. Having them as INCLUDED columns instead allows them to be added only to the leaf pages, saving on the overall size of the index.
select top(1000) count(*) as cnt, Timestamp from ... group by Timestamp order by cnt desc;
What does the above return?
So my question is, is there any difference in using the suggested index above, or what I think is a better alternative...
The missing-index suggestions made by the optimizer are opportunistic and relevant only to the particular query concerned. The optimizer goes through an index analysis phase, where it might note the absence of a covering index it didn't find. These suggestions are not intended to be a replacement for a full workload-representative DTA session, much less proper index design by a skilled database practitioner based on wide knowledge of the data and critical queries.
The suggestions should always be reviewed, as you have done, to ensure an optimal set of indexes for all queries is created - not one covering index per query as could be the case if the suggestions were followed literally.
There are naturally implications when widening the keys of an index compared with using
INCLUDE column, some of which have been noted by others. I personally prefer to
INCLUDE the clustering keys explicitly where they are useful. Clustered indexes can be changed, and it is rare that the person performing this change would check to see if any queries were relying on the implicit behaviour.
Changing columns from
INCLUDE to keys may also affect update query plans (overall shape and Halloween Protection requirements) and there are logging implications where keys of an index might change too.
I would probably choose to modify the suggestion as you have done, but I would be careful to validate update (= insert/update/delete/merge) query plans for the affected table.