Sorting and Grouping Operations

For SELECT statements that include a GROUP BY or ORDER BY specification, the SQL optimizer performs two separate optimization passes. The first pass restricts the choice of usable access methods to only those that produce or maintain the specified ordering. For example, an index scan retrieves its results in the order specified in the CREATE INDEX declaration. If the results match the specified ordering, they are included as a usable access method. This optimization pass is fast because, typically, very few plans produce the desired ordering without performing an external sort of the result set.

If a plan is produced by the first pass, it is saved (along with its cost estimate), and a second optimization is performed without the ordering restriction. An estimate of the cost required to sort the result set, based on the optimizer's estimate of the result set's size, is added to the cost of the plan produced by the unrestricted pass. From the two plans, the optimizer will choose the one with the lowest cost.

The estimate of the sort cost is based on the optimizer's cardinality estimate, the length of the sort key, and the sort index page size. The optimizer will calculate the number of I/Os as two times the number of index pages to store the sort index (one pass to create the page and another to read each page in order) and add the number of result rows.

Note that if both the GROUP BY and ORDER BY clauses are specified, only the GROUP BY ordering can be satisfied by existing indexes and joins. A separate sort of the result set will always be required for the ORDER BY clause. If there is no index to satisfy the specified GROUP BY, then two sort passes will be needed.

A user can force the optimizer to always select the restricted ordering plan by specifying USING KEY at the end of the ORDER BY or GROUP BY clause. Thus, if a restricted order plan exists and USING KEY is specified, that plan will be executed. If not, an external sort of the result set may still be performed.