Index Access Retrieval
The cost of an indexed access retrieval depends on the relational expression on which the access is based. The cost estimate computations for the each of the optimizable relational expressions are as follows.
- Equality Conditionals
Indexed access retrieval allows retrieval of an individual row or set of matching rows, based on the value of one or more columns contained in a single index. These values can be specified in the query directly or through a join predicate.
For a unique index, the cost to access a single row is equal to the depth of the index's B-tree (seldom more than 4 ) + 1 (to read the row from the data file). For a non-unique index, the cost is based on an estimate of the average number of rows having the same index value derived from number of distinct column values. The percentage of the table's rows that match the specified equality constraint is the restriction factor (R). Thus, the estimate of number of matching rows is equal to the cardinality of the table multiplied by the restriction factor, or:
number of matching rows = C * R
The cost estimate (in logical page reads) of an indexed access retrieval is equal to the number of index pages that must be accessed plus the number of matching rows (1 logical page read per row), or:
Eeq = Cost of index access for column = value
= D + (C * R)/(.7 * K) + (C * R)
This assumes that each index page is an average of 70% full (D = depth of B-tree, K = maximum number of keys per index page). Note that this formula works for both unique and non-unique indexes (for unique indexes, R = 1/C).
- IN Conditionals
When the IN operator is used, the restriction factor is equal to the sum of the equality restriction factors for each of the listed values. Thus, the cost is simply the sum of the costs of the individual values.
Elist = Cost of index access for column in (v1, v2, ..., vn)
= SUM(cost(column = vi)) for all i: 1..n
- Inequality Conditionals
Indexed scans use an index to access the rows satisfying an inequality relational expression involving the major column in the index. The estimate of the cost of an index scan is calculated exactly the same as the indexed access method. The restriction factor is calculated as given in Table 2 and Table 3.
Eineq = Cost of index access for inequality relational expressions
= D + (C * R)/(.7 * K) + (C * R)
- LIKE Conditionals
Index scans are also used to access rows satisfying LIKE expressions that compare the major column of an index with a literal string pattern. The restriction factor is calculated from the histogram values that match the specified pattern. If no matches are found, it is then calculated from average of the five lowest frequency counts in the histogram.
Two types of scans are employed depending on the position of the first wild card character (for example, "%" or "_") in the pattern. If the pattern starts with a wild card character, the entire index will be scanned and each key value will be compared with the specified pattern. Only those keys that match will be returned. The cost of this scan is equal to the cost of reading each index page plus the cost of reading the row associated with each matching index value as given in the following formula:
Elike-full = D + (C/(.7 * K)) + (R * C)
The cost typically will be much less when the pattern does not begin with a wild card. This allows the SQL system to position within the index to those values having the same prefix (consisting of all characters up to the first wild card).
Elike-prefix = D + (C * R)/(.7 * K) + (C * R)
Note that this is identical to the cost of an equality indexed access (although the restriction factor will be greater in this case).