Wednesday, February 27, 2013

"Cost-free" joins - 1

Recently I came across some interesting edge cases regarding the costing of joins. They all have in common that they result in (at first sight) unexpected execution plans, but only some of them are actual threats to performance.

Outer Joins


The first one is about outer joins with an extreme data distribution. Consider the following data setup:

create table t1
as
select
        rownum as id
      , rpad('x', 100) as filler
      , case when rownum > 1e6 then rownum end as null_fk
from
        dual
connect by
        level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't1')

create table t2
as
select
        rownum as id
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 1e6
;

exec dbms_stats.gather_table_stats(null, 't2')

create /*unique*/ index t2_idx on t2 (id);

The following query is a simple outer join between T1 and T2 and the default, unhinted execution plan that I get from 11.2.0.1 (11.1.0.7 and 10.2.0.4 show similar results):

select
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.null_fk = t2.id (+)
;

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |  1000K|   202M|  4204   (1)| 00:00:51 |
|   1 |  NESTED LOOPS OUTER          |        |  1000K|   202M|  4204   (1)| 00:00:51 |
|   2 |   TABLE ACCESS FULL          | T1     |  1000K|   101M|  4202   (1)| 00:00:51 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T2     |     1 |   106 |     2   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T2_IDX |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("T1"."NULL_FK"="T2"."ID"(+))

The optimizer preferred a Nested Loop join albeit the fact that the number of estimated loop iterations is pretty large. Notice in particular the cost column: Although the inner rowsource is estimated to be started 1000K times, the cost of doing so corresponds to just a single execution.

For reference here is a cost estimate for a similar operation that corresponds to the expected costing model:

select  /*+ use_nl(t1 t2) */
        t1.filler as t1_filler
      , t2.filler as t2_filler
from
        t1
      , t2
where
        t1.id = t2.id (+)
;

---------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |  1000K|   202M|  3006K  (1)| 10:01:21 |
|   1 |  NESTED LOOPS OUTER          |        |  1000K|   202M|  3006K  (1)| 10:01:21 |
|   2 |   TABLE ACCESS FULL          | T1     |  1000K|   101M|  4200   (1)| 00:00:51 |
|   3 |   TABLE ACCESS BY INDEX ROWID| T2     |     1 |   106 |     3   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN          | T2_IDX |     1 |       |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("T1"."ID"="T2"."ID"(+))

I now had to force the Nested Loop join via a hint, because by default other join methods were preferred by the optimizer. The cost of a single iteration of the loop has now increased to 3, although the access is exactly the same as before (T2 random table access via index lookup of T2.ID), and the cost of the Nested Loop join corresponds to the known formula: Estimated starts times the cost of a single iteration, which is 3000K in this case here, plus the 4200 of the Full Table Scan for accessing the outer row source, plus CPU costing overhead.

So what makes the difference between the two? It's the data. The column name chosen for the column in T1 already reveals what's so special: The join column used (NULL_FK) is actually NULL for all rows.

The optimizer takes this into account and assumes that none of those rows from the driving row source will actually match a row of the inner row source - in fact the lookup to the inner row source could be short-circuited in some way, since a NULL value by definition isn't supposed to find a match for this join. I haven't investigated to what extent the runtime engine does this, however in the Rowsource Statistics the inner row source is started the expected number of times, although no logical I/O is recorded for it, but some CPU time, so at least some work seems to be done there.

Modifying the test case so that more of the FKs are actually non-null shows that the cost calculation is scaled accordingly. In fact the cost calculation is more or less the same than that of a corresponding inner join that could filter out those driving rows with NULL values in the join column.

The overall performance of the execution plan is quite decent, so although it looks quite unusual it performs pretty well.

In the second part I'll show another interesting, unexpected join execution plan that however can cause real performance problems.

Wednesday, February 20, 2013

QB_NAME hint query block name length limitation

Oracle 10g introduced the QB_NAME hint that can come handy in case hints need to be applied to more complex statements, in particular when possibly multiple layers of views / subqueries are involved.

Jonathan Lewis has a older blog post that describes more details.

Just in case you wonder why sometimes apparently the QB_NAME hint - along with all other hints that refer to the assigned query block name - seems to be ignored: One possible reason is that it looks like there is an undocumented length limitation of the query block names that can be assigned - 20 characters seem to be the maximum possible (I haven't checked the effect of multi-byte database character sets).

Consider this simple example:

drop table t1;

purge table t1;

create table t1
as
select
        rownum as id
      , rpad('x', 100) as filler
from
        dual
connect by
        level <= 100
;

exec dbms_stats.gather_table_stats(null, 't1')

By default Oracle 11.2.0.1 unnests the correlated subquery and runs an anti-join for the following simple query:

select 
       *
from
       t1
where
       not exists
       (
         select 
                null
         from
                t1 t1_inner
         where
                t1_inner.id = t1.id
       );

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|*  1 |  HASH JOIN ANTI    |      |
|   2 |   TABLE ACCESS FULL| T1   |
|   3 |   TABLE ACCESS FULL| T1   |
-----------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T1@SEL$1
   3 - SEL$5DA710D3 / T1_INNER@SEL$2

Notice the aliases and query block names assigned, like SEL$1, SEL$2 etc.

Using the QB_NAME hint the NO_UNNEST hint can be applied to the correlated subquery from the outer query block:

select /*+ no_unnest(@nested_query) */
       *
from
       t1
where
       not exists
       (
         select /*+ qb_name(nested_query) */
                null
         from
                t1 t1_inner
         where
                t1_inner.id = t1.id
       );

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|*  1 |  FILTER            |      |
|   2 |   TABLE ACCESS FULL| T1   |
|*  3 |   TABLE ACCESS FULL| T1   |
-----------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$1       
   2 - SEL$1        / T1@SEL$1
   3 - NESTED_QUERY / T1_INNER@NESTED_QUERY

The correlated subquery is now executed using a FILTER operator as requested. Notice the aliases and query block names, in particular for the instance of T1 at operation id = 3

Now if I repeat the same query, but use a longer query block name, the hints are effectively ignored and the default unnesting takes place again:

select /*+ no_unnest(@nested_query_longer_name) */
       *
from
       t1
where
       not exists
       (
         select /*+ qb_name(nested_query_longer_name) */
                null
         from
                t1 t1_inner
         where
                t1_inner.id = t1.id
       );

-----------------------------------
| Id  | Operation          | Name |
-----------------------------------
|   0 | SELECT STATEMENT   |      |
|*  1 |  HASH JOIN ANTI    |      |
|   2 |   TABLE ACCESS FULL| T1   |
|   3 |   TABLE ACCESS FULL| T1   |
-----------------------------------
 
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
 
   1 - SEL$5DA710D3
   2 - SEL$5DA710D3 / T1@SEL$1
   3 - SEL$5DA710D3 / T1_INNER@SEL$2

Notice how the default aliases and query block names get used although explicitly hinted.

It is probably rather uncommon to use such lengthy query block names, nevertheless it can be puzzling when hitting such an undocumented limitation.