Monday, June 6, 2011

Transitive Closure - Outer Joins

The Cost Based Optimizer (CBO) supports since at least Oracle 9i the automatic generation of additional predicates based on transitive closure.

In principle this means:

If a = b and b = c then the CBO can infer a = c


As so often with these optimizations the purpose of these automatically generated additional predicates is to allow the optimizer finding potentially more efficient access paths, like an index usage or earlier filtering reducing the amount of data to process.

So far I was aware of such additional predicates only when literals were involved, so if "a = 10 and a = b" then Oracle will automatically add "b = 10" (and in Oracle 9i remove actually the "a = b" predicate so you end up with "a = 10 and b = 10" but no longer "a = b", see Jonathan Lewis on transitive closure and cartesian merge join).

According to MOS document "Transitivity and Transitive Closure [ID 68979.1]" Oracle 11g is supposed to support three kinds of transitive closure. In addition to the already mentioned join / literal predicate case these are the transitivity of join predicates without any literals involved ("a = b and b = c" then "a = c") and the literal predicate case applied to outer joins ("a = 10 and a = b(+)" then "b(+) = 10" which means that the filter will be applied logically to b before the join, not after).

However I couldn't witness yet any working example of the pure join predicate transitive closure mentioned in the MOS article.

On the OTN forums there was recently a question about partition pruning not taking place but the case there really turns out to be about transitive closure and outer joins (although the partitioning and parallel execution involved in the thread probably helps to confuse the issue).

It looks like that Oracle does not apply transitive closure across outer joins in the sense of:

a = 10 and a = b(+) and b = c(+)

then Oracle will add automatically the "b(+) = 10" predicate as outlined above but it will not add "c(+) = 10" across the second outer join (it would add a "c = 10" predicate if this were inner joins).

There might be a valid reason to prevent this from happening in general (the most appealing to me seems that "a = b (+) and b = c(+)" is not equal to "a = b(+) and a = c(+)") but at least in this particular case I don't see why the "c(+) = 10" predicate should not be added automatically.

If anyone sees a valid explanation for this behaviour (viz: a general rule that gets violated when doing so) I'm open for suggestions.

If the expression is changed into

a = 10 and a = b(+) and a = c(+)

then Oracle happily adds both the b(+) = 10 and c(+) = 10 predicates. Of course you will appreciate that the changed expression is semantically not the same as the previous one and the results might differ.

The case on the OTN thread is actually a bit more interesting since it includes an outer join from one table to two other tables which Oracle in general does not support directly using its native Oracle joins but only with the help of LATERAL views - there is a good explanation here by the Oracle Optimizer Group.

The interesting part stripped to a bare minimum looks like this:


set echo on

drop table t;

purge table t;

create table t
/*
partition by list (pkey)
(
partition p_1 values (1)
, partition p_2 values (2)
, partition p_3 values (3)
)
*/
as
select
rownum as id
, mod(rownum, 3) + 1 as pkey
, rownum as id2
, rpad('x', 100) as filler
from
dual
connect by
level <= 30000
;

exec dbms_stats.gather_table_stats(null, 't')

alter session set tracefile_identifier = 'trans_closure_outer_join';

alter session set events '10053 trace name context forever, level 1';

explain plan for
select
*
from
t t1
left outer join
(select * from t where id >= 4000) t2
on
t1.id = t2.id
and t1.pkey = t2.pkey
left outer join
(select * from t where id >= 2000) t3
on
t2.id2 = t3.id2
and t1.pkey = t3.pkey
where
t1.pkey = 2
order by
t1.id
;

alter session set events '10053 trace name context off';


So we outer join T3 to T1 and T2. Although the PKEY predicate is actually coming from T1 and therefore on its own would definitely qualify for transitive closure even with the documented restriction in place (that would apply if t2.pkey = t3.pkey was used instead) the additional predicate doesn't get generated obviously due to the additional outer join between T2 and T3 (and we end up with a lateral view due to this when looking into the execution plan and generated CBO trace file).

No comments:

Post a Comment