Saturday, December 11, 2010

"Advanced Oracle Troubleshooting - Live Session" presentation details

As promised during my recent public appearances at SIOUG, DOAG and UKOUG, here are the details for my "Advanced Oracle Troubleshooting - Live Session" presentation, in particular the scripts used for this session.

You can download the presentation slides here, and the scripts archive here.

The scripts archive again contains the presentation slides, along with two subdirectories: one with the demo scripts used in the presentation and another with the tools demonstrated for troubleshooting.

There is a README file in the top directory of the archive that explains the prerequisites and the installation steps along with other details that are required for reproducing the demos, in particular which scripts need to be modified depending on the configuration and environment.

Each script is documented with a header comment that explains further details where required.

The presentation slides show which scripts were used for each demonstration - just check the script names on the corresponding "Demo Time!" slides.

Happy troubleshooting!

Sunday, November 7, 2010

Public Appearances - Past and Future

The Michigan OakTable Symposium (MOTS) in September was a great event with many outstanding speakers. It was a pleasure to spend time with my fellow Oakies and to meet many of them in person for the first time. My own two presentations about the Cost Based Optimizer ("Understanding System Statistics" and "Everything you wanted to know about FIRST_ROWS_N") were, I think, received pretty well.

My visit to SIOUG 2010 was also very nice, in particular meeting Christian Antognini and Joze Senegacnik again.

I'll be presenting at DOAG 2010 in mid-November in Nuremberg. This time my presentation will be given in German, which is interesting since I mostly present in English, even here in Germany, because there always seem to be some attendees who don't speak German...

My presentation is scheduled for 4 pm on Wednesday, November 17, in the room "St. Petersburg", and is called "Advanced Oracle Troubleshooting - Live Session". In this session I'll walk through some common and less common troubleshooting scenarios, in particular some scenarios where the usually helpful wait interface / session statistics / ASH / ADDM approach tends to be useless. This is mostly a live demonstration, along with a few slides describing the issue and the techniques applied.

I'll also be presenting at UKOUG 2010 at the end of November / beginning of December in Birmingham. I'll again be giving my "Understanding the different modes of System Statistics" presentation as well as the "Advanced Oracle Troubleshooting - Live Session". This time, however, that presentation will be split into Part 1 and Part 2, each taking 60 minutes, which allows me to show even more troubleshooting cases and also spend more time on the "basic" troubleshooting skills, which are probably more relevant in everyday life than the interesting but rare "advanced" stuff.

Sunday, October 31, 2010

11.2 New Features - Subtle Restrictions

This is just a short note that some of the new features added in 11.2 have subtle restrictions that are not properly documented.

1. Hybrid Columnar Compression and Function-Based Indexes

The Hybrid Columnar Compression (HCC) introduced in 11.2 - which in 11.2.0.1 is only enabled for tablespaces residing on Exadata storage - has the subtle restriction that the mere presence of a function-based index, either created explicitly or indirectly via an index on a virtual column, completely disables HCC on that segment for direct-path / parallel DML inserts. The load silently falls back to no compression at all, rather than using, for example, basic compression or OLTP compression instead.

This can be easily reproduced using the following simple test case:


set echo on timing on

spool hcc_function_based_indexes_testcase.log

drop table t_hcc_fbi purge;

-- Create table with HCC enabled
create table t_hcc_fbi compress for query low as select * from dba_objects where 1 = 2;

truncate table t_hcc_fbi;

-- Direct path inserts support HCC by default
insert /*+ append */ into t_hcc_fbi select * from dba_objects;

commit;

-- This is supposed to return DBMS_COMPRESSION.COMP_FOR_QUERY_LOW = 8
select distinct dbms_compression.get_compression_type(USER, 'T_HCC_FBI', rowid) from t_hcc_fbi;

-- So it is supported to have an FBI on HCC data
create index t_hcc_fbi_idx on t_hcc_fbi (lower(object_type));

truncate table t_hcc_fbi;

-- Let's make the FBI unusable, so it doesn't have to be maintained by the load
alter index t_hcc_fbi_idx unusable;

insert /*+ append */ into t_hcc_fbi select * from dba_objects;

commit;

-- But this will return DBMS_COMPRESSION.COMP_NOCOMPRESS = 1
select distinct dbms_compression.get_compression_type(USER, 'T_HCC_FBI', rowid) from t_hcc_fbi;

-- So the mere existence of a FBI, no matter if usable or unusable will prevent HCC with direct-path inserts / parallel DML
-- It is however fully supported to have them on HCC data, but you need to drop / re-create them in order to load the data
-- via direct-path inserts / parallel DML with HCC enabled

spool off


And this is the output I get from 11.2.0.1:


SQL>
SQL> drop table t_hcc_fbi purge;

Table dropped.

Elapsed: 00:00:00.13
SQL>
SQL> -- Create table with HCC enabled
SQL> create table t_hcc_fbi compress for query low as select * from dba_objects where 1 = 2;

Table created.

Elapsed: 00:00:00.09
SQL>
SQL> truncate table t_hcc_fbi;

Table truncated.

Elapsed: 00:00:00.01
SQL>
SQL> -- Direct path inserts support HCC by default
SQL> insert /*+ append */ into t_hcc_fbi select * from dba_objects;

72545 rows created.

Elapsed: 00:00:01.24
SQL>
SQL> commit;

Commit complete.

Elapsed: 00:00:00.00
SQL>
SQL> -- This is supposed to return DBMS_COMPRESSION.COMP_FOR_QUERY_LOW = 8
SQL> select distinct dbms_compression.get_compression_type(USER, 'T_HCC_FBI', rowid) from t_hcc_fbi;

DBMS_COMPRESSION.GET_COMPRESSION_TYPE(USER,'T_HCC_FBI',ROWID)
-------------------------------------------------------------
8

Elapsed: 00:00:09.51
SQL>
SQL> -- So it is supported to have an FBI on HCC data
SQL> create index t_hcc_fbi_idx on t_hcc_fbi (lower(object_type));

Index created.

Elapsed: 00:00:00.23
SQL>
SQL> truncate table t_hcc_fbi;

Table truncated.

Elapsed: 00:00:00.03
SQL>
SQL> -- Let's make the FBI unusable, so it doesn't have to be maintained by the load
SQL> alter index t_hcc_fbi_idx unusable;

Index altered.

Elapsed: 00:00:00.02
SQL>
SQL> insert /*+ append */ into t_hcc_fbi select * from dba_objects;

72546 rows created.

Elapsed: 00:00:01.01
SQL>
SQL> commit;

Commit complete.

Elapsed: 00:00:00.01
SQL>
SQL> -- But this will return DBMS_COMPRESSION.COMP_NOCOMPRESS = 1
SQL> select distinct dbms_compression.get_compression_type(USER, 'T_HCC_FBI', rowid) from t_hcc_fbi;

DBMS_COMPRESSION.GET_COMPRESSION_TYPE(USER,'T_HCC_FBI',ROWID)
-------------------------------------------------------------
1

Elapsed: 00:00:08.93
SQL>


I've been told that this restriction is going to be lifted in some future version, but I haven't had the chance yet to test this with the 11.2.0.2 patch set, in particular because the patch set is not yet released as an upgrade for existing Exadata installations - it is only available for new Exadata deployments.

2. Deferred Segment Creation and Parallel DML

11.2 introduced deferred segment creation, which interestingly is enabled by default in a fresh installation. I haven't done a migration from a previous version to 11.2 yet, so I don't know whether it also ends up enabled in migrated environments, but I assume so.

Note that the 11.2.0.2 patch set extends deferred segment creation to partitioned objects, which wasn't supported in the initial 11.2.0.1 release.

Deferred segment creation has some subtle side effects. For example, sequences seem not to start with the defined START WITH value: the initial recursive transaction fails, the segment gets created, and the next sequence value is then used for the actual DML insert operation.
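
Here is a minimal sketch of that sequence side effect - the object names are made up and the exact values observed depend on version and configuration:


drop table t_defseg purge;
drop sequence seq_defseg;

create sequence seq_defseg start with 1000;

-- No segment gets allocated yet
create table t_defseg (id number) segment creation deferred;

-- The first insert triggers the segment creation; on affected versions
-- the sequence value consumed by the failing recursive transaction is lost
insert into t_defseg (id) values (seq_defseg.nextval);

commit;

-- May show 1001 instead of the expected 1000
select id from t_defseg;
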

Another side effect that is not properly documented is that parallel DML is not supported on "segment-less" objects.

So if you have a freshly created, still segment-less table that is supposed to be populated via parallel DML inserts, the load will silently fall back to a serial direct-path insert the first time it is executed.

Once the segment has been created, parallel DML will be used from then on - which adds to the confusion, since the issue is only reproducible under these particular circumstances.

So if you have migrated an application to 11.2 and wonder why load operations sometimes take significantly longer than before, but mostly do not, then this might be a viable explanation.

This behaviour is not part of the deferred segment creation restrictions described in the official documentation, but it is mentioned towards the end of the MOS note "11.2 Database New Feature Deferred Segment Creation [Video] [ID 887962.1]".

According to that MOS note this still seems to apply to 11.2.0.2, although I haven't tested it myself yet. So it looks like this significant restriction has not been lifted with the current patch set.

Update July 2011: See Radoslav's comment below - the problem seems to be fixed in 11.2.0.2.

Here is a simple test case demonstrating the issue:


set echo on timing on

set linesize 160 trimspool on tab off pagesize 0

spool deferred_segment_creation_parallel_dml_testcase.log

drop table t_source purge;

drop table t_deferred_parallel_dml purge;

-- Create a simple table as source for the insert
create table t_source parallel nologging
as
select * from dba_objects;

exec dbms_stats.gather_table_stats(null, 't_source')

-- No AUTO DOP for this session to prevent any side-effects of this feature
alter session set parallel_degree_policy = manual;

-- Note that even a CTAS operation can be segment-less
-- if the query doesn't return any rows
create table t_deferred_parallel_dml
segment creation deferred
as
select * from t_source where 1 = 2;

-- No segment although CTAS has been used
select count(*) from user_segments where segment_name = 'T_DEFERRED_PARALLEL_DML';

alter session enable parallel dml;

explain plan for
insert /*+ append parallel(t) */ into t_deferred_parallel_dml t select * from t_source;

-- This plan does not show parallel DML
select * from table(dbms_xplan.display);

drop table t_deferred_parallel_dml purge;

create table t_deferred_parallel_dml
segment creation immediate
as
select * from t_source where 1 = 2;

-- One segment
select count(*) from user_segments where segment_name = 'T_DEFERRED_PARALLEL_DML';

explain plan for
insert /*+ append parallel(t) */ into t_deferred_parallel_dml t select * from t_source;

-- This plan does show parallel DML
select * from table(dbms_xplan.display);

spool off


And this is the output I get from 11.2.0.1:


SQL>
SQL> drop table t_source purge;

Table dropped.

Elapsed: 00:00:00.04
SQL>
SQL> drop table t_deferred_parallel_dml purge;

Table dropped.

Elapsed: 00:00:00.01
SQL>
SQL> -- Create a simple table as source for the insert
SQL> create table t_source parallel nologging
2 as
3 select * from dba_objects;

Table created.

Elapsed: 00:00:03.32
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 't_source')

PL/SQL procedure successfully completed.

Elapsed: 00:00:00.64
SQL>
SQL> -- No AUTO DOP for this session to prevent any side-effects of this feature
SQL> alter session set parallel_degree_policy = manual;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> -- Note that even a CTAS operation can be segment-less
SQL> -- if the query doesn't return any rows
SQL> create table t_deferred_parallel_dml
2 segment creation deferred
3 as
4 select * from t_source where 1 = 2;

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> -- No segment although CTAS has been used
SQL> select count(*) from user_segments where segment_name = 'T_DEFERRED_PARALLEL_DML';
0

Elapsed: 00:00:00.00
SQL>
SQL> alter session enable parallel dml;

Session altered.

Elapsed: 00:00:00.00
SQL>
SQL> explain plan for
2 insert /*+ append parallel(t) */ into t_deferred_parallel_dml t select * from t_source;

Explained.

Elapsed: 00:00:00.01
SQL>
SQL> -- This plan does not show parallel DML
SQL> select * from table(dbms_xplan.display);
Plan hash value: 1453990251

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 73251 | 6938K| 117 (1)| 00:00:02 | | | |
| 1 | LOAD AS SELECT | T_DEFERRED_PARALLEL_DML | | | | | | | |
| 2 | PX COORDINATOR | | | | | | | | |
| 3 | PX SEND QC (RANDOM)| :TQ10000 | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | P->S | QC (RAND) |
| 4 | PX BLOCK ITERATOR | | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | PCWC | |
| 5 | TABLE ACCESS FULL| T_SOURCE | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

12 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> drop table t_deferred_parallel_dml purge;

Table dropped.

Elapsed: 00:00:00.03
SQL>
SQL> create table t_deferred_parallel_dml
2 segment creation immediate
3 as
4 select * from t_source where 1 = 2;

Table created.

Elapsed: 00:00:00.01
SQL>
SQL> -- One segment
SQL> select count(*) from user_segments where segment_name = 'T_DEFERRED_PARALLEL_DML';
1

Elapsed: 00:00:00.01
SQL>
SQL> explain plan for
2 insert /*+ append parallel(t) */ into t_deferred_parallel_dml t select * from t_source;

Explained.

Elapsed: 00:00:00.01
SQL>
SQL> -- This plan does show parallel DML
SQL> select * from table(dbms_xplan.display);
Plan hash value: 4111574378

------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
------------------------------------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 73251 | 6938K| 117 (1)| 00:00:02 | | | |
| 1 | PX COORDINATOR | | | | | | | | |
| 2 | PX SEND QC (RANDOM) | :TQ10000 | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | P->S | QC (RAND) |
| 3 | LOAD AS SELECT | T_DEFERRED_PARALLEL_DML | | | | | Q1,00 | PCWP | |
| 4 | PX BLOCK ITERATOR | | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | PCWC | |
| 5 | TABLE ACCESS FULL| T_SOURCE | 73251 | 6938K| 117 (1)| 00:00:02 | Q1,00 | PCWP | |
------------------------------------------------------------------------------------------------------------------------------

12 rows selected.

Elapsed: 00:00:00.06
SQL>
SQL> spool off


In general I would advise disabling deferred segment creation and enabling it only in environments where a clear benefit can be seen and the side effects are known, understood and tested.

This can be accomplished by simply setting "deferred_segment_creation = false" at instance level.
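
For example (a one-line sketch, assuming an spfile is in use - otherwise adjust the SCOPE clause accordingly):


alter system set deferred_segment_creation = false scope = both;
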

Sunday, September 26, 2010

Cleanup of partitioned objects residing in offlined tablespaces

Consider the following scenario: a large database containing historical data - typically a data warehouse with partitioned tables whose partitions reside in different tablespaces depending on their age - is supposed to be duplicated, but the target environment (typically a lower environment, for example a UAT copy of Production) doesn't have sufficient space to hold the complete database, in particular not all the historic partitions. You then somehow need to deal with that partial clone.

Of course, in an ideal world the lower environment would have sufficient space to hold the complete database, but we don't live in an ideal world. Sometimes, depending on the storage technology, you might be able to do some kind of temporary "split mirror" operation that allows you to start up a complete clone of the database and drop all the affected objects / tablespaces before putting the cleaned-up, partial database on the final target system, but this is not always applicable, for instance when using ASM-based storage.

The issue in that particular case is that you can, for instance, do a partial RMAN clone that omits the historical tablespaces, but the resulting database then obviously has some tablespaces that are unavailable and can't be onlined, since their datafiles are simply missing.

For objects residing completely in the offlined tablespaces the solution is simple and straightforward: the objects can be dropped, or the affected tablespaces can simply be dropped with the "including contents" clause.
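
For example (the tablespace name is made up):


-- Hypothetical tablespace name - drops the tablespace together with all objects it fully contains
drop tablespace hist_data_2005 including contents;
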

Things get more complicated, however, in the case of partitioned objects where only some of the partitions reside in the offlined tablespaces.

In that case you can't simply drop the offlined tablespaces - you end up with an error like "ORA-14404: partitioned table contains partitions in a different tablespace", which tells you that there are objects that have only some of their partitions in the offlined tablespace, and Oracle obviously won't do the cleanup job for you in that case.

Oracle allows you to drop partitions of a table that reside in offlined tablespaces, but there is an important limitation: you can only do so if no index has been defined on the table. Otherwise you get the error message "ORA-14117/ORA-14630: [sub]partition resides in offlined tablespace" when attempting to drop the partition. It's interesting to note that the mere existence of indexes is sufficient - it doesn't matter whether the index resides in an offlined or onlined tablespace, or whether it is unusable or not. As soon as at least one index is defined on the table, the drop partition operation errors out. It looks like a hard-coded "if index exists then error" code path. I'm not sure why exactly this restriction is in place, but it has serious consequences if you need to do this sort of cleanup for a large database with hundreds of thousands of partitions.
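
A minimal sketch of this restriction - the object and tablespace names are made up and not taken from the original system:


-- Partitioned table with one partition in the (later offlined) tablespace TS_HIST
create table t_part (id number, pkey number)
partition by range (pkey) (
  partition p_hist values less than (2) tablespace ts_hist,
  partition p_cur  values less than (3) tablespace users
);

create index t_part_idx on t_part (id) local;

alter tablespace ts_hist offline;

-- Fails with ORA-14117 because an index exists - usable or not, regardless of its tablespace
alter table t_part drop partition p_hist;

-- Without any index on the table the same operation succeeds
drop index t_part_idx;

alter table t_part drop partition p_hist;
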

By the way, it is interesting to know that Oracle allows segments to be dropped from offlined tablespaces that are locally managed - think about the consequence: the local bitmaps that represent the allocated and free space in the tablespace cannot be updated at the time of the drop operation. This is one of the few disadvantages compared to the old-fashioned dictionary-managed tablespaces, where an operation like that could potentially have been limited to a pure dictionary operation.

The approach Oracle chooses is straightforward: the dropped segments are converted into "temporary" objects (a kind of special "recycle bin", if you will). If the offlined tablespace gets onlined again, the local bitmaps are updated according to these temporary objects and the temporary objects are finally removed from the dictionary. If the offlined tablespace gets dropped instead, the corresponding temporary objects can consequently be removed as well.

Tanel Poder recently also blogged about this in the context of read-only tablespaces.

So you could now drop all indexes from the affected tables and recreate them afterwards, but this is not a good idea for several reasons:
* If the remaining partitions still represent multiple terabytes it will take quite a long time to rebuild all indexes
* Things might go wrong and you end up with missing indexes

Since these are partitioned objects, another Oracle best practice comes to the rescue: exchange the affected partitions with an exchange table EXCLUDING INDEXES - this allows the exchanged partition residing in an offlined tablespace to be dropped without any error message. Note that this means that after you've exchanged the partition you need to drop the unpartitioned object (which doesn't have indexes) in order to get rid of the affected segment.
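
A minimal sketch of that basic idea for a single partition (made-up names, and ignoring for the moment the unused-column complication discussed below):


-- Index-less copy of the affected table
create table t_exchange as
select * from t_affected where 1 = 2;

-- Move the segment residing in the offlined tablespace into the index-less copy
alter table t_affected
  exchange partition p_2005_01 with table t_exchange
  excluding indexes without validation;

-- The copy has no indexes, so the segment in the offlined tablespace can be dropped with it
drop table t_exchange purge;

-- The partition now holds the segment originally created for T_EXCHANGE (in an online
-- tablespace) and can therefore be dropped despite the indexes on T_AFFECTED
alter table t_affected drop partition p_2005_01;
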

If you only have a few affected objects, or only a single partition to clean up per table, then you're basically done. But if you need to clean up multiple partitions and the number of partitions is high, this approach is not feasible, because dropping / creating a table for each partition that needs to be cleaned up is quite an overhead - in particular when talking about hundreds of thousands of partitions.

Creating a table to exchange partitions with adds another complexity to the problem: many large databases make use of Oracle's option to set columns unused instead of actually dropping them. The former is a simple metadata operation in the dictionary (it renames the column and marks it as hidden) and is therefore almost instant regardless of the data volume, whereas the latter requires Oracle to process the whole segment, which can take quite some time for huge segments (and is not a very efficient operation by the way, but that is something for a different post).

Creating a table that can be exchanged with such an evolved table containing unused columns cannot be accomplished with a handy CTAS operation - the unused columns will be missing from the copy and the exchange operation will therefore fail with "column mismatch" errors.

So you need to deal with these unused columns somehow whenever you create such an exchange table dynamically - and you have to do that if you need to drop more than a single partition per table.
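
A small sketch of the problem (made-up names - on 11.2 the exchange typically fails with an error such as ORA-14097):


create table t_src (id number, col_a number, col_b varchar2(10))
partition by range (id) (partition p1 values less than (100));

-- Set a column unused instead of dropping it - a pure metadata operation
alter table t_src set unused column col_a;

-- A plain CTAS copy doesn't have the (hidden) unused column...
create table t_exchange as select * from t_src where 1 = 2;

-- ...so this fails with a column mismatch error (e.g. ORA-14097)
alter table t_src exchange partition p1 with table t_exchange excluding indexes without validation;
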

The only other viable option I'm aware of is to permanently maintain this exchange table as a partitioned table itself - which means creating two exact copies (in terms of dictionary metadata) of every potentially affected table and making sure the same DDL is applied to them in the same order, which ensures that the exchange operation with those tables will succeed.

You need two copies - one unpartitioned and one partitioned - if you want to avoid the drop and dynamic create table operation, since Oracle allows partitions to be exchanged only with unpartitioned tables. So it takes two exchange operations per affected partition: the first with the unpartitioned table, and the second with the partitioned table. Both exchange operations have to be done EXCLUDING INDEXES, and at least the final partitioned table has to be created without indexes - note that Oracle allows the exchange operation even if one table is missing a potentially created primary key constraint, provided you use the "EXCLUDING INDEXES" clause.

Then you can drop the partition that has been exchanged into the second table, since that table doesn't have any indexes defined on it.

This allows you to accomplish two objectives:

1. Avoid a costly (and, in case of unused columns, potentially complex) drop / create table operation per partition
2. Manage the drop / create cycle by simply dropping and adding partitions in the second table, which is far less overhead than a drop / create table operation - important when performance matters

However, maintaining two copies of every table that potentially needs to be cleaned up sounds like a huge overhead, and even in Oracle 11.2 with deferred segment creation at least the partitioned table copy will allocate space, since the deferred segment creation option is not (yet) applicable to partitioned objects.

So here are a few ideas on how to accomplish such a cleanup operation:

1. For each table to process, create the required table copies dynamically once. If you don't have to deal with unused columns then a simple CTAS operation can be used to create those tables. If you need to handle unused columns, a more complex approach is required: you will have to use custom queries against the data dictionary to extract the required information, since the official DBMS_METADATA interface as far as I know doesn't expose information about unused columns.

The following query could be used as a starting point to extract the column definition of a table:


-- Get the column list including hidden columns, but ignoring virtual columns added by function-based indexes
-- TODO: What about virtual columns (11g or later)?
select
        decode(t.hidden_column,
          'YES', 'SYS$' || internal_column_id || '$UNUSED',
          t.column_name
        ) ||
        ' ' ||
        lower(data_type) ||
        decode(substr(data_type, 1, 9),
          'TIMESTAMP', null,
          'INTERVAL ', null,
          decode(t.data_type,
            'NUMBER', decode(t.data_precision || t.data_scale,
              null, null,
              '(' ||
              decode(t.data_precision,
                null, '*',
                t.data_precision
              ) ||
              decode(t.data_scale,
                null, null,
                ',' || t.data_scale
              ) ||
              ')'
            ),
            'FLOAT', '(' || t.data_precision || ')',
            'LONG', null,
            'LONG RAW', null,
            'BLOB', null,
            'CLOB', null,
            'NCLOB', null,
            'BFILE', null,
            'CFILE', null,
            'BINARY_FLOAT', null,
            'BINARY_DOUBLE', null,
            'MLSLABEL', null,
            'ROWID', null,
            'UROWID', null,
            'DATE', null,
            '(' ||
            nvl(
              nvl(
                decode(t.char_length,
                  0, to_number(null),
                  t.char_length
                ),
                t.char_col_decl_length
              ),
              t.data_length
            ) ||
            decode(t.character_set_name,
              'CHAR_CS', decode(t.char_used,
                'C', ' char',
                'B', ' byte'
              )
            ) ||
            ')'
          )
        ) ||
        ' ' ||
        decode(nullable,
          'N', 'not null',
          null
        ) ||
        ',' as column_def
from
        all_tab_cols t
where
        t.owner = <OWNER>
and     t.table_name = <TABLE_NAME>
and     t.segment_column_id is not null
order by
        t.internal_column_id;


This will generate SYS$<n>$UNUSED column definitions for all columns set to unused in the source table; those placeholder columns can then subsequently be set to unused in the newly created table to give it the same layout as the source table.
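
A small sketch of that final step (made-up names - the placeholder column name follows the SYS$<internal_column_id>$UNUSED pattern generated by the query above):


create table t_exchange (
  id           number        not null,
  sys$2$unused number,
  large_vc     varchar2(200)
);

-- Mark the placeholder column as unused so the layout matches the source table
alter table t_exchange set unused column sys$2$unused;
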

Note that the query hasn't been tested with (11g) user-defined virtual columns yet, but it should deal with most of the available data types and also handle columns using character semantics correctly (important for NCHAR / NVARCHAR2 based columns and for databases running with multi-byte character sets, in particular AL32UTF8). Hidden virtual columns added via function-based indexes are handled as well, by excluding them.

Of course the extraction process potentially needs to handle much more than shown here, for instance table properties like index-organized tables, partition keys etc.

2. Process each affected table partition using the following approach:

* Add a new "clean" partition to the partitioned copy table in a tablespace that is online. The partition key is a "dummy" key that can be the same for each iteration of this process. This is the segment that is going to end up in the affected partitioned table and that will finally be dropped, since it then resides in an available tablespace. (Remember, we want to get rid of those partitions residing in unavailable tablespaces, since we don't have sufficient space in the clone of the database to make them available.)

* Exchange the affected partition with the unpartitioned copy table EXCLUDING INDEXES (the unpartitioned table copy has also been created initially in an available tablespace)

* Exchange the unpartitioned copy table (that now holds the defect partition) with the newly created "clean" partition of the partitioned copy table EXCLUDING INDEXES

* Now the defect partition resides in the partitioned copy table and can simply be dropped since we don't have any indexes defined on this table

* Finally we can drop the partition from the affected table - its segment is now the "clean" one added to the partitioned copy table in the first step of this iteration, so it resides in an available tablespace

The same process can be applied to a subpartitioned table. There are of course some more details to consider - for example, you can't drop the last remaining subpartition or partition of a partition / table - but since we assume that the affected table in general still has some partitions left that reside in current and available tablespaces, it should be possible to handle these scenarios.

Here is a code snippet / template that applies this process:


/**
 * Drop a (sub-)partition residing in an offline tablespace
 * @param p_object                   The original object to clean up
 * @param p_temp_exc_table_name      The name of the working table
 * @param p_clean_tablespace_name    The tablespace where to create the cleanup partition
 * @param p_partition_name           The name of the (sub-)partition to drop
 * @param p_segment_type             The type of the segment (TABLE[ ][SUB][PARTITION])
 * @param p_last_partition_indicator For subpartitions indicate that this is
 *                                   the last subpartition - this will drop the whole partition
 **/
procedure drop_partition(
  p_object                   in     r_oracle_object
, p_temp_exc_table_name      in out oracle_object
, p_clean_tablespace_name    in     oracle_object
, p_partition_name           in     oracle_object
, p_segment_type             in     oracle_object
, p_last_partition_indicator in     varchar2
)
as
  s_sql                     large_sql;
  s_partition_type          varchar2(255);
  s_exchange_table          oracle_object;
  e_partition_in_offline_ts exception;
  pragma exception_init(e_partition_in_offline_ts, -14117);
  -- We do this in an autonomous transaction to keep it separate
  -- from any outside transactions like the "queue" transaction
  -- used by the parallel clean up threads
  pragma autonomous_transaction;
begin
  -- Determine if this is a partition or subpartition
  s_partition_type := substr(p_segment_type, 7, 255);

  -- Step 1: Add a clean partition to the temporary cleanup table
  s_sql := 'alter table ' || p_temp_exc_table_name || ' add partition p_cleanup values (42) tablespace ' || p_clean_tablespace_name;

  execute(p_object.owner, s_sql);

  -- Get the name of the unpartitioned table copy
  s_exchange_table := get_exchange_table(p_object);

  -- Step 2: Exchange the defect (sub-)partition with the unpartitioned copy table EXCLUDING indexes
  s_sql := '
    alter table ' || p_object.segment_name || '
    exchange ' || s_partition_type || ' ' || p_partition_name || '
    with table ' || s_exchange_table || '
    excluding indexes without validation';

  execute(p_object.owner, s_sql);

  -- Step 3: Exchange the unpartitioned copy table with the partitioned copy table partition again EXCLUDING indexes
  s_sql := '
    alter table ' || p_temp_exc_table_name || '
    exchange partition p_cleanup
    with table ' || s_exchange_table || '
    excluding indexes without validation';

  execute(p_object.owner, s_sql);

  -- Step 4: Drop the defect partition now residing in the partitioned "waste-bin" table partition
  -- Since we don't have any indexes this is possible now
  s_sql := 'alter table ' || p_temp_exc_table_name || ' drop partition p_cleanup';

  begin
    execute(p_object.owner, s_sql);
  exception
    -- Any tables with LOBs will cause an exception since they will implicitly have a LOB INDEX added
    -- Therefore the DROP PARTITION will fail with ORA-14117
    -- The simplest solution is to drop and recreate the entire working table - note the serious side effects
    -- of the underlying ORA-01110 error in 11.1.0.7. See below comments about the potential impact
    -- of the health check monitoring
    -- A smoother way of handling this to avoid these potential side-effects is to check if the
    -- table has any indexes defined on it and then do not attempt to drop the partition but
    -- immediately drop the entire table
    when e_partition_in_offline_ts then
      drop_temporary_exc_table(p_object, p_temp_exc_table_name);
      p_temp_exc_table_name := create_temporary_exc_table(p_object, p_clean_tablespace_name, p_segment_type);
  end;

  -- Step 5: Drop the partition / subpartition which completes the cleanup
  -- TODO: Are really all subpartitions of a partition affected?
  -- If not, this logic needs to be revised
  if (s_partition_type = 'SUBPARTITION' and p_last_partition_indicator = 'Y') then
    s_sql := 'alter table ' || p_object.segment_name || ' drop partition ' || get_parent_partition(p_object, p_partition_name);

    execute(p_object.owner, s_sql);
  elsif s_partition_type = 'PARTITION' then
    s_sql := 'alter table ' || p_object.segment_name || ' drop partition ' || p_partition_name;

    execute(p_object.owner, s_sql);
  end if;
end drop_partition;


Note that tables with LOBs need special treatment, since the logic used for tables without LOBs fails when the "defect" partition gets dropped from the temporary cleanup table - with an error saying that the partition resides in an offline tablespace.

This is the same error that is typically raised when attempting to drop a partition from a table that has indexes defined on it. Very likely it is caused by the fact that a LOB always has a corresponding LOB index (which is always exchanged when doing an EXCHANGE PARTITION, even if EXCLUDING INDEXES is specified).

Since this index cannot be dropped from the temporary cleanup table, we actually do not gain anything by performing the exchange exercise. The simplest solution to the problem is to drop and recreate the cleanup table, which is possible without the ORA-14117 error being raised.

Note that attempting to drop the partition and running into the ORA-14117 error has a serious side effect in 11.1.0.7 and 11.2.0.1: if the "Health Monitoring" detects a "flooding" ORA-01110 error (something is wrong with a datafile), any operation on these datafiles from then on seems to have tracing turned on. This leads to the odd situation that onlining such a tablespace again (which we can't do in our actual case since the datafiles are not there, but which I have done for testing purposes) takes minutes and generates trace files between 20 and 500 MB (!) in size.

I haven't investigated this much, but in my tests it couldn't be avoided by setting the undocumented parameter "_disable_health_checks=TRUE".

The MMON_SLAVE process will dump something similar to this:


========= Dump for error ORA 1110 (no incident) ========
----- DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----


Also for the sessions that cause the error something like this will be written to trace files:


DDE rules only execution for: ORA 1110
----- START Event Driven Actions Dump ----
---- END Event Driven Actions Dump ----
----- START DDE Actions Dump -----
Executing SYNC actions
----- START DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (Async) -----
DDE Action 'DB_STRUCTURE_INTEGRITY_CHECK' was flood controlled
----- END DDE Action: 'DB_STRUCTURE_INTEGRITY_CHECK' (FLOOD CONTROLLED, 17 csec) -----
Executing ASYNC actions
----- END DDE Actions Dump (total 0 csec) -----


A smoother way of handling this, avoiding these potential side effects, is to check whether the table has any indexes defined on it and, if so, not to attempt the drop partition at all but to immediately drop the entire working table.
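
Such a check could look like this (using the same placeholder convention as the dictionary query above - any count greater than zero, for example caused by an implicit LOB index, means the DROP PARTITION attempt should be skipped):


select count(*)
from   all_indexes
where  owner = <OWNER>
and    table_name = <WORKING_TABLE_NAME>;
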

Since each iteration of this process takes some time (in a database with hundreds of thousands of partitions it can take a few seconds per partition due to the underlying recursive data dictionary operations) and is, according to some traces, mostly CPU bound, you can speed up the processing by using multiple processes if you have multiple CPUs available.

Using an approach based on Advanced Queueing with multiple consumers (for instance as many consumers as CPUs) we were able to clean up more than 200,000 partitions (table plus index (sub)partitions) in less than one hour, which was within the available timeframe for that operation.
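
This is not the actual implementation used, but a minimal sketch of such a queue-based approach could look like this - the type, queue and table names are made up, and each worker session (for example started as a DBMS_SCHEDULER job) would run the dequeue loop and call the cleanup routine shown above:


-- A dedicated payload type for the cleanup messages (made-up names)
create or replace type t_cleanup_msg as object (
  owner          varchar2(30),
  table_name     varchar2(30),
  partition_name varchar2(30)
);
/

-- Single-consumer queue: each message is picked up by exactly one worker
begin
  dbms_aqadm.create_queue_table(
    queue_table        => 'cleanup_qt'
  , queue_payload_type => 'T_CLEANUP_MSG'
  );
  dbms_aqadm.create_queue(queue_name => 'cleanup_q', queue_table => 'cleanup_qt');
  dbms_aqadm.start_queue(queue_name => 'cleanup_q');
end;
/

-- Dequeue loop executed by each of the N worker sessions
declare
  deq_opt       dbms_aq.dequeue_options_t;
  msg_prop      dbms_aq.message_properties_t;
  msg           t_cleanup_msg;
  msgid         raw(16);
  e_queue_empty exception;
  pragma exception_init(e_queue_empty, -25228);
begin
  deq_opt.wait := 10; -- give up after 10 seconds without work
  loop
    dbms_aq.dequeue(
      queue_name         => 'cleanup_q'
    , dequeue_options    => deq_opt
    , message_properties => msg_prop
    , payload            => msg
    , msgid              => msgid
    );
    -- here the actual cleanup would be called, e.g. the DROP_PARTITION procedure shown above
    commit;
  end loop;
exception
  when e_queue_empty then
    null; -- queue drained, this worker is done
end;
/
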

Friday, August 6, 2010

Time Model Bug

Tasks that are performed via jobs in the database are double accounted in the system time model that was introduced with Oracle 10g.

So if you execute significant workload via DBMS_JOB or DBMS_SCHEDULER, any system time model statistic recorded for that workload - like DB Time, DB CPU etc. - gets double accounted.
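
The instance-wide time model figures can be checked with a simple query like this (values are reported in microseconds):


select stat_name, round(value / 1000000, 2) as seconds
from   v$sys_time_model
where  stat_name in ('DB time', 'DB CPU', 'background elapsed time', 'background cpu time')
order by stat_name;
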

This bug is not particularly severe, since your top workloads will still be the same top workloads - all other statistics (like Elapsed Time, CPU, Buffer Gets etc.) are not affected by it.

I mention it here mainly because the bug (see below for details) can't, as of the time of writing, be found in the bug database on My Oracle Support, and because I recently came across several AWR reports where the majority of the workload had been generated via job processes and the time model statistics were therefore effectively doubled.

It might serve as a viable explanation if you sometimes wonder why an AWR or Statspack report only captures 50% or less of the recorded total DB Time or DB CPU and where this unaccounted time has gone. If a significant part of the workload during the reporting period was performed by sessions controlled via DBMS_JOB or DBMS_SCHEDULER, then most of the "unaccounted" time is probably not unaccounted at all - the time model statistics are simply wrong.

So if you have such otherwise unexplainable unaccounted DB Time / DB CPU, you might want to check whether significant workload during the reporting period was executed via the job system. Note that I don't claim this is the only possible explanation of such unaccounted time - there might be other reasons like uninstrumented waits, other bugs etc.

Of course all the percentages that are shown in the AWR / ADDM / Statspack reports that refer to "Percentage of DB Time" or "Percentage of DB CPU" will be too small in such cases.

If the majority of workload during the reporting period has been generated by jobs then you can safely assume that the time model statistics have to be divided by 2 (and the percentages have to be doubled). If you have a mixture of jobs and regular foreground sessions then it will be harder to derive the correct time model statistics.

Note that the "Active Session History" (ASH) is not affected by the bug - in my tests the ASH reports were always consistent regarding the DB Time (respectively the number of samples) and the CPU time information.

The following simple test case can be used to reproduce the issue at will. Ideally you should have exclusive access to the test system since any other concurrent activity will affect the test results.

You might want to adjust the 1000000000 iterations of the simple PL/SQL loop to your particular CPU - on my test system this takes approx. 46 seconds to complete.

The first version assumes that a PERFSTAT user with STATSPACK installed is present in the database, since STATSPACK doesn't require an additional license. An AWR variant follows below.


alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

variable snap1 number

exec :snap1 := statspack.snap

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

variable snap2 number

exec :snap2 := statspack.snap

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

variable snap3 number

exec :snap3 := statspack.snap

rem set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

prompt Enter PERFSTAT password

connect perfstat

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap1 as b_id, :snap2 as e_id from dual;
define report_name=sp_foreground.txt

@?/rdbms/admin/sprepins

column dbid new_value dbid noprint

select dbid from v$database;

column instance_number new_value inst_num noprint

select instance_number from v$instance;

column b_id new_value begin_snap noprint
column e_id new_value end_snap noprint

select :snap2 as b_id, :snap3 as e_id from dual;
define report_name=sp_background.txt

@?/rdbms/admin/sprepins

undefine iter

@.settings

set termout on


Here is the same test case but with AWR reports (requires additional diagnostic license)


alter session set nls_language = american nls_territory = america;

store set .settings replace

set echo on timing on define on

define iter="1000000000"

column snap1 new_value awr_snap1 noprint

select dbms_workload_repository.create_snapshot as snap1 from dual;

declare
n_cnt binary_integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
end;
/

column snap2 new_value awr_snap2 noprint

select dbms_workload_repository.create_snapshot as snap2 from dual;

/* Uncomment this if you want to test via DBMS_JOB
variable job_id number

begin
dbms_job.submit(:job_id, '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
');
end;
/

commit;
*/

/* Uncomment this if you want to test via DBMS_SCHEDULER */
begin
dbms_scheduler.create_job(
job_name => dbms_scheduler.generate_job_name
, job_type => 'PLSQL_BLOCK'
, job_action => '
declare
n_cnt binary_integer;
n_status integer;
begin
n_cnt := 0;
for i in 1..&iter loop
n_cnt := n_cnt + 1;
end loop;
n_status := dbms_pipe.send_message(''bg_job_complete'');
end;
' , enabled => true);
end;
/

declare
pipe_status integer;
begin
pipe_status := dbms_pipe.receive_message('bg_job_complete');
end;
/

declare
pipe_id integer;
begin
pipe_id := dbms_pipe.remove_pipe('bg_job_complete');
end;
/

column snap3 new_value awr_snap3 noprint

select dbms_workload_repository.create_snapshot as snap3 from dual;

set heading off pagesize 0 feedback off linesize 500 trimspool on termout off echo off verify off

spool awr_foreground.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
)
);

spool off

spool awr_background.html

select
output
from
table(
sys.dbms_workload_repository.awr_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

spool awr_diff.html

select
output
from
table(
sys.dbms_workload_repository.awr_diff_report_html(
(select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap1
, &awr_snap2
, (select dbid from v$database)
, (select instance_number from v$instance)
, &awr_snap2
, &awr_snap3
)
);

spool off

undefine awr_snap1
undefine awr_snap2
undefine awr_snap3

undefine iter

column snap1 clear
column snap2 clear
column snap3 clear

@.settings

set termout on


And here is a sample snippet from a generated Statspack report on a single CPU system with nothing else running on the system:

Normal foreground execution:


STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 13 05-Aug-10 08:34:17 25 1.2
End Snap: 14 05-Aug-10 08:35:05 25 1.2
Elapsed: 0.80 (mins) Av Act Sess: 1.1
DB time: 0.87 (mins) DB CPU: 0.80 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.1 2.4 0.09 3.99
DB CPU(s): 1.0 2.2 0.08 3.68


Execution via Job/Scheduler:


STATSPACK report for

Database DB Id Instance Inst Num Startup Time Release RAC
~~~~~~~~ ----------- ------------ -------- --------------- ----------- ---
orcl112 1 05-Aug-10 08:21 11.2.0.1.0 NO

Host Name Platform CPUs Cores Sockets Memory (G)
~~~~ ---------------- ---------------------- ----- ----- ------- ------------
XXXX Microsoft Windows IA ( 1 0 0 2.0

Snapshot Snap Id Snap Time Sessions Curs/Sess Comment
~~~~~~~~ ---------- ------------------ -------- --------- ------------------
Begin Snap: 14 05-Aug-10 08:35:05 25 1.2
End Snap: 15 05-Aug-10 08:35:53 24 1.3
Elapsed: 0.80 (mins) Av Act Sess: 1.9
DB time: 1.55 (mins) DB CPU: 1.54 (mins)

Cache Sizes Begin End
~~~~~~~~~~~ ---------- ----------
Buffer Cache: 104M Std Block Size: 8K
Shared Pool: 128M Log Buffer: 6,076K

Load Profile Per Second Per Transaction Per Exec Per Call
~~~~~~~~~~~~ ------------------ ----------------- ----------- -----------
DB time(s): 1.9 92.8 0.79 7.74
DB CPU(s): 1.9 92.1 0.78 7.68


As you might have guessed, my single-CPU test system did not get a second CPU added when performing the same task via DBMS_SCHEDULER / DBMS_JOB - yet in that case the time model reports (almost) 2 DB Time / DB CPU seconds and active sessions per second.

I have reproduced the bug on versions 10.2.0.4, 11.1.0.7 and 11.2.0.1 but very likely all versions supporting the time model are affected.

A (non-public) bug "9882245 - DOUBLE ACCOUNTING OF SYS MODEL TIMINGS FOR WORKLOAD RUN THROUGH JOBS" has been filed for it, but the fix is not available yet therefore as far as I know it is not yet part of any available patch set / PSU.

Note that there seems to be a different issue with the DB CPU time model component: if you have a system that reports more CPUs than sockets (for example a Power5, Power6 or Power7 based IBM server that reports 16 sockets / 32 CPUs), then the DB CPU component gets reduced by approximately 50%, i.e. it is divided by 2.

In combination with the above bug this means that you end up with a doubled DB Time component for tasks executed via jobs, while the DB CPU time model component ends up in the right ballpark, since the doubled DB CPU time gets divided by 2 again.

I don't know if the bug fix also covers this issue, so you might want to keep this in mind when checking any time model based information.

Saturday, July 24, 2010

Double Trouble

One of my clients runs a Data Warehouse with several important data loading and transformation procedures.

One of those key procedures had to be executed in a kind of loop, since many of the transformation steps depend on each other's results.

So although each transformation step was executed in a rather efficient way, using Oracle best practices like parallel DML / exchange partition, the iterative approach still meant that the underlying server wasn't used as efficiently as possible, since many of those transformation steps also processed smaller data sets that didn't use all of the available resources.

In order to address that, they recently introduced a more sophisticated version of that key procedure that uses a clever algorithm based on a dependency tree to determine which steps can be executed concurrently because they do not depend on each other.

As part of that optimization they worked out that on their particular hardware they could run as many as 25 transformation steps concurrently, and up to four of those 25 as parallel execution steps.

So in principle you end up with two queues: one dequeued by four consumers processing "big" slices via parallel execution, and a second one dequeued by approx. 20 consumers each processing "small" slices using serial execution (it is quite a "big iron" server).

This allowed them to increase the resource utilization of the available server hardware and resulted in a significant drop in overall runtime - they actually saved more than 50% of the original runtime, which is quite a good result given that they achieved it on exactly the same hardware!

So far, so very good. Then they got an even "bigger" server with newer CPUs and ran the same database with the same configuration - the only change was the CPUs (which also required a newer version of the operating system, but that is a different story). Since most of the processes really use a lot of CPU (due to the design of the system, running them concurrently makes good use of all the available CPU power, because I/O is not the only constraining factor), they expected a further drop in runtime. To their surprise, however, the key procedure they tested took far longer than on the old hardware.

The analysis revealed a really interesting case: it turned out that one of the "weak" points of the system - rather inefficient user-defined PL/SQL functions that perform recursive SQL - went sideways in performance and required almost a hundred times longer per execution than before.

The explanation was rather stunning: although no change to the database and in particular no code change to the user-defined functions had been introduced, the recursive queries issued by the user-defined functions suddenly started to be executed in parallel using parallel query - which is a rather bad idea when they get executed 20 million times within an hour or so.

An interesting side effect was that on the operating system side the kernel time could be seen rising to more than 50% of the available CPU time, bringing the whole system more or less to a grinding halt.

Now the big question was: how could this have happened? It had never happened before on any of the previous systems, yet the same code / database on the new system showed this weird behaviour.

The first surprising finding was that the recursive queries showed a parallel execution plan when explained - simply because the underlying tables and indexes were marked as parallel. This was an obvious bug in the code that seemed to have been there from the beginning but hadn't surfaced yet.

So the question became: why had this not happened all the time in the past?

Looking further, it turned out that the new CPUs resulted in a doubled CPU_COUNT being reported by the operating system to Oracle - and they used a DEFAULT parallel degree for their parallel executions.

This obviously resulted in a doubled degree of parallelism (DOP) for the parallel executions, but they still used the same instance configuration - in particular the same PARALLEL_MAX_SERVERS setting.

So what happened was that, using the same approach as described above (four consumers for the "big" queue), they simply ran out of parallel servers, resulting in severe downgrades for some of the intended parallel executions - some of them were even downgraded to serial.
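
Whether such downgrades happen can be checked quite easily, for example via the cumulative downgrade statistics and the current parallel server usage:


-- Parallel execution downgrades since instance startup
select name, value
from   v$sysstat
where  name like 'Parallel operations%downgraded%'
order by name;

-- Current / highwater usage of parallel servers
select statistic, value
from   v$px_process_sysstat
where  statistic like 'Servers%';
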

And these two issues together allowed the bug above to surface: once a parallel execution was downgraded to serial, the recursive query became eligible for parallel execution. Since the user-defined functions were marked with "PARALLEL_ENABLE", they usually were pushed into the parallel slaves and got executed there. Oracle recognizes the execution in the slaves and implicitly prevents a recursive parallel execution - you could call this a limitation or restriction, but in this case it prevented the bug from showing up.

So, in summary the following happened:

- In the past the old implementation ran at most one parallel execution at a time, so due to the restrictions enforced via the Resource Manager for other sessions no serial downgrade was possible; the recursive query was pushed into the slaves and executed serially there - no problems so far

- A new implementation was introduced that used a clever queuing approach. As part of that queuing, multiple parallel executions happened simultaneously, but due to the carefully chosen settings these were not downgraded, at least not to serial, and the recursive query ran as intended in the parallel slaves - still no problems

- New hardware was introduced, doubling the CPU_COUNT and therefore doubling the default parallel degree, and bingo: with the same carefully chosen settings, parallel executions were now downgraded to serial due to the doubled DOP, and the recursive query eventually started to be executed in parallel - performance went south

After fixing the issue in the code by explicitly forcing serial execution of the recursive query, the problem didn't show up anymore - even with the unadjusted settings still causing serial downgrades.
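
One way to force that serial execution (a sketch using the object names of the generic test case further below - not necessarily the exact fix that was deployed) is to add NO_PARALLEL / NO_PARALLEL_INDEX hints to the recursive query inside the function:


-- Pin the recursive query to serial execution regardless of the table / index parallel settings
select /*+ no_parallel(f) no_parallel_index(f) no_parallel(d) no_parallel_index(d) */
       f.large_vc
into   s_value
from   t_driver d
     , t_parallel_index f
where  d.id = p_d_id
and    f.pkey = d.pkey
and    f.id = p_f_id;
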

What can be taken away from this?

- Oracle treats things in parallel slaves differently from things executed in normal sessions

- Using a DEFAULT parallel degree can lead to interesting effects when modifying the underlying hardware / configuration

- Even if your code ran without problems for years it doesn't mean there isn't a bug waiting for you

Just out of curiosity, here is a generic test case that can be used to reproduce the issue at will:


-----------------------------
-- The complete testcase
-----------------------------

-- Tested with the following settings:
-- NO RAC
-- PARALLEL_MAX_SERVERS = 4
-- PARALLEL_ADAPTIVE_MULTI_USER = TRUE
-- PARALLEL_AUTOMATIC_TUNING = FALSE
-- If the above parameters are different the setup needs to be adjusted

-----------------------------
-- The setup
-----------------------------

alter system set parallel_max_servers = 4 parallel_adaptive_multi_user = true scope = memory;

drop package pkg_recursive_query;

drop table t_parallel_index purge;

drop table t_driver purge;

create table t_parallel_index (
id integer not null,
large_vc varchar2(200),
pkey integer not null
)
partition by range (pkey)
(
partition pkey_0 values less than (1),
partition pkey_1 values less than (2),
partition pkey_2 values less than (3)
);

create index t_parallel_index_idx on t_parallel_index (id) local parallel 2;

exec dbms_random.seed(0)

insert into t_parallel_index (
id
, large_vc
, pkey
)
select
rownum as id
, dbms_random.string('X', 100)
, 1
from
dual
connect by
level <= 10000;

commit;

insert into t_parallel_index (
id
, large_vc
, pkey
)
select
rownum as id
, dbms_random.string('X', 100)
, 2
from
dual
connect by
level <= 10000;

commit;

exec dbms_stats.gather_table_stats(null, 't_parallel_index')

explain plan for
select
large_vc
from
t_parallel_index
where
id = 2;

-- This is going to be a parallel index range scan by default
select * from table(dbms_xplan.display);

create table t_driver (
id integer not null,
pkey integer not null
);

alter table t_driver add constraint pk_t_driver primary key (id);

insert into t_driver (
id
, pkey
) values (
100
, 1
);

insert into t_driver (
id
, pkey
) values (
200
, 2
);

commit;

exec dbms_stats.gather_table_stats(null, 't_driver')

explain plan for
select /*+ opt_param('parallel_execution_enabled', 'true') */
f.large_vc
from
t_driver d
, t_parallel_index f
where
d.id = 100
and f.pkey = d.pkey
and f.id = 2;

-- This is again going to be a parallel index range scan by default
select * from table(dbms_xplan.display);

-- The user-defined function performing recursive SQL
create or replace package pkg_recursive_query
as
function get_value(p_f_id in integer, p_d_id in integer) return varchar2 parallel_enable;
end pkg_recursive_query;
/

create or replace package body pkg_recursive_query
as
function get_value(p_f_id in integer, p_d_id in integer) return varchar2 parallel_enable
as
s_value varchar2(200);
begin
select /*+ opt_param('parallel_execution_enabled', 'true') */
/* no_parallel_index(f) */
f.large_vc
into
s_value
from
t_driver d
, t_parallel_index f
where
d.id = p_d_id
and f.pkey = d.pkey
and f.id = p_f_id;

return 'SID: ' || sys_context('USERENV', 'SID') || '|' || s_value;
exception
when NO_DATA_FOUND then
s_value := 'SID: ' || sys_context('USERENV', 'SID') || '|<NO DATA FOUND>' ;
return s_value;
end get_value;
end pkg_recursive_query;
/

-----------------------------
-- The steps to perform
-----------------------------

-- First session
-- The main query is executed in parallel
-- Due to PARALLEL_ENABLE the recursive query is pushed into the parallel slaves
-- Being executed in the slaves forces serial execution of the recursive query
-- Keep the result set open to allocate the parallel slaves
set echo on pause on arraysize 1 pagesize 3 linesize 120

column result format a50 truncated

select /*+ parallel(t, 2) full(t) */
pkg_recursive_query.get_value(t.id, 100) as result
from
t_parallel_index t
where
id <= 50;

-- Second session
-- Simply allocate the second set of parallel slaves
-- Now all parallel slaves are allocated
-- Any further request for parallel execution will be downgraded to serial
set echo on

variable c refcursor

begin
open :c for '
select /*+ parallel(t, 2) full(t) */
*
from
t_parallel_index t
where
id <= 50
';
end;
/

-- Third session
-- The main query is now downgraded to serial
-- Now that the main query is serialized the recursive query potentially can get executed in parallel
-- But no parallel slaves are available so it shares the serial execution with the existing child cursor in the library cache
-- Keep the result set open
set echo on pause on arraysize 1 pagesize 3 linesize 120

column result format a50 truncated

select /*+ parallel(t, 2) full(t) */
pkg_recursive_query.get_value(t.id, 100) as result
from
t_parallel_index t
where
id <= 50;

-- Fourth session forces invalidation of child cursor for recursive query
alter table t_parallel_index truncate partition pkey_0;

-- Now close the second session to free parallel slaves
-- Further executions of the recursive subquery of session 3 will be performed in parallel
-- as long as parallel slaves are available otherwise it will be downgraded to serial
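
-- Optional check (not part of the original script): while the result sets of session 1
-- and session 3 are kept open, the following query run from yet another session shows
-- which sessions currently hold parallel slaves along with the requested vs. actual degree.
-- V$PX_SESSION is available in 10.2 and 11g; adjust privileges as required.
select
qcsid
, sid
, server_group
, server_set
, degree
, req_degree
from
v$px_session
order by
qcsid
, server_group
, server_set;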

Sunday, July 11, 2010

Compression Restrictions

When using table or index compression certain restrictions apply. Since I always find it hard to gather that information from the manuals, and since some of the restrictions are not documented properly or not at all, I'll attempt to summarize the restrictions that I'm aware of here:

1. Compression attribute on subpartitions
Note that you can't define the COMPRESSION on subpartition level - the subpartitions inherit this attribute from the parent partition.

It is however perfectly valid and legal to exchange a subpartition with a compressed table and that way introduce individual compression on subpartition level. It is unclear why this is only possible using this indirect method - it simply looks like an implementation restriction. Since compressed data is most suitable for data warehouse environments, and these ought to use EXCHANGE [SUB]PARTITION technology anyway, this restriction is probably not that crucial.
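
Just to illustrate that workaround, here is a minimal sketch of such an exchange (all object names are made up; a list subpartition is used simply because it can be named explicitly):

create table t_comp_subpart (
id integer not null
, pkey integer not null
, skey integer not null
)
partition by range (pkey)
subpartition by list (skey)
(
partition p_1 values less than (2)
(
subpartition p_1_sp_1 values (1)
, subpartition p_1_sp_2 values (2)
)
);

create table t_sub_exchange
compress
as
select * from t_comp_subpart where 1 = 0;

-- a direct path load generates (basic) compressed data in the staging table
insert /*+ append */ into t_sub_exchange
select
rownum
, 1
, 1
from
dual
connect by
level <= 10000;

commit;

-- this introduces compressed data on subpartition level, although COMPRESS
-- cannot be specified for the subpartition directly
alter table t_comp_subpart exchange subpartition p_1_sp_1 with table t_sub_exchange;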

2. Number of columns
Basic heap table compression and the new OLTP (in 11.1 called "for all operations") compression are both limited to 255 columns. If you attempt to compress a table with more than 255 columns, no compression will be performed, although you don't get an error message. Interestingly, the COMPRESSION column in the dictionary still shows "ENABLED", which is certainly misleading.
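
A quick way to see this misleading dictionary information is something along the following lines (a hypothetical table created via dynamic SQL, so all names are made up; the COMPRESS_FOR column of USER_TABLES exists from 11.1 onwards):

declare
s_sql varchar2(32767) := 'create table t_many_cols (';
begin
for i in 1..300 loop
s_sql := s_sql || 'col' || i || ' varchar2(10), ';
end loop;
s_sql := s_sql || 'filler varchar2(10)) compress';
execute immediate s_sql;
end;
/

-- reports ENABLED (plus BASIC resp. DIRECT LOAD ONLY depending on the version)
-- although with more than 255 columns no compression will actually take place
select
table_name
, compression
, compress_for
from
user_tables
where
table_name = 'T_MANY_COLS';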

3. DDL Restrictions - modification of table structure
As soon as a segment contains compressed data or has been marked for compression (does not apply in all cases, see below for more information), the following restrictions regarding the modification of the table structure apply:

a. Basic heap table compression will not prevent the addition of more than 255 columns via ADD COLUMN - but if you update the segment the affected blocks will be silently decompressed during the update operation. In 9.2 no schema evolution is possible with compressed segments, see the next paragraph.

b. In Oracle 9.2 a lot of modifications to the table are no longer possible with compression enabled (and this applies to partitioned tables as well if at least one partition is compressed):

- You cannot add or drop any columns, and you can't even set any column unused. The error messages are, to say the least, confusing, since they tell you something about virtual and object columns, but in fact it is the enabled compression that causes them. These are serious limitations in terms of schema evolution. I'll address below what can be done to overcome this issue.


Oracle9i Enterprise Edition Release 9.2.0.8.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.8.0 - Production

SQL> create table t_part
2 compress
3 partition by list (user_id) (
4 partition p_default values (default)
5 )
6 as
7 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);
alter table t_part set unused (created)
*
ERROR at line 1:
ORA-12996: cannot drop system-generated virtual column

SQL> alter table t_part drop unused columns;

Table altered.

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-12996: cannot drop system-generated virtual column

SQL> alter table t_part add test_col varchar2(10);
alter table t_part add test_col varchar2(10)
*
ERROR at line 1:
ORA-22856: cannot add columns to object tables

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-22856: cannot add columns to object tables

SQL>


c. In Oracle 10.2 some restrictions have been lifted but there are still a couple of limitations:

- You can now add columns and you can set columns unused, but you can't drop columns, nor can you add columns with default values

This allows schema evolution but has interesting side effects if you attempt to use for example a Create Table As Select (CTAS) statement to dynamically create a copy of a table and then perform an EXCHANGE PARTITION operation with that newly created table: If there are unused columns in the source table, these won't be copied over to the table created via CTAS, and therefore the EXCHANGE PARTITION operation will fail with column mismatch error messages (see the sketch after the following list). Since you can't drop the unused columns as long as compression is enabled on the table (more on that later, since it is a bit more complex with partitioned tables), you effectively can't use that approach. In order to perform EXCHANGE PARTITION with compressed tables and evolved schemas that include unused columns you basically have three choices:

- Maintain the "EXCHANGE" table along with the target table which means apply the same DDL in the same order to both tables in order to keep them synchronized

- Extract the hidden columns from the dictionary and create an exchange table with the correct hidden/unused columns, which can turn into a complex operation if you need to cover all possible column data types and features including LOBs etc.

- Perform some kind of rebuild operation on the target table to get rid of the unused columns. Some ideas regarding this point are going to follow further below
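
Here is a minimal sketch of the failure mode just described (assuming a compressed, partitioned table T_PART with at least one unused column, similar to the examples shown below; the exchange table name is made up):

-- the CTAS copy does not include the hidden / unused columns of T_PART
create table t_exchange
as
select * from t_part where 1 = 0;

-- therefore this fails with a column mismatch error message
alter table t_part exchange partition p_123 with table t_exchange;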


Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table t_part
2 compress
3 partition by list (user_id) (
4 partition p_default values (default)
5 )
6 as
7 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part add test_col varchar2(10);

Table altered.

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL>


d. Oracle 11 introduces Advanced Compression which is COMPRESS FOR ALL OPERATIONS in 11.1 or OLTP compression in 11.2 (don't confuse this with HCC compression levels)

Basic compression in 11.1 has the same limitations as 10.2, but with Advanced Compression (called COMPRESS FOR ALL OPERATIONS in 11.1) all of the mentioned schema evolution restrictions have been lifted, as can be seen below.


Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table t_part
2 --compress
3 compress for all operations
4 partition by list (user_id) (
5 partition p_default values (default)
6 )
7 as
8 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);

Table altered.

SQL> alter table t_part drop unused columns;

Table altered.

SQL> alter table t_part drop column username;

Table altered.

SQL> alter table t_part add test_col varchar2(10);

Table altered.

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL>


This is more of a theoretical nature since you usually won't mix basic with advanced compression, but things become more interesting when basic and advanced compression are mixed in different partitions of the same table:


Oracle Database 11g Enterprise Edition Release 11.1.0.7.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table t_part
2 compress
3 --compress for all operations
4 partition by list (user_id) (
5 partition p_default values (default)
6 )
7 as
8 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part add test_col varchar2(10);

Table altered.

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part split partition p_default values (123)
2 into (partition p_123 compress for all operations, partition p_default);

Table altered.

SQL> select
2 partition_name
3 , compression
4 , compress_for
5 from
6 user_tab_partitions
7 where
8 table_name = 'T_PART';

PARTITION_NAME COMPRESS COMPRESS_FOR
------------------------------ -------- ------------------
P_DEFAULT ENABLED DIRECT LOAD ONLY
P_123 ENABLED FOR ALL OPERATIONS

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL>


Notice the inconsistent behaviour: The first DDL command after having added a partition with advanced compression succeeds, but the following DDLs fail with an error message.

If the order of these DDLs is changed, it becomes obvious that it is always the first DDL that succeeds, no matter which of the otherwise failing ones comes first.

Furthermore, in 11.1.0.7, if you have a mixture of Advanced Compression (COMPRESS FOR ALL OPERATIONS) and NOCOMPRESS partitions, the above restrictions still apply - only if you have set all partitions to COMPRESS FOR ALL OPERATIONS are the restrictions lifted, which looks more like a bug than a feature.

I'll go into more details below which settings influence this behaviour.

e. Oracle 11.2 introduces Hybrid Columnar Compression (HCC), which is only enabled in conjunction with Exadata

In 11.2 basic compression still has the same restrictions as already mentioned above, and if at least one partition of a partitioned table has advanced compression enabled then some of the restrictions are lifted - even with a mixture of basic and advanced compression or a mixture of no compression and advanced compression. However, adding a column with a default value is still only supported if all partitions are compressed with Advanced Compression. I'm not sure if this is really a feature, since at least when mixing Advanced Compression with no compression, adding a column with a default value should be possible.


Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table t_part
2 compress
3 --compress for all operations
4 partition by list (user_id) (
5 partition p_default values (default)
6 )
7 as
8 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part add test_col varchar2(10);

Table altered.

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part split partition p_default values (123)
2 into (partition p_123 compress for all operations, partition p_default);

Table altered.

SQL> select
2 partition_name
3 , compression
4 , compress_for
5 from
6 user_tab_partitions
7 where
8 table_name = 'T_PART';

PARTITION_NAME COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
P_123 ENABLED OLTP
P_DEFAULT ENABLED BASIC

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop unused columns;

Table altered.

SQL> alter table t_part drop column username;

Table altered.

SQL> alter table t_part move partition p_default compress for all operations;

Table altered.

SQL> select
2 partition_name
3 , compression
4 , compress_for
5 from
6 user_tab_partitions
7 where
8 table_name = 'T_PART';

PARTITION_NAME COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
P_123 ENABLED OLTP
P_DEFAULT ENABLED OLTP

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL>


When using HCC the most important difference is that it is not restricted to 255 columns - it successfully compresses segments with any number of columns (up to the hard limit of 1,000 columns).

HCC can only be used with direct-path operations; any conventional DML will decompress the affected blocks and use OLTP compression instead - which means that if the segment has more than 255 columns the compression will effectively be disabled, since any non-HCC compression supports only up to 255 columns.
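
If you want to check what actually happened to individual rows, something along the following lines can be used (a minimal sketch, assuming 11.2 with Exadata storage; the table name is made up, and DBMS_COMPRESSION.GET_COMPRESSION_TYPE returns a number that maps to the package constants like COMP_FOR_QUERY_HIGH, COMP_FOR_OLTP or COMP_NOCOMPRESS):

-- CTAS is a direct-path operation, so this data gets HCC compressed
create table t_hcc
compress for query high
as
select
rownum as id
, rpad('x', 100) as large_vc
from
dual
connect by
level <= 10000;

-- conventional insert - this row will not be stored in HCC format
insert into t_hcc values (10001, rpad('y', 100));

commit;

-- compare the compression type of a direct-path loaded row and the conventionally inserted row
select
id
, dbms_compression.get_compression_type(user, 'T_HCC', rowid) as comp_type
from
t_hcc
where
id in (1, 10001);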

HCC has only a few restrictions regarding schema evolution - the most noticeable one is that DROP COLUMN is always silently turned into a SET COLUMN UNUSED, so with HCC enabled you can't really get rid of columns, since DROP UNUSED COLUMNS is not supported and raises an error. This means that with HCC enabled every drop / add column cycle moves the table closer to the hard limit of 1,000 columns.
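
A simple way to verify that such a DROP COLUMN was silently converted into SET UNUSED is to check the unused column count in the dictionary afterwards (a minimal sketch, assuming the T_PART table used in the example below):

-- shows one entry per table that has unused columns, including their count
select * from user_unused_col_tabs where table_name = 'T_PART';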

Again this has the same side effect as described above - a simple CTAS operation to create a table for the exchange won't work any longer once there are unused columns.


SQL> create table t_part
2 compress for query high
3 --compress for all operations
4 partition by list (user_id) (
5 partition p_default values (default)
6 )
7 as
8 select * from all_users;

Table created.

SQL> alter table t_part set unused (created);

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;

Table altered.

SQL> alter table t_part add test_col varchar2(10);

Table altered.

SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL> alter table t_part split partition p_default values (123)
2 into (partition p_123 compress for all operations, partition p_default);

Table altered.

SQL> select
2 partition_name
3 , compression
4 , compress_for
5 from
6 user_tab_partitions
7 where
8 table_name = 'T_PART';

PARTITION_NAME COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
P_123 ENABLED OLTP
P_DEFAULT ENABLED QUERY HIGH

SQL> alter table t_part add test_col3 varchar2(10) default 'BLA';

Table altered.

SQL> alter table t_part drop unused columns;
alter table t_part drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table t_part drop column username;
alter table t_part drop column username
*
ERROR at line 1:
ORA-00904: "USERNAME": invalid identifier

SQL> alter table t_part move partition p_default compress for all operations;

Table altered.

SQL> select
2 partition_name
3 , compression
4 , compress_for
5 from
6 user_tab_partitions
7 where
8 table_name = 'T_PART';

PARTITION_NAME COMPRESS COMPRESS_FOR
------------------------------ -------- ------------
P_123 ENABLED OLTP
P_DEFAULT ENABLED OLTP

SQL> alter table t_part add test_col4 varchar2(10) default 'BLA';

Table altered.

SQL>


And here is what happens if you're already hitting the 1,000 columns limit and attempt to use DROP [UNUSED] COLUMNS with HCC enabled:


SQL> alter table many_x_cols drop (col256);

Table altered.

SQL> alter table many_x_cols drop unused columns;
alter table many_x_cols drop unused columns
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> alter table many_x_cols add (col256 varchar2(10));
alter table many_x_cols add (col256 varchar2(10))
*
ERROR at line 1:
ORA-01792: maximum number of columns in a table or view is 1000

SQL> alter table many_x_cols move pctfree 0 nocompress;

Table altered.

SQL> alter table many_x_cols drop unused columns;

Table altered.

SQL> alter table many_x_cols add (col256 varchar2(10));

Table altered.

SQL>


As you can see, dropping columns is only supported with HCC disabled, which in this particular case allows adding another column after dropping the hidden/unused ones.

- Index restrictions

* You cannot compress a table or a partition of a table (or exchange a compressed partition into it) if there are usable bitmap indexes defined on the table. This is due to the different Hakan factor that is reset when using compression - with compressed segments more rows fit into a single block, leading to a different Hakan factor used for the bitmap indexes. The bitmap indexes need to be set unusable (all partitions in case of a partitioned table/index) or dropped before enabling the compression, and rebuilt / recreated after compressing the heap segment. This is also officially documented. It is however interesting to note that this restriction is only enforced when switching from uncompressed to compressed segments / partitions. Changing the compression, for example from basic compression to HCC compression, is possible without hitting this restriction, but due to the different compression levels this in my opinion potentially leads again to significantly different Hakan factors - so actively resetting the Hakan factor using "ALTER TABLE ... MINIMIZE RECORDS_PER_BLOCK" and rebuilding the bitmap indexes might be in order when moving between different compression levels.


Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter table t_part move partition p_default nocompress;

Table altered.

SQL> alter table t_part move partition p_123 nocompress;

Table altered.

SQL> alter table t_part modify default attributes nocompress;

Table altered.

SQL> alter table t_part nocompress;

Table altered.

SQL> create bitmap index t_part_idx_bitmap on t_part (test_col) local;

Index created.

SQL> alter table t_part move partition p_default compress basic;
alter table t_part move partition p_default compress basic
*
ERROR at line 1:
ORA-14646: Specified alter table operation involving compression cannot be
performed in the presence of usable bitmap indexes

SQL> alter table t_part minimize records_per_block;
alter table t_part minimize records_per_block
*
ERROR at line 1:
ORA-28602: statement not permitted on tables containing bitmap indexes

SQL> alter index t_part_idx_bitmap modify partition p_default unusable;

Index altered.

SQL> alter table t_part move partition p_default compress basic;
alter table t_part move partition p_default compress basic
*
ERROR at line 1:
ORA-14646: Specified alter table operation involving compression cannot be
performed in the presence of usable bitmap indexes

SQL> alter table t_part minimize records_per_block;
alter table t_part minimize records_per_block
*
ERROR at line 1:
ORA-28602: statement not permitted on tables containing bitmap indexes

SQL> alter index t_part_idx_bitmap unusable;

Index altered.

SQL> alter table t_part move partition p_default compress basic;

Table altered.

SQL> alter table t_part minimize records_per_block;

Table altered.

SQL> alter index t_part_idx_bitmap rebuild partition p_default;

Index altered.

SQL> alter index t_part_idx_bitmap rebuild partition p_123;

Index altered.

SQL> alter table t_part move partition p_default nocompress update indexes;

Table altered.



As you can see, switching on the compression is only allowed if all partitions of the bitmap index(es) are unusable - the opposite is not true: you can uncompress afterwards without any restrictions.

* This is related to B*Tree index compression and again more of a theoretical nature, but it might become relevant when trying to introduce B*Tree index compression partition-wise via EXCHANGE PARTITION: You can't compress a single partition of a partitioned B*Tree index (or exchange a partition with a compressed index into the table) if the whole index has not initially been created with compression enabled. If you drop the index and recreate the whole index with compression enabled then you can turn the index compression individually on and off per index partition.

Very likely this is related to the restriction that the B*Tree index compression prefix length needs to be the same across all index partitions - it can only be defined on global level, and I assume this is the reason why the index needs to be created compressed first. Since you usually want all partitions of a partitioned index to be compressed the same way, this is probably a rather irrelevant restriction (except for the mentioned case regarding EXCHANGE PARTITION).


Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create index t_part_idx on t_part (user_id, test_col) nocompress local;

Index created.

SQL> alter index t_part_idx rebuild partition p_default compress;
alter index t_part_idx rebuild partition p_default compress
*
ERROR at line 1:
ORA-28659: COMPRESS must be specified at object level first

SQL> select
2 partition_name
3 , compression
4 from
5 user_ind_partitions
6 where
7 index_name = 'T_PART_IDX';

PARTITION_NAME COMPRESS
------------------------------ --------
P_123 DISABLED
P_DEFAULT DISABLED

SQL> drop index t_part_idx;

Index dropped.

SQL> create index t_part_idx on t_part (user_id, test_col) compress local;

Index created.

SQL> alter index t_part_idx rebuild partition p_default nocompress;

Index altered.

SQL> select
2 partition_name
3 , compression
4 from
5 user_ind_partitions
6 where
7 index_name = 'T_PART_IDX';

PARTITION_NAME COMPRESS
------------------------------ --------
P_123 ENABLED
P_DEFAULT DISABLED

SQL> drop index t_part_idx;

Index dropped.

SQL> create index t_part_idx on t_part (user_id, test_col) nocompress global
2 partition by range(user_id) (
3 partition p_default values less than (maxvalue)
4 );

Index created.

SQL> alter index t_part_idx rebuild partition p_default compress;
alter index t_part_idx rebuild partition p_default compress
*
ERROR at line 1:
ORA-28659: COMPRESS must be specified at object level first

SQL> drop index t_part_idx;

Index dropped.

SQL> create index t_part_idx on t_part (user_id, test_col) compress global
2 partition by range(user_id) (
3 partition p_default values less than (maxvalue)
4 );

Index created.

SQL> alter index t_part_idx rebuild partition p_default nocompress;

Index altered.

SQL> select
2 partition_name
3 , compression
4 from
5 user_ind_partitions
6 where
7 index_name = 'T_PART_IDX';

PARTITION_NAME COMPRESS
------------------------------ --------
P_DEFAULT DISABLED

SQL>


- Overcoming any of the schema evolution restrictions by temporarily uncompressing / disabling enabled compression


As already mentioned above there are different options for overcoming the schema evolution restrictions. One obvious option, although very likely not applicable in many cases simply due to space and load constraints, is to temporarily uncompress the segments and disable the compression. It is interesting to note that it is not as simple as it seems to undo compression with partitioned tables, as I will demonstrate - I think I've even read somewhere that it is not possible at all to revert a table into a state where the mentioned restrictions no longer apply. This is definitely not true.

First of all it is important to note that, in addition to the obvious decompression of each partition that contains compressed data, the "COMPRESS" attribute that determines whether a suitable operation will generate compressed data or not needs to be reset to NOCOMPRESS (but see below for "oddities"). So there is a significant difference between "ALTER ... [NO]COMPRESS" and "ALTER ... MOVE [NO]COMPRESS". The latter reorganizes the segment / partition and compresses / uncompresses the data while doing so; the former just "marks" the segment / partition as "enabled/disabled for compression" but doesn't touch the data and leaves it in whatever state it is at present. As a side note, the "COMPRESSION" column in the dictionary cannot be used to distinguish between these two - you can have the column show "DISABLED" with data still compressed, you can have the column show "ENABLED" with no compressed data, and you can end up with a mixture of both - not every block of a segment is necessarily in a compressed state.

Furthermore the "COMPRESSION" attribute can be defined on multiple levels with partitioned tables:

- On the global TABLE level: ALTER TABLE ... [NO]COMPRESS

- On the partition level as default attributes: ALTER TABLE ... MODIFY DEFAULT ATTRIBUTES [NO]COMPRESS

- On the partition level to mark for [NO]COMPRESSION: ALTER TABLE ... MODIFY PARTITION [NO]COMPRESS

- On the partition level with MOVE: ALTER TABLE ... MOVE PARTITION [NO]COMPRESS ... [UPDATE [GLOBAL] INDEXES]

Now in order to be able to overcome the schema evolution restrictions all levels except the partition default attribute level need to be addressed: Any compressed data needs to be uncompressed, but also any partition that has been marked for compression (but doesn't necessarily contain compressed data) needs to be modified as well - finally the global table level also needs to be reset.

The setting on global table level seems to be the crucial step. It looks like Oracle checks during this operation (ALTER TABLE ... NOCOMPRESS) if all subordinate (sub-)partitions have a flag set that marks them as uncompressed. Only if that is the case some internal flag on global table level can also be reset so that the restrictions no longer apply.

There are however some oddities that need to be considered when doing so:

- You need to "move" any partition (ALTER TABLE ... MOVE PARTITION NOCOMPRESS) that has been marked for compression if any other partition actually held compressed data - simply "unmarking" it using "ALTER TABLE ... MODIFY PARTITION NOCOMPRESS" is not sufficient even if it does not hold any compressed data (for example, it has only been marked with "ALTER TABLE ... MODIFY PARTITION COMPRESS").

- If you have BITMAP INDEXES defined on the table with compressed partitions, you need to either set the whole indexes unusable or drop them before the COMPRESSION can be effectively disabled on global table level using ALTER TABLE ... NOCOMPRESS.

See the following test case for all steps to unwind this:


Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table t_part
2 nocompress
3 partition by list (user_id) (
4 partition p_default values (default)
5 )
6 as
7 select * from all_users;

Table created.

SQL> alter table t_part split partition p_default values (123)
2 into (partition p_123, partition p_default);

Table altered.

SQL> -- Compress one partition
SQL> alter table t_part move partition p_default compress;

Table altered.

SQL> -- And uncompress it again
SQL> alter table t_part move partition p_default nocompress;

Table altered.

SQL> -- Doesn't work!
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> -- Compress one partition
SQL> alter table t_part move partition p_default compress;

Table altered.

SQL> -- And uncompress it again
SQL> alter table t_part move partition p_default nocompress;

Table altered.

SQL> -- The global table level was missing!
SQL> alter table t_part nocompress;

Table altered.

SQL> -- Now it works
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL> -- Start again
SQL> alter table t_part drop column test_col2;

Table altered.

SQL> -- Compress one partition
SQL> alter table t_part move partition p_default compress;

Table altered.

SQL> -- Mark in addition another partition for compression (note: no data gets compressed!)
SQL> alter table t_part modify partition p_123 compress;

Table altered.

SQL> -- And uncompress the compressed partition again
SQL> alter table t_part move partition p_default nocompress;

Table altered.

SQL> -- Unmark the other partition marked for compression (note: no data was compressed!)
SQL> alter table t_part modify partition p_123 nocompress;

Table altered.

SQL> -- The global table level
SQL> alter table t_part nocompress;

Table altered.

SQL> -- But: Doesn't work!
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> -- Oddity 1: Although it doesn't contain compressed data
SQL> -- this partition needs to be reorganized as well
SQL> alter table t_part move partition p_123 nocompress;

Table altered.

SQL> -- Don't forget the global table level, otherwise below doesn't work
SQL> alter table t_part nocompress;

Table altered.

SQL> -- Now it works
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL> -- Start again
SQL> alter table t_part drop column test_col2;

Table altered.

SQL> -- Compress one partition
SQL> alter table t_part move partition p_default compress;

Table altered.

SQL> -- Add a bitmap index
SQL> create bitmap index t_part_idx_bitmap on t_part (username) local;

Index created.

SQL> -- And uncompress the compressed partition again
SQL> alter table t_part move partition p_default nocompress update indexes;

Table altered.

SQL> -- The global table level
SQL> alter table t_part nocompress;

Table altered.

SQL> -- But: Doesn't work!
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> -- Oddity 2: Bitmap indexes on compressed tables need to be set unusable or dropped
SQL> -- Note it is not sufficient to set single partitions unusable
SQL> alter index t_part_idx_bitmap unusable;

Index altered.

SQL> -- But still: Doesn't work because the global level has not been touched again
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';
alter table t_part add test_col2 varchar2(10) default 'BLA'
*
ERROR at line 1:
ORA-39726: unsupported add/drop column operation on compressed tables

SQL> -- The global table level
SQL> alter table t_part nocompress;

Table altered.

SQL> -- Now it works
SQL> alter table t_part add test_col2 varchar2(10) default 'BLA';

Table altered.

SQL>