Tuesday 19 April 2016

CLUSTERING FACTOR AND TABLE_CACHED_BLOCKS: ORACLE 12C FEATURE


Week off;  somehow I managed to get time to write on 12C behaviour. Core idea was to gather performance artefacts about concurrent insert scenarios but to surprise I went on to discover clustering factor and its internals. I tested this in 12.1.0.1.0

Have created one table; On initial basis I tried just running regular insert from multiple sessions and to see it how it goes.

create table TEMP_DATA (id number, name varchar2(30)) tablespace POOL_DATA;
  
create sequence TEMP_DATA_seq order;

CREATE OR REPLACE PROCEDURE temp_data_proc AS
BEGIN
FOR i IN 1..100000 LOOP
INSERT INTO TEMP_DATA VALUES (TEMP_DATA_seq.NEXTVAL, 'VINEET');
COMMIT;
END LOOP;
END;
/

now from 2 different sessions I triggered the procedure to load the table.

Sess1: exec temp_data_proc;

Sess2: exec temp_data_proc;

Then created index

create index IND_IXX on TEMP_DATA(id);

Gathered the stats and was really surprised to see the clustering factor of Index.

SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i
WHERE t.table_name = i.table_name AND i.index_name='IND_IXX';

TABLE_NAME                     INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
TEMP_DATA                      IND_IXX                               496     200000            116653

Second Approach: Dropped and recreated all the object as mentioned above and ran the insert procedure from single session one by one i.e.

executed 2 times temp_data_proc in same session; the clustering factor here is way less.


SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i
WHERE t.table_name = i.table_name AND i.index_name='IND_IXX';

TABLE_NAME                     INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
TEMP_DATA                      IND_IXX                               496     200000               441------------à Drastically reduced.

Drilling down more; and noted that running insert from different session can be a problem as each one work on slight different free list so they might pick different block. While in one session one free list is active and once it is over it picks different free list which automatically deals like you have blocks are accessed in right order.

so while creating Index in case of running multiple insert from different session; oracle will do lot of back and forth while generating clustering factor and hence it generated high clustering factor. In case of same session insert with same volume; blocks were accessed in ordered form and hence it generates less clustering factor.

Index stores rowid and data; now how it increments clustering factor?

Basically rowid is representation of blocks and files etc; so while generating clustering factor it check the reference block and see if it same or different from previous one;
if it is different than it increment it by 1 and so this process go on and on.

So this large clustering factor seems an absurd behaviour as far as concurrent session scenario is concerned;  to come up for this 12C introduced  TABLE_CACHED_BLOCKS ; The purpose of the table_cached_blocks parameter is set using set_table_prefs and it helps to control the computed clustering_factor when running the dbms_stats utility and this column is not present in 11.1.0 to 11.2.0.3; not sure about 11.2.0.4. in Version 11g this basically comes up after applying Oracle patches.

TABLE_CACHED_BLOCKS: default is 1 and max is 255; for my test I used 255 so that there will be a very little scope for clustering factor to go wrong.

TEST case : with TABLE_CACHED_BLOCKS

Drop table  TEMP_DATA;

Drop  sequence TEMP_DATA_seq;

create table TEMP_DATA (id number, name varchar2(30)) tablespace POOL_DATA;
 
  
create sequence TEMP_DATA_seq order;

CREATE OR REPLACE PROCEDURE temp_data_proc AS
BEGIN
FOR i IN 1..100000 LOOP
INSERT INTO TEMP_DATA VALUES (TEMP_DATA_seq.NEXTVAL, 'VINEET');
COMMIT;
END LOOP;
END;
/

exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'TEMP_DATA',pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

next have 2 different sessions where we run the procedure concurrently.

Sess1: exec temp_data_proc;

Sess2: exec temp_data_proc;

Then created index again

create index IND_IXX on TEMP_DATA(id);

Gathered the stats and I see clustering factor is quite similar to single session and is optimal.

SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i
WHERE t.table_name = i.table_name AND i.index_name='IND_IXX';


TABLE_NAME                     INDEX_NAME                         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------------------------ ------------------------------ ---------- ---------- -----------------
TEMP_DATA                      IND_IXX                               496     200000               468


I Hope you liked it..

Enjoy Learning....



8 comments:

Foued said...

Nice post, thank you for sharing.
Foued

Vineet Arya said...

Many Thanks Foued. Glad that you liked it.

Akhand Singh said...

Well detailed explanation, as usual.

Thanks for sharing !!

Joaquin Gonzalez said...

Setting TABLE_CACHED_BLOCKS to the maximum number of sessions inserting in the table at the same time is enough. In this case, setting it to 2 makes clustering_factor good enough. That way, oracle has to "remember" last 2 blocks while gathering stats.

Vineet Arya said...

Thanks Akhand...

Vineet Arya said...

Thanks Joaquin..

Unknown said...

Vineet , I am assuming no behavior change in actual concurrent insert (I mean internal)

Vineet Arya said...

Hello Uday, Yes there is no change in concurrent insert behaviour.

Post a Comment