Oracle Data Pump: Optimising LOB Export Performance
LOBs, as we all know, are used to store large, unstructured data such as video, audio and photo images. A LOB can store up to 4 gigabytes of data. They are similar to LONG and LONG RAW columns but differ from them in quite a few ways.
The kinds of data a LOB can hold fall into these categories:
• Simple structured data
• Complex structured data
• Semi-structured data
• Unstructured data
LOBs are effective, but they come with a performance impact in a few scenarios.
Recently there was a task to take a backup of a non-partitioned LOB table of around 41 GB, of which the LOB segment itself accounted for roughly 33 GB. There are many ways to back up such a table; the team started with a plain EXPDP approach, which took around 3 hours, and that was not acceptable.
The LOB segment is not exported in parallel at all, so there was no point in specifying a parallel degree in the expdp command.
Since the activity was critical and the clock was ticking, I worked out a way to do it concurrently: split the table into chunks and run a separate Data Pump export for each chunk, so that each expdp session works on a small slice rather than the full volume. Running the chunks concurrently reduces the overall time in any case.
Here is the simple and effective script I designed; to my surprise, it reduced the elapsed time from 3 hours to 10-15 minutes.
What I did was logically divide the table into 10 chunks based on the rowid block number (using the DBMS_ROWID package). These chunks are naturally balanced, i.e. the data is split almost equally among them.
The snippet below launches 10 Data Pump sessions, each running concurrently in the background. It produced 10 separate dump files and took just 10-20 minutes for the complete export.
#!/bin/bash
# Number of chunks; each chunk becomes one concurrent expdp session.
chunk=10
for ((i=0; i<chunk; i++)); do
  # Export only the rows whose block number maps to this chunk.
  expdp USERNAME/Password@DB_NAME TABLES=LOB_TEST \
    QUERY="LOB_TEST:\"where mod(dbms_rowid.rowid_block_number(rowid), ${chunk}) = ${i}\"" \
    directory=DMP dumpfile=lob_test_${i}.dmp logfile=log_test_${i}.log &
  echo "Started export for chunk ${i}"
done
# Wait for all background exports to finish before the script exits.
wait
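Before firing off the exports, it is worth confirming that the mod-of-block-number split really does spread the rows roughly evenly. A quick sanity check along these lines (a sketch, assuming sqlplus access with the same credentials and the LOB_TEST table used in the script) shows the row count per chunk:

#!/bin/bash
# Sanity check: row count per chunk for the mod-of-block-number split.
sqlplus -s USERNAME/Password@DB_NAME <<'EOF'
SELECT MOD(DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid), 10) AS chunk_no,
       COUNT(*)                                       AS row_count
  FROM LOB_TEST
 GROUP BY MOD(DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid), 10)
 ORDER BY chunk_no;
EOF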
Since there are 10 separate dump files, the import can also be done chunk by chunk, which reduces the time of the import activity as well.
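The original activity stopped at the export, but the same pattern carries over to the import side. A minimal sketch (assuming the same DMP directory object and credentials, and that the table already exists on the target so TABLE_EXISTS_ACTION=APPEND is the right choice) could look like this:

#!/bin/bash
# Import the 10 chunk dumps concurrently; each impdp appends its slice.
for ((i=0; i<10; i++)); do
  impdp USERNAME/Password@DB_NAME TABLES=LOB_TEST \
    directory=DMP dumpfile=lob_test_${i}.dmp logfile=imp_lob_test_${i}.log \
    TABLE_EXISTS_ACTION=APPEND &
done
# Wait for all background imports to complete.
wait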
Of course, there are other ways to do it (in relation to expdp):
a) DBMS_PARALLEL_EXECUTE, where we can define a task, generate rowid chunks and use those rowid ranges during the export (see the sketch after this list).
b) If the table is partitioned, take the backup partition by partition in concurrent mode.
c) Use dbms_rowid.rowid_create.
d) Use SecureFile for storing the LOB rather than BasicFile.
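For option (a), a rough sketch of how DBMS_PARALLEL_EXECUTE could generate the rowid ranges is shown below. The task name, owner and chunk size are illustrative assumptions; each START_ROWID/END_ROWID pair it produces would then be plugged into an expdp QUERY clause of the form "where rowid between '<start>' and '<end>'".

#!/bin/bash
# Sketch of option (a): let DBMS_PARALLEL_EXECUTE carve the table into
# rowid ranges; one expdp per range can then export its slice.
sqlplus -s USERNAME/Password@DB_NAME <<'EOF'
SET PAGESIZE 0 FEEDBACK OFF
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK('LOB_TEST_EXP');
  -- by_row => FALSE: chunk_size is a number of blocks, not rows.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'LOB_TEST_EXP',
    table_owner => 'USERNAME',
    table_name  => 'LOB_TEST',
    by_row      => FALSE,
    chunk_size  => 10000);
END;
/
-- Each row below is a rowid range that can drive one expdp QUERY clause.
SELECT chunk_id, start_rowid, end_rowid
  FROM user_parallel_execute_chunks
 WHERE task_name = 'LOB_TEST_EXP'
 ORDER BY chunk_id;
EOF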