How to improve datapump operation performance
I. Data Pump parameters that affect Data Pump performance

access_method

In some cases the access method selected by the Data Pump API does not provide the fastest access to your data set. There is no way to know which access method is more efficient other than setting this parameter explicitly and testing each one. The parameter has two options: direct_path and external_table.
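As a sketch of such a test, the same table export can be run once with each access method and the timings compared. The directory object dpdir and the table scott.emp below are placeholder names:

```text
# export_am.par -- hypothetical parameter file; dpdir and scott.emp are placeholders
directory=dpdir
dumpfile=emp_am_test.dmp
tables=scott.emp
access_method=direct_path
# rerun with access_method=external_table and compare the elapsed times
```

Invoke it with, for example, expdp scott parfile=export_am.par, then repeat with the other access method.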

cluster=n

In a RAC environment this can significantly speed up basic Data Pump API operations. Note that this parameter affects only Data Pump API operations, and in a RAC environment it is recommended to set it to n. By contrast, the impact of setting parallel_force_local to true is not limited to Data Pump API operations.
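A minimal sketch of a full export in a RAC environment with this setting (directory object and dump-file names are placeholders):

```text
# export_rac.par -- hypothetical parameter file for a RAC environment
directory=dpdir
dumpfile=full%U.dmp
full=y
cluster=n   # keep all worker processes on the instance where the job starts
```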

data_options=disable_append_hint

This is an impdp-only parameter. It is safe to use in very specific cases and may reduce the time needed to import data. Use the data_options=disable_append_hint parameter only if all of the following conditions are met:

1. The import operation will import data into an existing table, partition, or subpartition

2. The number of existing objects being imported into is very small (for example, 10 or fewer)

3. While the import operation runs, other sessions execute only select statements against the objects being imported.

The data_options=disable_append_hint parameter is available only in 11.2.0.1 and later. Using data_options=disable_append_hint saves time only when waiting for locks held by other sessions on the imported objects would otherwise take a long time.
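Assuming all three conditions hold, an import into an existing table might be sketched as follows (directory, dump file, and table names are placeholders):

```text
# import_append.par -- hypothetical parameter file; use only when the three
# conditions above are all met
directory=dpdir
dumpfile=emp.dmp
tables=scott.emp
table_exists_action=append
data_options=disable_append_hint   # available in 11.2.0.1 and later
```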

estimate

The estimate parameter has two mutually exclusive options, blocks and statistics. Using the blocks method to estimate the size of the data set during an export takes longer than using the statistics method. However, the size estimated with the blocks method is more accurate than the size estimated with the statistics method. If estimating the size of the export file is not the primary concern, it is recommended to use estimate=statistics.
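A minimal sketch of a schema export using the faster estimate method (schema and directory names are placeholders):

```text
# export_est.par -- hypothetical parameter file
directory=dpdir
dumpfile=scott.dmp
schemas=scott
estimate=statistics   # faster but less accurate than estimate=blocks
```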

exclude=comment

In some cases, end users do not need the comments on columns and object types; if these are skipped, the Data Pump operation takes less time to execute.
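A sketch of an export that skips comments (schema and directory names are placeholders):

```text
# export_nocomment.par -- hypothetical parameter file
directory=dpdir
dumpfile=scott.dmp
schemas=scott
exclude=comment   # skip column and object-type comments
```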

exclude=statistics

Excluding statistics from the export will shorten the overall export operation if the statistics are not needed. After the data has been imported into the target database, statistics can be regenerated with the dbms_stats.gather_database_stats procedure. A Data Pump operation may hang indefinitely when the Data Pump engine and another RDBMS session generate statistics for a small table in parallel. For Data Pump operations that run for an hour or more, consider temporarily disabling the database's automatic statistics collection task so that the Data Pump operation does not compete with it. To disable the 11g automatic statistics collection task, execute the following command as the sys user:

exec dbms_auto_task_admin.disable(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);

After the Data Pump operation completes, re-enable the statistics collection task:

exec dbms_auto_task_admin.enable(client_name => 'auto optimizer stats collection', operation => NULL, window_name => NULL);

To temporarily disable the 10g auto stats collection task so that the DataPump operation does not compete with it, execute the following command as the sys user:

exec sys.dbms_scheduler.disable('GATHER_STATS_JOB');

Restart the statistics collection task after the DataPump operation completes:

exec sys.dbms_scheduler.enable('GATHER_STATS_JOB');
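Putting the exclusion itself into the export, a minimal sketch looks like this (schema and directory names are placeholders); statistics would then be regathered on the target with dbms_stats after the import:

```text
# export_nostats.par -- hypothetical parameter file
directory=dpdir
dumpfile=scott.dmp
schemas=scott
exclude=statistics
```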

network_link

Using this parameter effectively limits the parallelism of the Data Pump API, and unless your network throughput and bandwidth are better than the local device, using network_link will be much slower than using an export file. Because it tends to be much slower than dump-file operations, network_link is recommended only as a last resort. Instead of network_link, consider storing the dump file on a mobile or ****-enabled device to perform the data migration.
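For the cases where it is unavoidable, a network-mode import runs without a dump file at all; a minimal sketch (srcdb_link is a placeholder database link to the source database):

```text
# import_netlink.par -- hypothetical parameter file; srcdb_link is a placeholder
directory=dpdir        # used only for the log file in network mode
logfile=net_imp.log
network_link=srcdb_link
schemas=scott
```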

parallel

Parallel execution has a positive impact on performance when the host has multiple CPUs, the operation is not CPU-bound, disk-I/O-bound, or memory-bound, and multiple dump files are specified in the dumpfile parameter. If the parallel parameter is set to N with N>1, it is recommended that the dumpfile parameter specify at least as many files as the value of parallel so that parallel execution can be used effectively.

It should be noted that the parallel parameter is an upper limit on the number of concurrent Data Pump worker processes that the Data Pump API can use; the Data Pump API may use fewer worker processes than specified, depending on bottlenecks in the host environment. If the value specified for parallel is less than the number of available CPUs, the basic Data Pump API operation may in fact be faster.
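A sketch of a parallel export where the %U substitution lets the number of dump files match the degree of parallelism (schema and directory names are placeholders):

```text
# export_par.par -- hypothetical parameter file
directory=dpdir
dumpfile=scott%U.dmp   # %U expands so dump files can match parallel workers
schemas=scott
parallel=4
```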

query

Using the query parameter significantly increases the overhead of any basic Data Pump API operation, and this overhead is proportional to the amount of data in the tables being queried.
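For reference, a sketch of a filtered table export (table, directory, and filter are placeholders):

```text
# export_query.par -- hypothetical parameter file
directory=dpdir
dumpfile=emp_subset.dmp
tables=scott.emp
query=scott.emp:"WHERE deptno=10"
```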

remap_*

Using any remap_* parameter significantly increases the overhead of any basic Data Pump API operation, and this overhead is proportional to the amount of data in the tables being remapped.
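For reference, a sketch of an import that remaps both schema and tablespace (all names are placeholders):

```text
# import_remap.par -- hypothetical parameter file
directory=dpdir
dumpfile=scott.dmp
remap_schema=scott:hr
remap_tablespace=users:hr_data
```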

II. Database parameters that affect Data Pump performance

aq_tm_processes=0

When this parameter is explicitly set to 0, it may negatively affect advanced queue operations, which in turn slows the basic Data Pump operations that rely on advanced queues. Reset this parameter to its default (unset) or set it to a value greater than 0.

deferred_segment_creation=true

This applies only to import operations; it eliminates the time spent allocating space for empty tables. Setting this parameter has no significant impact on export performance. It is available in version 11.2.0.2 and later.

filesystemio_options=...

Under certain circumstances the database instance performs write operations to an ACFS file system. This parameter specifies the type of write operation the Data Pump API performs as part of an export. Any value other than NONE may slow down the export operation.

NLS_CHARACTERSET=... and NLS_NCHAR_CHARACTERSET=...

When these two parameters differ between the source and target databases, the import operation cannot use multiple Data Pump worker processes to create and populate a given partitioned table. In that case only one Data Pump worker process can operate on the table data, and it takes an exclusive lock on the table to prevent any other worker from operating on the same table. When no such exclusive lock is required, multiple Data Pump worker processes can operate on a partitioned table simultaneously, which significantly improves the performance of importing data into partitioned tables.

NLS_COMP=... and NLS_SORT=...

In some rare cases, setting these two parameters to binary can dramatically improve the speed of basic Data Pump API operations. It is important to test whether setting them to binary improves performance in your environment. They can be set at the session level at login time with the following logon trigger:

CREATE OR REPLACE TRIGGER sys.expdp_nls_session_settings AFTER LOGON ON DATABASE
DECLARE
V_MODULE VARCHAR2(60);
BEGIN
SELECT SYS_CONTEXT('USERENV', 'MODULE') INTO V_MODULE FROM DUAL;
IF UPPER(V_MODULE) LIKE 'UDE%'
THEN
BEGIN
EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_COMP=''BINARY''';
EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_SORT=''BINARY''';
END;
END IF;
END;
/

parallel_force_local=true

In a RAC environment this can dramatically improve the performance of basic Data Pump API operations and avoid bugs in parallel DML operations. This parameter can be used only in version 11.2.0.2 and later.

streams_pool_size

To avoid bug 17365043 'STREAMS AQ: ENQUEUE BLOCKED ON LOW MEMORY WHEN REDUCING STREAMS_POOL_SIZE', it is recommended that you set streams_pool_size to the value returned by the following query:

select 'ALTER SYSTEM SET STREAMS_POOL_SIZE='||(max(to_number(trim(c.ksppstvl)))+67108864)||' SCOPE=SPFILE;'
from sys.x$ksppi a, sys.x$ksppcv b, sys.x$ksppsv c
where a.indx = b.indx and a.indx = c.indx and lower(a.ksppinm) in ('__streams_pool_size','streams_pool_size');

_memory_broker_stat_interval=999

If resize operations consume a lot of time in your slow Data Pump environment, setting this parameter reduces the frequency of resize operations, which in turn reduces the time resize operations spend delaying other operations over a given time span. This matters because the Data Pump API relies heavily on streams functionality to support export and import operations. Setting this parameter to 999 is recommended if streams_pool_size has been explicitly set and resize operations still occur frequently.

III. Table DDL-level factors that affect Data Pump performance

network_link+securefiles

When moving a table that contains LOB columns stored as securefiles over a database link, the network_link parameter makes the move operation very slow and generates a large amount of undo data. The reason is that allocation requests in a distributed transaction are limited to one block at a time across database links, so transferring a large data set requires many more round trips.

securefiles (without network_link)

Using the securefiles storage format for LOB columns allows tables that contain LOB columns to be exported and imported with parallel execution.

Using the basicfiles storage format for LOB columns does not allow tables that contain LOB columns to be exported and imported with parallel execution.

IV. Table DML-level factors that affect Data Pump performance

Contention between a Data Pump operation and other sessions accessing the same database objects (typically locks on tables and rows of data)

The Data Pump engine waits for row and table locks held by other sessions to be released before performing the export or import on the affected tables, whereas the traditional export tool does not wait. Exporting a table that is being updated frequently is therefore slower than exporting a table that is not currently being updated.