(1) OGG logical architecture
Referring to the figure above, I will briefly introduce the logical architecture of OGG so that you can get a basic picture of its data synchronization process; the following sections demonstrate the configuration of the related processes in detail. Using OGG mainly involves the following processes and files:
Manager process: runs on both the source and the target; its main functions are to monitor and manage the other processes, report errors, allocate and clean up data storage space, and issue threshold reports.
Extract process: runs on the source database, captures data changes, and is responsible for extracting both the full and the incremental data.
Trail file: a data file temporarily stored on disk.
Data pump process: runs on the source database as an auxiliary to the extract process. If no data pump is configured, the extract process sends the extracted data directly to the trail files on the target side; if a data pump is configured, the extract process writes the data to local trail files and the data pump then sends them to the target. The main advantage of configuring a data pump is that the extract process will not be terminated even if the network between the source and the target is interrupted.
Collector process: receives the data changes transmitted from the source and writes them to the local trail files.
Replicat process: reads the data changes recorded in the trail files, builds the corresponding DML statements, and replays them on the target.
II. Migration plan
(1) Environment information
- OGG version: OGG 12.2.0.2.2 for Oracle (source); OGG 12.2.0.2 for MySQL (target)
- Database version: Oracle 11.2.0.4 (source); MySQL 5.7.21 (target)
- OGG_HOME: /home/oracle/ogg (source); /opt/ogg (target)
(2) Table structure migration
Migrating the table structure is not difficult, but the work is tedious. We used an open-source tool named sqlines for the table structure migration, and then manually handled the tables that sqlines failed to create on the MySQL side or that did not meet expectations, which improved the efficiency of the table structure conversion.
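As a hedged sketch of how such a conversion can be invoked (the input file name cms_schema.sql is hypothetical, and the option spelling follows the sqlines command-line documentation as I understand it, not our actual migration scripts):
#### Convert Oracle DDL exported to cms_schema.sql into MySQL dialect
shell> ./sqlines -s=oracle -t=mysql -in=cms_schema.sql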
Note: in an Oracle-to-MySQL migration, OGG does not support DDL synchronization, so try not to modify any table structure between the table structure migration and the database switchover.
(3) Data migration
All data synchronization is performed with the OGG tool. Given the relationship between the full and incremental loads, OGG must first start the incremental extract process to capture the database redo logs, begin the incremental replay only after the full extraction completes, and apply the log data generated during the full and incremental phases. OGG can handle duplicate data according to its parameters, so when using OGG, configure and enable the incremental processes first. To keep this chapter from getting too long, the OGG parameters are not explained here; readers who need them can look up any unfamiliar parameter in the official reference documentation.
1. OGG configuration on the source side
(1) Oracle database configuration
For an Oracle database, OGG requires archive mode to be enabled, supplemental logging to be added, and force logging to be turned on, so that OGG can capture complete log information.
Check whether the current environment meets these requirements; the output is shown in the following figure:
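A sketch of the corresponding checks and fixes, run in sqlplus as sysdba (the alter statements are needed only for settings that are still off, and switching to archivelog mode additionally requires restarting the instance into mount state):
#### Check archive mode, minimal supplemental logging, and force logging
sql> select log_mode, supplemental_log_data_min, force_logging from v$database;
#### Enable whichever setting is still missing (archivelog requires shutdown + startup mount first)
sql> alter database archivelog;
sql> alter database add supplemental log data;
sql> alter database force logging;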
(2) Oracle database OGG user creation
OGG requires a database user with privileges to operate on the related objects; the privileges involved are listed below. This example creates an Oracle database user named ogg, with the password ogg, and grants it those privileges.
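Since the exact privilege list appears only in the figure, the statements below are an illustrative minimal sketch rather than the project's real grant script; real deployments should follow the finer-grained privileges from the OGG installation guide:
sql> create user ogg identified by ogg default tablespace users;
sql> grant connect, resource to ogg;
sql> grant select any dictionary, select any table, flashback any table to ogg;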
(3) Source OGG manager process (MGR) configuration
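The concrete mgr configuration used in our environment is not reproduced here; a minimal sketch with commonly used settings (the port and retention values are illustrative) looks like this:
shell> cd $OGG_HOME
shell> ggsci
ggsci> edit params mgr
PORT 7809
DYNAMICPORTLIST 7810-7860
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3
PURGEOLDEXTRACTS ./dirdat/*, USECHECKPOINTS, MINKEEPDAYS 7
ggsci> start mgr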
(4) Configure source OGG table-level supplemental logging (trandata)
Table-level supplemental logging takes effect only when minimal supplemental logging is enabled. With only minimal supplemental logging (alter database add supplemental log data;), the information recorded in the redo log is not comprehensive enough; you must run add trandata to enable table-level supplemental logging in order to obtain the necessary information.
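A sketch of enabling it for the cms schema used throughout this article:
ggsci> dblogin userid ogg, password ogg
ggsci> add trandata cms.*
#### Verify that table-level supplemental logging is enabled
ggsci> info trandata cms.*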
(5) Source OGG extract process configuration
The extract process runs on the source side of the database and is responsible for capturing data from the source tables or the source logs. It uses a built-in checkpoint mechanism to periodically record its read and write positions, usually into local trail files. This mechanism ensures that if the extract process terminates or the operating system shuts down, GoldenGate can, after the extract process restarts, return to its previous state and continue from the last checkpoint without losing any data.
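A minimal sketch of such an incremental extract, reusing the two-character trail name ms seen later (the group name e_cms is an assumption):
ggsci> edit params e_cms
EXTRACT e_cms
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
USERID ogg@appdb, PASSWORD ogg
EXTTRAIL ./dirdat/ms
TABLE cms.*;
#### Register the group against the redo log and bind the local trail
ggsci> add extract e_cms, tranlog, begin now
ggsci> add exttrail ./dirdat/ms, extract e_cms, megabytes 1024
ggsci> start extract e_cms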
(6) Source OGG data pump process (pump) configuration
The pump process runs on the source side of the database, and its function is very simple: if the source extract process writes to local trail files, the pump process sends those trail files to the target in data blocks over the TCP/IP protocol. The pump process is essentially a special form of the extract process; if local trail files are not used, the extract process sends the data directly to the target after extraction.
Note: when the pump process starts, it needs to connect to the mgr process on the target side, so the target mgr must be configured in advance; otherwise the pump reports that the connection is refused, and the extracted log files cannot be transferred to the corresponding directory on the target.
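A matching pump sketch (the group name p_cms is an assumption; the host and remote trail reuse values from the examples below):
ggsci> edit params p_cms
EXTRACT p_cms
PASSTHRU
RMTHOST 17x.1x.84.121, MGRPORT 7809
RMTTRAIL /opt/ogg/dirdat/ms
TABLE cms.*;
ggsci> add extract p_cms, exttrailsource ./dirdat/ms
ggsci> add rmttrail /opt/ogg/dirdat/ms, extract p_cms, megabytes 1024
ggsci> start extract p_cms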
(7) Generate the source OGG heterogeneous mapping file (defgen)
This file records the table structure definitions of the tables to be replicated from the source database. After it is generated on the source side, it must be copied to the dirdef directory on the target side; the target replicat process reads it when applying the transmitted data to the target database. A homogeneous (same-type) database pair does not need this step.
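A sketch that produces the cms.def file referenced by the replicat configuration below (the parameter file name defgen.prm and the target host placeholder are assumptions):
ggsci> edit params defgen
DEFSFILE ./dirdef/cms.def
USERID ogg@appdb, PASSWORD ogg
TABLE cms.*;
#### Generate the definition file, then copy it to the target's dirdef directory
shell> ./defgen paramfile ./dirprm/defgen.prm
shell> scp ./dirdef/cms.def <target-host>:/opt/ogg/dirdef/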
2. OGG configuration on the target side
(1) MySQL database configuration on the target side
Confirm that the table structures already exist on the MySQL side.
MySQL database OGG user creation
mysql> create user 'ogg'@'%' identified by 'ogg';
mysql> grant all privileges on *.* to 'ogg'@'%';
#### Create a database for ogg in advance to store the checkpoint table
mysql> create database ogg;
(2) Target-side OGG manager process (MGR) configuration
The target-side MGR process is configured in the same way as the source side; simply repeat the source-side configuration on the target, so the details are not repeated here.
(3) Configure the OGG checkpoint table on the target side
The checkpoint table ensures that, after a transaction is executed, a table in the MySQL database records the current log replay position, much like MySQL replication records the GTID or binlog position.
#### Switch to the ogg software directory and run ggsci to enter the command-line terminal
shell> cd $OGG_HOME
shell> ggsci
ggsci> edit params ./GLOBALS
CHECKPOINTTABLE ogg.ggs_checkpoint
ggsci> dblogin sourcedb ogg@17x.1x.84.121:3306 userid ogg
ggsci> add checkpointtable ogg.ggs_checkpoint
(4) Target-side OGG replay process (replicat) configuration
The replicat process runs on the target side and is the last stop of the data delivery. It reads the contents of the trail files on the target, parses them into DML statements, and applies them to the target database.
#### Switch to the ogg software directory and run ggsci to enter the command-line terminal
shell> cd $OGG_HOME
shell> ggsci
#### Add a replay process, associate it with the trail files sent by the source pump process, and use the checkpoint table to ensure no data is lost
ggsci> add replicat r_cms, exttrail /opt/ogg/dirdat/ms, checkpointtable ogg.ggs_checkpoint
#### Add/edit the replay process parameter file
ggsci> edit params r_cms
REPLICAT r_cms
TARGETDB cms@17x.1x.84.121:3306, USERID ogg, PASSWORD ogg
SOURCEDEFS /opt/ogg/dirdef/cms.def
DISCARDFILE /opt/ogg/dirrpt/r_cms.dsc, APPEND, MEGABYTES 1024
HANDLECOLLISIONS
MAP cms.*, TARGET cms.*;
Note: the replicat process only needs to be configured at this point; it does not need to be started. It will be started after the full extraction is complete.
At this point, the configuration of the source and target environments is complete. Once the full data extraction has finished, start the target replay process to achieve quasi-real-time data synchronization.
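As a sketch, once the full extraction has finished, the replay process configured above can be started and checked like this:
ggsci> start replicat r_cms
ggsci> info all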
3. Full synchronization configuration
Full data synchronization is a one-time operation. After deploying the OGG software and configuring and starting the incremental extract process, you can configure a special extract process to pull the data out of the tables, save the extracted data into files on the target side, and have the target run a one-off (specialrun) replicat process to parse the files and replay the data into the target database.
(1) Source-side OGG full extract process configuration
#### Switch to the ogg software directory and run ggsci to enter the command-line terminal
shell> cd $OGG_HOME
shell> ggsci
#### Add/edit the parameter file of the full extract process
#### RMTFILE specifies that the extracted data is sent directly to the corresponding remote directory
#### Note: the file name given to RMTFILE supports at most two characters; a longer name will not be recognized by replicat
ggsci> edit params ei_cms
SOURCEISTABLE
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
SETENV (ORACLE_SID=cms)
SETENV (ORACLE_HOME=/data/oracle/11.2/db_1)
USERID ogg@appdb, PASSWORD ogg
RMTHOST 17x.1x.84.121, MGRPORT 7809
RMTFILE /opt/ogg/dirdat/ms, MAXFILES 100, MEGABYTES 1024, PURGE
TABLE cms.*;
#### Start the full extract process and check whether it is running normally
shell> nohup ./extract paramfile ./dirprm/ei_cms.prm reportfile ./dirrpt/ei_cms.rpt &
#### Check the log to confirm the full extraction completes normally
shell> tail -f ./dirrpt/ei_cms.rpt
(2) Target-side OGG full replay process (replicat) configuration
#### Switch to the ogg software directory and run ggsci to enter the command-line terminal
shell> cd $OGG_HOME
shell> ggsci
ggsci> edit params ri_cms
SPECIALRUN
END RUNTIME
TARGETDB cms@17x.1x.84.121:3306, USERID ogg, PASSWORD ogg
EXTFILE /opt/ogg/dirdat/ms
DISCARDFILE ./dirrpt/ri_cms.dsc, PURGE
MAP cms.*, TARGET cms.*;
#### Start the full replay process and check whether it is running normally
shell> nohup ./replicat paramfile ./dirprm/ri_cms.prm reportfile ./dirrpt/ri_cms.rpt &
#### Check the log to confirm the full replay completes normally
shell> tail -f ./dirrpt/ri_cms.rpt
III. Data verification
Data verification is an essential step in any data migration. This chapter offers several verification ideas and approaches; verification can be done from the following angles:
1. Check in the OGG logs whether the number of records discarded during the full and incremental phases is 0, to judge whether any data was lost;
2. Compare the row counts of the tables on the source and the target to judge whether the data volumes are consistent (see the sketch after this list);
3. Write a program similar in principle to pt-table-checksum to implement row-level consistency checking. The pros and cons of this method are obvious: it checks the data content completely and accurately, but it must traverse every row of data, so the checking cost is high;
4. A relatively balanced verification method is to prepare in advance, from the business perspective, dozens of SQL statements that return results quickly, and use them for sampling checks.
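As a minimal sketch of idea 2, assuming the business schema is named cms on both sides, count statements can be generated from each data dictionary, run, and compared side by side:
#### Oracle side: generate one count statement per table
sql> select 'select count(*) from ' || owner || '.' || table_name || ';' from dba_tables where owner = 'CMS';
#### MySQL side: generate the matching statements
mysql> select concat('select count(*) from cms.', table_name, ';') from information_schema.tables where table_schema = 'cms';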
IV. Migration problems and solutions
This chapter covers some problems encountered during the migration and the corresponding solutions.
(1) MySQL limitations
The main limitations encountered when migrating the table structure from Oracle to MySQL are the following:
1. The Oracle table structures were not strictly designed at the beginning, and a large number of columns use the varchar(4000) data type, which causes those tables to exceed the row-size limit after migration to MySQL, so they cannot be created. Because of MySQL's own data structure limits (a 16 KB data page must store at least two rows, and a single row cannot exceed 65,535 bytes), there are two ways to handle this situation:
- Shrink the over-long varchar columns according to the length of the data actually stored;
- For columns that cannot be shrunk, change the data type to text, although this may cause some performance problems in use;
2. Similar to the first point, in the InnoDB storage engine the index prefix length is limited to 767 bytes; if the dynamic or compressed row format is used and innodb_large_prefix is enabled, the limit becomes 3072 bytes. In other words, with the utf8mb4 character set, at most a varchar(768) column can be indexed in full.
3. When using OGG for full initialization synchronization, if foreign key constraints exist, the insertion order of the tables during the batch import is not fixed, so a child table may insert its data before the parent table, and the records the child table depends on will be missing, causing the load to fail. It is therefore recommended to disable the foreign key constraints during the data migration and re-enable them after it completes.
mysql> set global foreign_key_checks = off;
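Once the migration is complete and verified, the check can be switched back on:
mysql> set global foreign_key_checks = on;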
(2) Full and incremental data convergence
The HANDLECOLLISIONS parameter is the key to joining OGG full data with incremental data. The principle is to start the incremental extract process before the full extraction, capture the redo logs generated while the full load is applied, and start the incremental replay process after the full load completes, applying the incremental data produced during the full phase. With this parameter, the incremental replay of DML statements covers the following scenarios and processing logic:
- The target has no record matching a delete statement: the statement is ignored and not written to the discard file.
- The target has no record matching an update statement:
  - If the primary key value is being updated, the update is converted into an insert;
  - If the updated values are not the primary key, the statement is ignored and not written to the discard file.
- An insert duplicates a primary key value that already exists at the target: the replicat process converts it into an update of the row with that primary key value.
(3) OGG version selection
In choosing the OGG version, we changed versions several times to match the customer's scenario. At first, because the customer's Oracle database version was 11.2.0.4, we leaned toward the matching OGG release. In use, however, we found that every time a trail file generated by the extraction reached about 2 GB, OGG reported a connection interruption. The detailed description of the RMTFILE parameter shows that the default trail file size limit is 2 GB. We then switched to OGG 12.3 and used the MAXFILES parameter to generate multiple trail files of a specified size; during replay, the replicat process automatically reads the trail files in turn, which finally solved the problem. Note, however, that if the Oracle environment unfortunately runs on a Linux 5 system, the OGG version has to be lowered: the highest usable version is OGG 12.2.
(4) Handling tables without primary keys
Another difficulty in the migration was the large number of tables without primary keys on the Oracle side. A MySQL table without a primary key is almost never acceptable, because it easily causes performance problems and replication delay. It is also a hidden risk during the OGG migration: for a table without a primary key, OGG by default concatenates all columns of the row to act as a unique key, but duplicate rows may still exist, which can lead to data synchronization anomalies. Oracle has provided a solution to this problem: add a GUID column to each table without a primary key to serve as the unique identifier of the row; the specific steps can be found in the corresponding MOS document.
(5) OGG security rules
Error message
2019-03-08 06:15:22  ERROR  OGG-01201  Error reported by MGR : Access denied.
The error means that the extract process needs to communicate with the target mgr process but is rejected. Specifically, the source extract process contacts the target mgr to remotely start the target replicat process, and the connection is refused by the current security rules.
Cause of the error
Since Oracle OGG version 11, new security requirements were added: to remotely start the replicat process on the target side, an access control parameter must be added to the mgr node to allow remote invocation.
Solution
Add access control rules on the source and target mgr nodes respectively, then restart the mgr processes.
#### The rule allows (ALLOW) all program types (PROG *) from the 10.186 network segment (IPADDR) to connect
ACCESSRULE, PROG *, IPADDR 10.186.*.*, ALLOW
(6) Data extraction methods
Error message
2019-03-15 14:49:04  ERROR  OGG-01192  Trying to use RMTTASK on data types which may be written as LOB chunks (Table: 'UNIONPAYCMS.CMS_OT_CONTENT_RTF').
Cause of the error
According to the official documentation, the initial-load method that extracts data directly from the Oracle database and writes it into MySQL does not support LOB data types; the table UNIONPAYCMS.CMS_OT_CONTENT_RTF contains a CLOB column, so it cannot be transmitted. This method also does not support column data longer than 4 KB.
Solution
Change RMTTASK in the extract process to the RMTFILE parameter. The official recommendation is to extract the data into files first, and then perform the initial load by parsing the file data.
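As an illustrative before/after sketch of the extract parameter change (the RMTTASK line shows the standard direct-load form, with the ri_cms group from above; it is not our exact original file):
#### Before: direct initial load into the target, fails on LOB columns
RMTHOST 17x.1x.84.121, MGRPORT 7809
RMTTASK replicat, group ri_cms
#### After: extract to remote files first, then replay them with the specialrun replicat
RMTHOST 17x.1x.84.121, MGRPORT 7809
RMTFILE /opt/ogg/dirdat/ms, MAXFILES 100, MEGABYTES 1024, PURGE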