Data Migration Tool

Status 13.09.2023 OMN 5.12. For latest version see corresponding GIT repository omn-migration-tool

Data Migration Tool is a java application for data migrations which supports the following databases: - MSSQL - PostgreSQL - Oracle DB

User guide

The application folder must contain DataMigrationTool-1.0.jar (version may be different) and data-migration-tool-config.xml.

For the first run on the target database, make sure that migrated-tables.txt file does not exist (delete if it was created earlier). To restart the application, this file is needed to continue the migration process.

To build project and create a jar file, launch a terminal in the Data Migration Tool project directory and execute the following command:

mvn clean package

Configuration setup

Set up the data-migration-tool-config.xml file for connecting to databases and application configuration.

  1. \<source> - information about connection to the database where data should be received from.

    • jdbcConnectionString - connection url (required parameter)

    • schemaName - schema name in the database to be migrated (required parameter)

    • username - username in database

    • password - user password in database

  2. \<target> - information about connection to the database where data should be inserted to.

    • jdbcConnectionString - connection url (required parameter)

    • schemaName - schema name in the database to be migrated (required parameter)

    • username - username in database

    • password - user password in database

  3. \<excludes> - enumeration of tables and columns that shouldn’t be migrated.

  4. \<batchSize> - definition of the maximum number of elements for a single INSERT query. The default value is 10000 by benchmarking the migration application.

  5. \<threadsNum> - definition of the number of parallel threads for migration process. It is best to use the same number as the number of logical cores in the processor of the computer used for migration.

  6. \<sqlFileGenerationMode> - definition of a mode for sql files generation. Possible values are:

    • NONE - SQL file will not be generated;

    • SINGLE - single SQL file will be generated with all INSERT queries;

    • MULTIPLE - separate SQL files with INSERT queries will be generated for each table. Will be stored in queries folder.

Example:

<migrationConfig>
    <source>
        <jdbcConnectionString>jdbc:jtds:sqlserver://mssql_server_url:1433/MSSQL_DB;integratedSecurity=true</jdbcConnectionString>
        <schemaName>dbo</schemaName>
        <username>mssql</username>
        <password>mssql</password>
    </source>
    <target>
        <jdbcConnectionString>jdbc:postgresql://postgres_server_url:5432/POSTGRES_DB</jdbcConnectionString>
        <schemaName>public</schemaName>
        <username>postgres</username>
        <password>postgres</password>
    </target>
    <excludes>
        <excludeTable>V_TEXTUNITS</excludeTable>
        <excludeColumn table="texts">xdtd_id</excludeColumn>
        <excludeColumn table="v_elements">text_xdtd_filename</excludeColumn>
    </excludes>
    <batchSize>10000</batchSize>
    <threadsNum>10</threadsNum>
    <sqlFileGenerationMode>NONE</sqlFileGenerationMode>
</migrationConfig>

Running the migration process

Launch a terminal in the application folder and execute the following command:

java -jar .\DataMigrationTool-1.0.jar

The console and logs folder will display the progress of the migration process. The description of log files created during operation is described below in the current document.

In case of successful configuration and launch of the application, similar messages will be printed:

INFO   2022-11-25 12:07:40,348 main - Tool started
INFO   2022-11-25 12:07:40,524 main - Configuration was read successfully
INFO   2022-11-25 12:07:51,295 main - 370 tables were found for migration
INFO   2022-11-25 12:07:51,296 main - Migrating the table dbo.OBJECT_PRODUCTLANGS (1/370)...
INFO   2022-11-25 12:08:00,196 pool-1-thread-1 - 10000/30125667 records were migrated for the table 'OBJECT_PRODUCTLANGS'...

Description of files created during migration

  • DataMigrationTool-1.0.jar - java application to launch.

  • data-migration-tool-config.xml - configuration file.

  • migrated-tables.txt - list of fully migrated tables. Used when restarting the application to resume the migration process.

  • /logs/migration_Date_Time.log - Logs of the migration process.

  • /logs/excluded-tables_Date_Time.log - List of excluded tables during migration. The description of this list is described below in the current document.

  • /queries folder - folder with generated INSERT queries. Creation of folder and file with queries depends on option \<sqlFileGenerationMode> in the configuration.

  • /queries/data-migration-insert-all.sql - main file with query calls for all sql files for table.

  • /queries/table1.sql - named sql file with INSERT queries for the table.

Description of the list of excluded tables

At the beginning of the migration process, the application generates a list of excluded and skipped tables for migration to the target database. The list is sorted alphabetically for readability.

The list is made up of the following: - excluded table names in data-migration-tool-config.xml file, \<excludeTable> definitions; - previously migrated tables whose names were saved in migrated-tables.txt file; - missing tables in the target database that were found in the source database.

Information about triggers and foreign keys

During the migration process, triggers are disabled for each table in the target database, then INSERT queries are performed. After that, triggers for the table are enabled.

In case the data in tables has not yet been completely migrated, the triggers will remain enabled, but the validity checks in the database are not run until the triggers are called directly.

Information about errors during the migration process and restarting a stopped application

The application contains handling of known errors when working with data, files, or connection problems.

In case of connection problems with the source or target database, reconnection attempts will be launched. Every minute, for each running thread, an attempt will be made to reconnect to the database.

However, the application may be terminated due to some unknown error that occurred during operation. Or manually stopped the process in the console.

While the migration application is running, file migrated-tables.txt is filled with the names of successfully migrated tables. Each table has a separate transaction with INSERT queries to the database. Therefore, the data of the migrated tables is preserved during the migration process, and does not depend on the end of the application.

When the application is restarted, the tables from migrated-tables.txt file will be skipped for migration. And the migration process will run for all other tables.

Developer guide

The project uses Java version 8.

Packages structure:

de.apollon.datamigrator
  - dto
     - config
  - exception
  - service
     - configreader
        - impl
     - dataaccess
        - impl
     - migrationmemorizer
        - impl
     - report
        - impl
  - utils

de.apollon.datamigrator package contains Runner.class, which is the main Java file for the project (with main() method).

de.apollon.datamigrator.dto package contains all the objects for data transfer, e.g. objects with metadata of tables and columns and objects with configuration in the de.apollon.datamigrator.dto.config package.

de.apollon.datamigrator.exception package contains all the exceptions used in the project.

de.apollon.datamigrator.service package contains the packages of the following services: - ConfigReaderService - service for configuration reading. Provided implementation for XML reading. - DataAccessService - service for DB data manipulation. Provided implementations for PostgreSQL, MSSQL and Oracle DB (is not completed yet). - FileMigrationMemoriserService - service for saving and reading data about migrated tables. - ReportService - service for writing SQL queries into files. Provided implementations for writing into single file, writing into separate files and mock implementation (if query shouldn’t be written).

de.apollon.datamigrator.utils - package contains project’s utilities classes, e.g. ConfigurationContext.class and DataAccessContext.class that store specific objects to make them accessible from any place of the application.

Services structure

Each service package contains a service interface, a service factory and a package with implementations of the service. Service factory is a class with loadService() static method which returns a specific implementation of the service.

Welcome to the AI Chat!

Write a prompt to get started...