Transferring Large ETL Files

The XMDBTool Client and the XMDBTool background process are enhanced so that the Full ETL runs more efficiently and its output is easier to transfer. The Full ETL is run only for some customers, whose ETL files tend to be very large, that is, more than 45 GB. These large files take a long time (sometimes resulting in time-outs) either to download to the client machine (using the XMDBTool client) or to transfer to the SFTP server (using XMDBToolbkg). The solution is to split the large files before the actual transfer. Infor Expense Management 10.1 supports these processes:

  • Transferring large files using XMDBTool ETL background process
  • Transferring large files using XMDBTool Client

Transferring Large Files using XMDBTool ETL Background Process

The general bkg parameters to run the full ETL are:

  • c=RUN_ETL
  • toFile=fulletl.txt
  • deploy_tables=Y
  • mapping_classes=USER_HIERARCHY

These parameters generate an ETL data file, fulletl.txt, that might be very large (around 45 GB or more). A new parameter, splitETLFile, is added. If this parameter is not specified, there is no change in the existing functionality. If the parameter is specified as splitETLFile=true, the ETL file is split into a number of smaller files, one for each table. For example, if the total number of tables involved in the ETL is 250, then 250 smaller ETL files are generated, where each file contains the SQL commands (drop, create, insert, and so on) for one table. The naming convention for each file is <sequence number>-<table name>-<etl file name>, where the sequence number starts from 101 and increments, the table name is the name of the table, and the etl file name is the ETL file name that is specified.

For example, if you run the bkg with these parameters:

  • c=RUN_ETL
  • toFile=fulletl.txt
  • deploy_tables=Y
  • mapping_classes=USER_HIERARCHY
  • sftp=true
  • splitETLFile=true

Numerous small ETL files like these are created:

  • 101-ADDRESS-fulletl.txt
  • 102-ALLOCATION-fulletl.txt
  • 103-BUSINESS_RULE_VIOLATION-fulletl.txt, and so on

Each of these files contains the related SQL commands (drop, create, insert, and so on). After all the files are generated, each file is transferred over SFTP to the client's SFTP server.

Another significant parameter is fulletlrerun. If this parameter is set to true and errors were generated during an earlier file transfer, the ETL process continues from the point where it had stopped, transferring the remaining files to the SFTP server. When this parameter is set, the process first determines whether the full ETL has already had a complete run. If this parameter is not set to true, the entire ETL process is run again.

For example:

  • c=RUN_ETL
  • toFile=fulletl.txt
  • deploy_tables=Y
  • mapping_classes=USER_HIERARCHY
  • sftp=true
  • splitETLFile=true
  • fulletlrerun=true
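
For reference, a minimal sketch of how these bkg parameters might be passed from the XMDBTool client's run_bkg batch file, following the -bkgName/-params syntax shown later in this document; the comma used here to join multiple key=value pairs is an assumption and is not confirmed by this document:

run_bkg -bkgName XMDBToolETL.Bkg -params c=RUN_ETL,toFile=fulletl.txt,deploy_tables=Y,mapping_classes=USER_HIERARCHY,sftp=true,splitETLFile=true,fulletlrerun=true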

Transferring Large Files using XMDBTool Client

To download all the ETL files, you must run this command from the XMDBTool client:

C:\xmdbtool\bin>run_etl.bat -deploy_tables Y -mapping_classes USER_HIERARCHY -toFile C:\etl.sql

This command downloads all the ETL files from the server and generates the etl.sql ETL file. There can be a time-out issue if the file size is large. To solve this issue, the XMDBTool Client is enhanced with the etlChunk parameter. This parameter splits the large ETL file on the server into small chunks. Each chunk is downloaded, and a single ETL file is generated on the client machine. You must set the etlChunk parameter to true:

C:\xmdbtool\bin>run_etl.bat -deploy_tables Y -mapping_classes USER_HIERARCHY -toFile C:\etl.sql -etlChunk true

With this enhancement, the ETL runs and the ETL file is generated on the server, as before. After the complete ETL file is generated on the server, the large file is split into many small files (chunks), and each of these small files is zipped and downloaded to the client machine. After all the smaller files are downloaded, they are merged into a single ETL file that is named as specified in the command (based on the example above, the file name is etl.sql). As part of the cleanup, all the smaller files are deleted from the client machine after they are merged.

The etlChunkSize parameter sets the number of lines that must be included in each chunk of the file. By default, the parameter is set to 10000, but you can modify the value as required.

For example:

run_etl.bat -deploy_tables Y -mapping_classes USER_HIERARCHY -toFile C:\Nitesh\test1.sql -etlChunkSize 200000

This command sets the number of lines in each chunk of the file to 200000.

If the fulletlrerun parameter is set to true, the ETL run resumes from the point where the process had stopped earlier:

run_etl.bat -deploy_tables Y -mapping_classes USER_HIERARCHY -toFile C:\etl.sql -fulletlrerun true -etlChunk true

If an error occurs on the server while the chunks of the large file are being created, you can run the full ETL process again with the fulletlrerun=true option. This command restarts the process of creating the chunks, instead of running the full ETL again.

If the chunked ETL files are not required to be merged, set the value of the dontMerge parameter to true in the xmdbtool.properties file. By default, the value of the dontMerge application parameter is false.
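
For reference, a minimal sketch of the corresponding entry, assuming the standard key=value format of the xmdbtool.properties file:

dontMerge=true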

Note:  From version 10.2.15, the ETL bkg can be initiated from the XMDBTool client and the ETL file can be generated in the xmshare folder. By default, the ETL file is generated in the standard folder (apache-tomcat-9.0.x/ETL/<tenant>) when started from the client or the ETL bkg. You can now pass the xmshare parameter from the client or the ETL bkg and set the value to true. The ETL file is then generated in the xmshare folder. By default, the location of xmshare is ../../xmshare. However, you can specify the location of xmshare in the xmdbtool.xmshareloc application parameter.
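
For example, a sketch of the override, assuming xmdbtool.xmshareloc is set in the same xmdbtool.properties file as the other application parameters; the path shown is only a placeholder:

xmdbtool.xmshareloc=D:/xmshare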

The ETL bkg can now be triggered from the XMDBTool client's run_bkg batch file by specifying xmshare=true in the params parameter. The ETL runs on the bkg server because the ETL bkg process is called, and after the ETL run, the file is generated in the xmshare folder, from where the SFTP transfer is done to the specified SFTP address. For example:

run_bkg -bkgName XMDBToolETL.Bkg -params xmshare=true

Also, from version 10.2.15, there is one individual ETL log file for each run.