Archiving to Data Lake
In addition to archiving LN data to an archive company, data can be archived to
Data Lake. For each archive session, you can indicate whether the archived data must be
sent to Data Lake or to an archive company. To avoid sending many small files to Data Lake, the
archive data is first collected in an archive-to Data Lake company before it is exported
to Data Lake.
Archiving to Data Lake can have these flows:
- Source Company > Archive-to Data Lake Company > Data Lake
- Source Company > Archive Company > Archive-to Data Lake Company > Data Lake
To use the functionality of archiving to Data Lake:
- In the Archiving Parameters (ttdpm6100m000) session, select the Archiving to Data Lake Enabled check box. Also specify the archiving interval.
- In the Implemented Software Components (tccom0500m000) and General Company Data (tccom0102s000) sessions, specify an archive-to Data Lake company and specify the relationships between the various companies that are used in archiving to Data Lake. The relationships can also be viewed in the Archiving Companies (ttdpm6510m000) session.
- In the Settings group of the Archive General Data (tccom0250m000) session, specify the archive-to (Data Lake) company to perform a one-time initial load of general data to the specified company. This ensures that the general data is the same across all companies (source company, archive company, Data Lake archive company).
- In the Archive Workbench (tccom0650m000) session, set the Archive To field to Data Lake Company for the sessions that must be archived to Data Lake and run the archive sessions.
- Start the Archive Engine, either manually in the Archive Engine Startup (ttdpm6260m000) session, or automatically by creating a (periodical) job.
To create a periodical job:
- In the Archive Engine Startup (ttdpm6260m000) session, select Add to Job... from the gear icon menu.
- In the Add Session to Job (ttaad5102s000) session, zoom on the Job field.
- In the Job Data (ttaad5500m000) session, add a new job by clicking New.
- In the Job Data (ttaad5100s000) session, specify these fields:
- Job: The name of the job.
- Description: The description of the job.
- Periodical: Select this check box.
- Period: Specify a period in days or weeks.
- Planned Execution Date/Time: Schedule the job at a quiet time, that is, outside business hours and outside times when other jobs run. Also consider the scheduled monthly maintenance window.
- Maximum Duration: Do not specify a maximum duration; the job should not be interrupted because a maximum duration is reached.
- Save the changes and exit. The new job is defined in company 0.
- Select the new job. Back in the Add Session to Job (ttaad5102s000) session, set the Action on error field to Continue and click OK. The Archive Engine Startup (ttdpm6260m000) session is now added to the new job.
- Close the Archive Engine Startup (ttdpm6260m000) session.
- Switch to company 0 in the Change Company (ttdsk2007m000) session.
- In the Job Data (ttaad5500m000) session, mark the job and click Queue Job on the Actions menu. The job is now queued and will be run when the planned execution date/time is due.
Note: Data from an object group is only published to Data Lake if the last successful run is longer ago than the specified archiving interval. If nothing is archived to Data Lake, for example because the archiving interval is not met, a message is logged.
- Use the Archive Data Object Monitor (ttdpm6160m000) session to monitor the progress of the archiving to Data Lake by the Archive Engine and to retry the archiving in case of errors.
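The archiving-interval rule in the note above can be sketched as follows. This is an illustrative sketch only, not LN code; the function name and the use of days as the interval unit are assumptions for the example.

```python
from datetime import datetime, timedelta

def should_publish(last_successful_run, interval_days, now=None):
    """Illustrative sketch: an object group is published to Data Lake
    only if the last successful run is longer ago than the configured
    archiving interval."""
    now = now or datetime.now()
    if last_successful_run is None:
        return True  # no successful run yet, so publish
    return now - last_successful_run > timedelta(days=interval_days)

# Last run 10 days ago with a 7-day interval: publish.
print(should_publish(datetime.now() - timedelta(days=10), 7))  # True
# Last run 3 days ago with a 7-day interval: skip (a message is logged).
print(should_publish(datetime.now() - timedelta(days=3), 7))   # False
```

If the interval is not met, the run publishes nothing and only logs a message, as described in the note.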
Archive data can have these statuses:
- Initial: The archive window is started and the archive run is identified, but work has not yet started.
- Creating: The data objects to be sent to Data Lake are created.
- Sending: The created data objects are sent to Data Lake.
- Reconciling: The data objects have been sent and reconciliation is performed to verify that the data is stored in Data Lake. You can view the reconciliation status in the Data Reconciliation (ttdpm5550m000) session.
- Deleting: The data is stored in Data Lake; data and data objects are deleted from the archive-to Data Lake company, except for the general master data.
- Finished: The archiving run is finished; the data has been cleaned.
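The statuses above form a fixed progression, which can be sketched as a simple sequence. The status names mirror the list above; the code itself is an illustration, not part of LN.

```python
# Illustrative only: archive-run statuses in their order of progression.
STATUSES = ["Initial", "Creating", "Sending", "Reconciling", "Deleting", "Finished"]

def next_status(current):
    """Return the status that follows `current`, or None when the run is finished."""
    i = STATUSES.index(current)
    return STATUSES[i + 1] if i + 1 < len(STATUSES) else None

print(next_status("Sending"))   # Reconciling
print(next_status("Finished"))  # None
```

On errors, a run does not advance to the next status; instead, you retry the archiving from the Archive Data Object Monitor (ttdpm6160m000) session.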
In the LN menu, navigate to Tools > Integration Tools > Data Lake Archiving to view the various sessions that are related to this functionality.