Data Publishing Parameters (ttdpm5130m000)

Use this session to define the parameters for publishing to Data Lake or Infor OS.

For details on how to configure publishing in a hybrid deployment, see the Infor LN Enterprise Server Library (Secured) (On-premises) and select Administrator > Enterprise Server > Configuration with Infor OS (Cloud) > Configuring LN for publishing to Data Lake.

Note: If the micro-batching parameters are changed, the data publishers must be restarted for the changes to take effect.

Buttons

These buttons are available:

Check IMS Connection
Checks if publishing to IMS with the current connection parameters is possible.

This button is only valid for the IMS Connection parameters.

Check Catalog Connection
Checks if publishing to Data Catalog with the current connection parameters is possible.

This button is only valid for the Data Catalog Connection parameters.

Check Ingestion Connection
Checks if publishing to Data Lake with the current connection parameters is possible.

This button is only valid for the Batch Ingestion Connection and Streaming Ingestion Connection parameters.

Field Information

Publication Method for Initial Load

You can select one of these methods:

  • Data Lake Ingestion

    The messages are published directly to Data Lake through the Data Lake Batch Ingestion API and not through ION. Therefore, you are not required to model a data flow in ION.

    If this option is selected, the Message Size (in MB) field in the Publish Data (ttdpm5205m000) session is automatically increased to 50 MB. This is because the Data Lake Batch Ingestion API can handle larger messages than IMS.

  • ION Messaging Service

    Data is published from LN to ION. You must model document flows in ION to distribute the data to its final destination.

    If this option is selected, the Message Size (in MB) field in the Publish Data (ttdpm5205m000) session is limited to 5 MB, because this is the maximum file size allowed by IMS.

Publication Method for Changes

You can select one of these methods:

  • Data Lake Ingestion

    The messages are published directly to Data Lake using the Data Fabric Streaming Ingestion API and not through ION. Therefore, you are not required to model a data flow in ION.

    Also, because the Streaming Ingestion API uses its own batching mechanism within the Data Fabric architecture, micro-batching is not used in LN and the data is sent to the API without delay.

    For information about the arrival time window and other limits defined by the Data Fabric Streaming Ingestion, see the Infor OS User and Administration Documentation Library (Cloud) and select User > Data Fabric > Sending data to Data Lake > Streaming Ingestion > Micro-batching of streamed data.

  • ION Messaging Service

    Data is published from LN to ION. You must model document flows in ION to distribute the data to its final destination.

    The changes are not published immediately, but they are batched into larger messages during a user-defined period of time before publishing. This is called micro-batching. The period during which the changes are batched and the message size are specified in the Micro-Batching Parameters in the current session.

Note: 
  • The available options for the Publication Method for Initial Load and the Publication Method for Changes are interdependent. You cannot select Data Lake Ingestion for changes if the publication method for the initial load is ION Messaging Service.
  • The preferred way of publishing the initial and incremental load to Data Lake is through the Data Lake Ingestion APIs.
  • The ION Messaging Service remains available for LN data publishing for the time being but will be phased out in the future.
  • If the LN data must be replicated to a location other than Data Lake, the data must first be sent to Data Lake, from where it can be distributed to other locations.
  • For Cloud environments, the connection point parameters are not displayed. These parameters are defined on system resources at the Landlord level.

Micro-Batching Parameters

Micro-batching is used to reduce data fragmentation and optimize storage in Data Lake, which helps enhance system performance when retrieving data objects or querying Data Lake using Compass queries.

Micro-batching is done by grouping data into batches by table. The batch size is determined by the time interval specified in the Maximum Delay field and the message size specified in the Minimum Message Size field.

A JSON message containing LN data is published if the maximum delay time has passed or the minimum message size has been reached.
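
The publish decision can be summarized as follows. This is a minimal sketch in Python, for illustration only; the names batch, created_at, size_bytes, max_delay_minutes, and min_message_size_mb are illustrative and are not LN identifiers:

  import time

  # Illustrative only: one micro-batch is kept per table and is published
  # when either threshold is reached, whichever condition is met first.
  def should_publish(batch, max_delay_minutes, min_message_size_mb):
      # Maximum Delay: time since the batch was started
      delay_expired = time.time() - batch.created_at >= max_delay_minutes * 60
      # Minimum Message Size: accumulated size of the batched JSON data
      size_reached = batch.size_bytes >= min_message_size_mb * 1024 * 1024
      return delay_expired or size_reached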

Micro-batching is mandatory for the publication of changes, to prevent excessive data fragmentation and to provide better performance in Data Lake.

The initial load process automatically groups all data belonging to one table into a size-limited message. Therefore, micro-batching is not applicable to either of the publication methods for the initial load.

Is Enabled

If this check box is selected, JSON messages containing LN data are not generated and sent per transaction. Instead, the transactions are grouped by table into larger messages.

If this check box is cleared, micro-batching is not enabled. This is the case when changes are published using the Data Fabric Streaming Ingestion API.

Note: If the micro-batching parameters are changed, the data publishers must be restarted for the changes to take effect.

Maximum Delay

The time interval during which data is published. The time interval is expressed in minutes. The recommended interval is 10 minutes. For example, if you specify 10 minutes, a message containing data is published once every ten minutes, unless the minimum message size has been reached before the time interval has passed.

A message is generated and sent when the maximum delay time has passed or when the minimum message size has been reached, whichever of these conditions is met first.

Minimum Message Size

The minimum message size is expressed in MB. The recommended message size is 5 MB. A message is published when the data volume reaches the specified message size, unless the maximum delay time has passed before the minimum message size is reached.

A message is generated and sent when the maximum delay time has passed or when the minimum message size has been reached, whichever of these conditions is met first.

If the ION Messaging Service is the publication method for changes, the standard message size is set to 5 MB and this field is disabled.

Number of Transformers

The number of transformers per publisher bshell. The transformers combine the data from individual transactions into micro-batched data, including the determination of the calculated field values.

Number of Micro-Batch Publishers

The number of micro-batch publishers per publisher bshell. The micro-batch publishers publish the messages from LN to IMS or Data Lake.

Periodical Reset

The frequency at which data with the Error status is reset. After the reset, publishing is retried. The recommended frequency is 30 minutes.

Data Message Settings

The data message settings are used to add additional data to the JSON messages that are published to Data Lake. This additional data supports Process Mining in Data Lake.

Enable previous value to be included

If this check box is selected, you can use the Configure Fields for Publishing (ttdpm5120m000) session to specify the fields for which both the new and the previous value are to be published.

A previous value is represented by this key-value pair: "previous_[field code]":"[previous value]".
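
For example, assuming a hypothetical field code dsca whose value changed from "Central warehouse" to "Standard warehouse", the published JSON message would contain this fragment:

  "dsca":"Standard warehouse",
  "previous_dsca":"Central warehouse"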

To start the Configure Fields for Publishing (ttdpm5120m000) session, complete these steps:

  1. Navigate to the Data Sets to Publish (ttdpm5105m000) session.
  2. Open the details of the relevant data set.
  3. Select a table in the satellite of the Data Set to Publish (ttdpm5105m100) session.
  4. From the appropriate menu, select Configure Fields.

Include insert flag

If this check box is selected, each JSON message includes an additional key-value pair that shows the nature of the related transaction.

The key-value pair for this information is: "inserted":[true|false], where true = new record, and false = updated record. If the message contains "inserted":false,"deleted":true, this means that the record was not updated, but deleted.
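
For example, these illustrative fragments show the three cases:

  "inserted":true                        (new record)
  "inserted":false                       (updated record)
  "inserted":false,"deleted":true        (deleted record)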

Include session code

If this check box is selected, each JSON message includes an additional key-value pair that shows the session that was used to commit the related transaction. The key-value pair for this information is: "session_code":"[session code]".
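
For example, for a transaction committed from an illustrative session code tdsls4100m000, the JSON message would contain this fragment:

  "session_code":"tdsls4100m000"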

Note: 

After you change any of these check boxes, this prompt is displayed:

“You have changed the Data Message Settings. This will trigger a republish of all meta data to Data Catalog. Do you want to continue?”

After you click Yes to close the prompt and then click OK in the upper left of the session, this message is displayed:

“Please Convert to Runtime in the Data Sets to Publish (ttdpm5105m000) session to activate the changes made to the Data Message Settings.”

Converting to runtime is required for the changes to take effect. This process also publishes a new object schema to Data Catalog, because the new key-value pairs must be included in the JSON structure. This affects Compass queries, and you may need to run a stored procedure in Data Fabric to notify Compass of these new fields.

Staging of Calculated Fields

Staging of calculated fields is used to specify whether and how the values of calculated fields are staged (stored) when the data is written to the database. At the time of publication, the staged values are published. If values are not staged, they are determined upon publication.

If the values of calculated fields are determined upon publication, these values cannot be determined correctly if the data used for them has been changed or deleted.

Select one of these staging options:

  • Never Stage

    The values of calculated fields are never staged. They are determined upon publishing.

  • Stage at Delete

    The values of calculated fields are staged for records that are deleted. The staged values are published for deleted records. For records that are inserted or updated, the values of calculated fields are determined upon publishing.

  • Always Stage

    The values of calculated fields are always staged. The staged values are published.

Note: 
  • The selected option applies to all calculated fields. You cannot specify a different staging option for individual calculated fields.
  • If the values of calculated fields are determined upon staging, the processes that generate the data may be slower.
  • For the performance of LN Analytics, it is recommended to select Stage at Delete.