Data object creation

Data objects are typically created either by a source system or by Data Lake.

Data object creation by a source system

A data publisher provides data replication modules that accumulate transactions from a specific period of time. The transactions are stored within a data object, which is later published to Data Lake. When the data is ready to be replicated to Data Lake, as determined by, for example, a schedule or an optimal data object size, the data publisher can use the Data Lake Flows in ION Connect or Data Lake's Batch API for a direct upload to Data Lake. The Channel property in Atlas and the storage APIs indicate the ingestion origin.

Data object creation by Data Lake

When a data publisher streams transactional events in real time with Data Fabric's Streaming Ingestion API, the transactions are accumulated and then micro-batched by Data Lake. Data Lake generates a data object when 5 MB of data has accumulated or every 10 minutes, whichever happens first. Until then, the transactions are not available in Data Lake, but they can be processed immediately, in real time, with Stream Pipelines.
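
The "5 MB or 10 minutes, whichever happens first" rule can be illustrated with a minimal sketch. This is not Data Lake's implementation: the MicroBatcher class, its method names, and the constants are hypothetical, 1 MB is assumed to be 1024 * 1024 bytes, and the timer is assumed to start when the first event of a batch is buffered.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Thresholds taken from the text; the byte interpretation of "MB" is an assumption.
MAX_BATCH_BYTES = 5 * 1024 * 1024
MAX_BATCH_AGE_SECONDS = 10 * 60


@dataclass
class MicroBatcher:
    """Hypothetical accumulator illustrating a size-or-time flush rule."""

    max_bytes: int = MAX_BATCH_BYTES
    max_age: float = MAX_BATCH_AGE_SECONDS
    _events: List[bytes] = field(default_factory=list, init=False)
    _size: int = field(default=0, init=False)
    _opened_at: Optional[float] = field(default=None, init=False)

    def add(self, event: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer one event; return a finished batch if a threshold is hit."""
        if self._opened_at is None:
            # Assumption: the timer starts with the first buffered event.
            self._opened_at = now
        self._events.append(event)
        self._size += len(event)
        if self._size >= self.max_bytes:
            return self._flush()
        return self.poll(now)

    def poll(self, now: float) -> Optional[List[bytes]]:
        """Return the pending batch if the time threshold has elapsed."""
        if self._events and now - self._opened_at >= self.max_age:
            return self._flush()
        return None

    def _flush(self) -> List[bytes]:
        """Hand over the buffered events and reset the accumulator."""
        batch = self._events
        self._events, self._size, self._opened_at = [], 0, None
        return batch
```

In this sketch, a caller passes each incoming event to add() and calls poll() periodically so that a quiet stream still produces a batch once the time threshold passes. Each returned batch stands in for one generated data object.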