Data object creation
Data objects are typically created by a source system or Data Lake.
Data object creation by a source system
A data publisher provides data replication modules that accumulate transactions over a specific period of time. The transactions are stored in a data object, which is later published to Data Lake. When the data is due for replication to Data Lake, as determined by, for example, a schedule or an optimal data object size, the data publisher can upload it directly by using the Data Lake Flows in ION Connect or Data Lake's Batch API. The Channel property in Atlas and the storage APIs provide information on the ingestion origin.
Data object creation by Data Lake
When a data publisher streams transactional events in real time with Data Fabric's Streaming Ingestion API, Data Lake accumulates the transactions and micro-batches them. Data Lake generates a data object once 5 MB of data has accumulated or every 10 minutes, whichever happens first. Until then, the transactions are not available in Data Lake, but they can be processed immediately, in real time, with Stream Pipelines.
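The "size or time, whichever happens first" rule can be illustrated with a minimal Python sketch. This is not Data Lake's implementation: the MicroBatcher class, its method names, and the newline-joined payload are hypothetical. It also assumes that 5 MB means 5 * 1024 * 1024 bytes and that the 10-minute timer starts with the first event of each batch, neither of which is stated above.

```python
import time

# Thresholds taken from the text; the exact byte count for "5 MB" is an assumption.
MAX_BATCH_BYTES = 5 * 1024 * 1024
MAX_BATCH_AGE_SECONDS = 10 * 60


class MicroBatcher:
    """Hypothetical micro-batcher: flushes on size or age, whichever comes first."""

    def __init__(self, max_bytes=MAX_BATCH_BYTES,
                 max_age=MAX_BATCH_AGE_SECONDS, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age = max_age
        self.clock = clock          # injectable so the timer can be tested
        self._events = []           # events waiting in the current batch
        self._size = 0              # total bytes in the current batch
        self._started = None        # time the current batch received its first event
        self.data_objects = []      # stand-in for data objects written to storage

    def add(self, event: bytes) -> None:
        # Flush an older batch first if it has already aged out.
        self._flush_if_due()
        if self._started is None:
            self._started = self.clock()
        self._events.append(event)
        self._size += len(event)
        # Flush again in case this event pushed the batch over the size limit.
        self._flush_if_due()

    def tick(self) -> None:
        """Call periodically so the time limit applies even when no events arrive."""
        self._flush_if_due()

    def _flush_if_due(self) -> None:
        if not self._events:
            return
        too_big = self._size >= self.max_bytes
        too_old = self.clock() - self._started >= self.max_age
        if too_big or too_old:
            # Assumed payload format: events joined by newlines.
            self.data_objects.append(b"\n".join(self._events))
            self._events = []
            self._size = 0
            self._started = None
```

In this sketch, events that are still waiting in the batcher correspond to transactions that are not yet available in Data Lake. A real-time consumer, such as Stream Pipelines in the text above, would read the events before this batching step.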