Choosing appropriate methods of ingestion

To select the ingestion method that meets your solution's requirements, consider these options:

  • Adopt the Batch Ingestion API for data integration use cases that require the direct ingestion capabilities of Data Fabric. The Batch Ingestion API enables the direct transfer of large files and archived data. Alternatively, you can use the Streaming Ingestion method, which publishes real-time data from the source. A minimal sketch of a batch upload follows this list.
  • If, for example, your application is BOD-capable or you are building an integration footprint into your solution, adopt ION as the integration method for sending data to Data Lake.

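For orientation, this is a minimal sketch of a batch upload over REST in Python, assuming an OAuth 2.0 bearer token. The endpoint URL, the X-Data-Object-Name header, and the upload_batch helper are illustrative placeholders, not the actual Batch Ingestion API contract; check the Data Fabric API documentation for the real endpoint, required metadata, and authentication flow.

  import requests

  # Placeholder values: the real ingestion endpoint, object naming conventions,
  # and OAuth 2.0 token retrieval are defined by your tenant's Data Fabric setup.
  INGESTION_URL = "https://<tenant>.example.com/datafabric/batch-ingestion/v1/objects"
  ACCESS_TOKEN = "<OAuth 2.0 bearer token issued for your tenant>"

  def upload_batch(file_path: str, object_name: str) -> None:
      """Send one batch file, such as an initial load or archived data, as a data object."""
      with open(file_path, "rb") as payload:
          response = requests.post(
              INGESTION_URL,
              headers={
                  "Authorization": f"Bearer {ACCESS_TOKEN}",
                  # Hypothetical header that names the target data object.
                  "X-Data-Object-Name": object_name,
              },
              data=payload,  # streams the file contents in the request body
              timeout=60,
          )
      # Fail loudly if the service rejects the request, for example on size or schema errors.
      response.raise_for_status()

  if __name__ == "__main__":
      upload_batch("sales_orders_initial_load.ndjson", "SalesOrder")

Sending data in larger batches such as this, rather than as many small calls, keeps the number of API invocations low and suits on-request, initial-load, and scheduled publishing.
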
These use cases help you choose the appropriate method of ingestion:

Batch Ingestion
Use this method in these cases:
  • Your data source is an application that can use the REST API to send data objects.
  • The data to send is created or read in batches. For example:
    • Publishing data on request
    • Publishing an initial load
    • Publishing batch-based business workstreams, such as MRP
  • The data to send consists of archived data objects.

Streaming Ingestion
Use this method in these cases:
  • Your data source is an application that can use WebSocket technology (a minimal streaming sketch follows this list of methods).
  • There is a continuous stream of changes.
  • There is a high volume of data that would cause too many API invocations.
  • There are near-real-time requirements for operational reporting with Stream Pipelines.

ION Data Lake Flows
Use this method in these cases:
  • Your data source is a connection point in ION.
  • Your data must be transformed before it is ingested into Data Lake, for example, with ION Scripting or enrichment activities within Data Flows.
  • Multiple application subscribers require the published replication data in addition to Data Lake.

ION Data Loader
Use this method when you migrate on-premises data that is stored in database tables, views, and materialized views to Data Lake.

Atlas Upload widget
Use this method in these cases:
  • You need a one-time upload of data that is not published or retrieved directly from an application.
  • You need to upload data for testing.
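
For comparison, this is a minimal sketch of streaming changes over a WebSocket connection, using the third-party Python websockets package. The endpoint URL, the message envelope, and the stream_changes and example_feed helpers are illustrative placeholders; the actual connection details, authentication, and payload format are defined by the Data Fabric streaming documentation.

  import asyncio
  import json

  import websockets  # third-party package: pip install websockets

  # Placeholder value: the real streaming endpoint is defined by your tenant's
  # Data Fabric configuration. Authentication (for example, a bearer token) is
  # omitted here because its mechanism depends on the service.
  STREAMING_URL = "wss://<tenant>.example.com/datafabric/streaming-ingestion/v1"

  async def stream_changes(change_feed):
      """Publish a continuous feed of change records over one WebSocket connection."""
      async with websockets.connect(STREAMING_URL) as connection:
          async for record in change_feed:
              # Each change record is sent as one message; the required envelope
              # (object name, timestamps, and so on) depends on the service.
              await connection.send(json.dumps(record))

  async def example_feed():
      """Hypothetical change feed that yields a few records for demonstration."""
      for item_id in range(3):
          yield {"object": "InventoryCount", "id": item_id, "quantity": 100 + item_id}
          await asyncio.sleep(1)  # simulate changes arriving over time

  if __name__ == "__main__":
      asyncio.run(stream_changes(example_feed()))

Holding one connection open and sending each change as it occurs avoids the per-request overhead of repeated API calls when changes arrive continuously.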
Caution: 
Do not send the same data through different ingestion methods in parallel. This can cause data duplication in Data Lake.