Data Lake flows

A Data Lake flow is a sequence of activities that results in sending data into Data Lake, or a sequence of activities that starts with retrieving data from Data Lake.

In a "simple" Data Lake flow, two connection points are involved. One Data Lake connection point sends a specific type of documents, and the other connection point receives those documents. Or a connection point sends a specific type of documents, and the Data Lake connection point receives those documents. Data Lake flows can be more complex.

Some activities cannot be modeled in a Data Lake flow.

See Creating and using data flows.

Asynchronous activities cannot be modeled in the middle of a Data Lake flow. Specifically, this applies to these connection points:

  • Application (IMS)
  • Application (in-box/outbox)
  • LN
  • CRM Business Extension
  • File
  • API (Send, Read)
  • Database (Read, Send, Request/Reply)
  • Message Queue (JMS)

In addition to connection points, a Data Lake flow can contain other items, such as:

  • Mappings that are used to translate a document to another format.
  • Scripts that are used to execute custom Python code on a document input; see the sketch after this list.
  • Parallel flows that are used when sending documents from multiple connection points to Data Lake.
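
A script typically enriches or adjusts a document before the flow continues. The sketch below assumes the document arrives as an XML string in a variable named input_document and that the transformed result is assigned to output_document; these variable names and the SalesOrder structure are illustrative assumptions, not the product's actual scripting interface.

    # Hypothetical script body: add a processing timestamp to the incoming
    # document. Variable names and document structure are illustrative only.
    import xml.etree.ElementTree as ET
    from datetime import datetime, timezone

    def transform(document: str) -> str:
        """Return the document with a ProcessedAt element appended."""
        root = ET.fromstring(document)
        ET.SubElement(root, "ProcessedAt").text = datetime.now(timezone.utc).isoformat()
        return ET.tostring(root, encoding="unicode")

    # Example document input, as it might be passed into the script activity.
    input_document = "<SalesOrder><OrderID>12345</OrderID></SalesOrder>"
    output_document = transform(input_document)
    print(output_document)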