Query Data Lake Activity

With the Query Data Lake activity, you can schedule the retrieval of a custom subset of your data from Data Lake by using a graphical SQL modeler. You can select specific fields or columns, apply filters, join multiple objects or tables, and so on.

  1. Start Data Lake.
  2. Drag the Query Data Lake activity from the toolbar to the Data Lake flow.
    The Query Data Lake activity can be placed only as the first activity in the flow.
  3. On the Properties tab, specify a name and description.
  4. On the Queries tab, click Add Query.
  5. In the Data Lake Query Modeler, complete these substeps:
    1. Model a query in the same way as in AnySQL modeler.

      Only DSV and newline-delimited JSON Data Lake objects can be used in the modeler.

    2. In Settings, you can select either newline-delimited JSON or conventional JSON as the output format.

      Unlike in the regular AnySQL modeler, you cannot select a specific column in the incremental configuration. The Data Lake storage time information is used automatically.

    3. Define the output format.
    4. Generate metadata for the output document.
    5. Save Modeler.
    6. Click BACK to return to Data Lake Flow.

      For more information about AnySQL modeling, see the Infor ION Technology Connectors Administration Guide.

  6. Repeat steps 4 and 5 to add all required queries or documents.
  7. On the Scheduler tab, specify how often the modeled queries must be run.

    If more than 10 documents are defined in a single Query Data Lake activity, only the first 10 are run on schedule. The other documents must wait until one of the slots becomes available.

  8. On the Filter tab, you can exclude data older than a specified date.

    This option works only when an incremental table is selected in the modeler.

    The incremental keys can be rewound to a previous time point by using the Rewind incremental option from the Active Document flows page.

    For more information, see Data Lake retrieval activities.

    You cannot rewind to a time point older than the limit specified on the Filter tab.
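
The two output formats offered in Settings (substep 2 of step 5) differ only in how records are laid out in the output document. The following Python sketch illustrates the general difference between conventional JSON and newline-delimited JSON. It is not taken from the product: the sample rows, field names, and function names are hypothetical, and the exact layout that the Query Data Lake activity produces may differ.

```python
import json

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed"},
]


def to_conventional_json(records):
    """Serialize all records as one JSON array, that is, a single JSON value."""
    return json.dumps(records)


def to_newline_delimited_json(records):
    """Serialize each record as its own JSON object, one object per line."""
    return "\n".join(json.dumps(record) for record in records)


conventional = to_conventional_json(rows)
ndjson = to_newline_delimited_json(rows)

print(conventional)
print(ndjson)
```

A consumer of the conventional format must parse the whole document before it can read any record. A consumer of the newline-delimited format can parse the document line by line, which is generally more convenient for large or streamed results.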