Data Lake Input
With the DataLakeInput step, you can query payloads or data from Data Lake. The Authorized App file location configured in the kettle.properties file (${IONAPI_FILE}) provides the authentication and connection details for a specific tenant.
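To check outside the tool which file ${IONAPI_FILE} points to, a short script can read kettle.properties directly. The following is a minimal sketch, assuming the file uses plain key=value lines; the sample path and the helper names (read_properties, ionapi_path) are hypothetical, and Java-style escapes or line continuations are not handled.

```python
from typing import Dict, Optional


def read_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines; skip blank lines and # or ! comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' separator
            props[key.strip()] = value.strip()
    return props


def ionapi_path(properties_text: str, variable: str = "IONAPI_FILE") -> Optional[str]:
    """Return the value configured for the variable, or None if it is not set."""
    return read_properties(properties_text).get(variable)


# Hypothetical kettle.properties content; the path is illustrative only.
sample = """
# Authorized App used by DataLakeInput steps
IONAPI_FILE=/opt/etl/config/my_tenant.ionapi
"""

path = ionapi_path(sample)
print(path)
```

If the function returns None, the variable is not defined in the file that was read, which is worth checking before opening the step dialog.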
To create a DataLakeInput step:
- Click Design > Input.
- Double-click the DataLakeInput step, or drag it onto the Transformation tab to the right of the Input Table step.
- Double-click the step. The Deem DataLake Input dialog box is displayed.
- Specify this information:
- Step Name
- Update the step name to Read_{object}. For example: Read_CSYCAL.
- Host
- Hover over the ${IONAPI_FILE} variable to verify that the correct kettle.properties variable is being used.
You can click Browse to select the authorized app (*.ionapi) file.
- API Object
- Select the Data Lake object for the data flow from the drop-down list.
- Function Name
- Select the function to use while extracting data from Data Lake. The “queryAll” function is the default. This function uses the Data Lake StreambyFilter API to pull payloads from Data Lake. For other function capabilities, see Data Lake input functions.
- Query String
- The Query String text box is empty by default.
You can add other Data Lake property filters in the Query String field. These filters are appended to the overall query string when the step is executed.
- Click Get cloud-api fields to pull in the data catalog properties for the object.
- Click OK and return to the DataLakeInput step dialog box.
- Click Get cloud-api fields again.
A set field prefix dialog box may be displayed. Here you can specify a prefix for all fields in the output. This can be useful for M3 clients who want to keep the same field names in the reporting database as when the data was on-premises. For example: CD for CSYCAL.
- Click Preview rows to test if the connection to Data Lake is returning data.
- Click Close to finish the DataLakeInput step.
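The effect of the field prefix mentioned in the steps above can be pictured with a short sketch. This is only an illustration of the renaming, not the step's actual implementation; the field names in the list are illustrative values and are not pulled from a real data catalog, and apply_prefix is a hypothetical helper.

```python
from typing import List


def apply_prefix(field_names: List[str], prefix: str) -> List[str]:
    """Prepend the same prefix to every output field name."""
    return [f"{prefix}{name}" for name in field_names]


# Illustrative field names for the CSYCAL object (assumed, not from a catalog).
catalog_fields = ["CONO", "DIVI", "YMD8"]

prefixed = apply_prefix(catalog_fields, "CD")
print(prefixed)  # ['CDCONO', 'CDDIVI', 'CDYMD8']
```

With the prefix CD, each output field carries the prefix in front of its name, so reports written against prefixed field names can continue to use them.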