Using the data object retrieval APIs

Data Lake is a scalable, elastic object store for capturing raw data in its original and native format. Data Lake provides interfaces for these tasks:

  • Retrieving a list of data objects stored in Data Lake
  • Retrieving a single data object by a data object ID
  • Marking an object as corrupt to prevent its extraction
  • Retrieving statistical information about a stored object

Interface and consumption methods are exposed through the Data Lake API Service registered within the Data Fabric Suite in API Gateway. For more information on how to use API Gateway and how to interact with Swagger documentation for the API methods, see the ION section of this documentation.

The Data Lake APIs can match more than 10,000 data objects; however, the built-in pagination returns at most the first 10,000 results. For incremental extraction beyond this limit, we recommend that you use the high watermark timestamp strategy described below to efficiently navigate the results. Alternatively, you can break large filter queries into smaller ones with /dataobjects/splitquery.

Incremental extraction with /dataobject/byfilter

We recommend that you follow these best practices when you implement the incremental loading logic:

  • Use dl_document_indexed_date as the primary field for incremental loading.
  • To ensure completeness, apply a 5-second lag from the highest indexed timestamp that was retrieved in the previous call.
  • Sort the results by both dl_document_indexed_date and dl_id to avoid missing or duplicating records. This is because multiple data objects can share the same indexed timestamp.
    Note: We recommend that you do not use dl_id alone for incremental loading because the sequence of dl_id is not reliably ordered.

    This approach provides an advantage over the API's built-in pagination. By tracking the last retrieved dl_document_indexed_date and dl_id (high watermark logic), you can continuously retrieve new data without depending on page tokens. With this method, you also avoid the API's 10,000-object limit, which can be a constraint in high-volume environments. As a result, the /splitquery endpoint is redundant because the incremental retrieval strategy handles large datasets more efficiently and reliably. The /splitquery endpoint will be deprecated in a future release. A sketch of this retrieval loop is shown after this list.

    Example:

    First call:

    GET /dataobjects?records=100&sort=dl_document_indexed_date,dl_id&page=<empty>&filter=<some filter>

    Second call:

    GET /dataobjects?records=100&sort=dl_document_indexed_date,dl_id&page=<empty>&filter=<some filter> AND (dl_document_indexed_date > <highest dl_document_indexed_date from previous response> OR (dl_document_indexed_date = <highest dl_document_indexed_date> AND dl_id > <highest dl_id from previous response>))
  • To optimize performance, avoid wildcard searches and include dl_document_name in filters when possible.
  • Data Lake content is compressed by default when it is stored and streamed, which makes large data transfers more efficient. Clients can specify the content encoding with Accept-Encoding: deflate for compressed data or Accept-Encoding: identity for uncompressed data, as shown in the second sketch after this list.
    Note: Not all clients support the identity value.
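The following Python sketch shows one way to implement the high watermark loop described above. It is an illustration only: the base URL, the authorization header, and the response shape (a JSON array in which each object carries dl_document_indexed_date and dl_id) are assumptions for this example, not documented API details.

    import requests

    # Hypothetical connection details; replace with your tenant's values.
    BASE_URL = "https://<tenant>/api/datalake"
    HEADERS = {"Authorization": "Bearer <token>"}
    BASE_FILTER = "<some filter>"
    LAG_SECONDS = 5  # completeness lag recommended above

    def fetch_batch(last_date=None, last_id=None):
        """Fetch the next batch, sorted by dl_document_indexed_date and dl_id."""
        filter_expr = BASE_FILTER
        if last_date is not None:
            # High watermark predicate, as in the second call of the example above.
            filter_expr = (
                f"{BASE_FILTER} AND (dl_document_indexed_date > '{last_date}'"
                f" OR (dl_document_indexed_date = '{last_date}'"
                f" AND dl_id > '{last_id}'))"
            )
        params = {
            "records": 100,
            "sort": "dl_document_indexed_date,dl_id",
            "page": "",  # no page token; the watermark drives the iteration
            "filter": filter_expr,
        }
        response = requests.get(BASE_URL + "/dataobjects",
                                headers=HEADERS, params=params)
        response.raise_for_status()
        return response.json()  # assumed: a list of data object descriptors

    last_date, last_id = None, None
    while True:
        batch = fetch_batch(last_date, last_id)
        if not batch:
            break
        for obj in batch:
            ...  # process or extract the data object here
        # Because results are sorted by dl_document_indexed_date and dl_id,
        # the last element of each batch is the new high watermark.
        last_date = batch[-1]["dl_document_indexed_date"]
        last_id = batch[-1]["dl_id"]
    # When persisting the watermark between runs, subtract LAG_SECONDS from
    # last_date so that late-indexed records are not missed, and deduplicate
    # any re-read records by dl_id.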
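To request compressed or uncompressed content explicitly, set the Accept-Encoding header on the call. A minimal sketch, again with a placeholder URL, token, and filter:

    import requests

    # "deflate" requests compressed data; "identity" requests uncompressed
    # data. Note that not all clients support "identity".
    response = requests.get(
        "https://<tenant>/api/datalake/dataobjects",
        headers={
            "Authorization": "Bearer <token>",
            "Accept-Encoding": "deflate",
        },
        params={"records": 100, "filter": "<some filter>"},
    )
    response.raise_for_status()
    data = response.json()  # requests transparently decodes deflate content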