Using the data object retrieval APIs

Data Lake is a scalable, elastic object store for capturing raw data in its original and native format. Data Lake provides interfaces for these tasks:

  • Retrieving a list of data objects stored in Data Lake
  • Retrieving a single data object by a data object ID
  • Marking an object as corrupt
  • Retrieving statistical information about a stored object

When you retrieve data from Data Lake with the Retrieval APIs, we recommend that you follow these best practices:

  • Use index timestamps, such as dl_document_indexed_date, for incremental retrieval patterns.
  • Apply a 5-second lag interval from the highest indexed timestamp that is referenced during a call. This lag helps ensure that all data objects are retrieved and reduces the chance that objects are overlooked.
  • In an API call, sort data objects in ascending order by an indexed timestamp so that the last object's timestamp can be referenced.
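
The practices above can be sketched in Python as follows. This is a minimal illustration of the timestamp arithmetic only, not the Data Lake client itself: it assumes that each retrieved object is a dictionary with an ISO 8601 dl_document_indexed_date value and an identifier field, here given the hypothetical name dl_id. The helper names are also illustrative. Because the lag causes consecutive calls to overlap, the sketch skips objects that were already processed.

```python
from datetime import datetime, timedelta

LAG = timedelta(seconds=5)  # lag interval recommended above


def parse_ts(value):
    # Assumption: timestamps are ISO 8601 strings; a trailing "Z" is
    # normalized so that older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def sort_by_indexed_date(objects):
    # Ascending order, so the last element carries the highest timestamp.
    return sorted(objects, key=lambda o: parse_ts(o["dl_document_indexed_date"]))


def next_watermark(objects, previous):
    # Start the next call 5 seconds before the highest indexed timestamp
    # seen in this call, and never move the watermark backwards.
    if not objects:
        return previous
    highest = parse_ts(sort_by_indexed_date(objects)[-1]["dl_document_indexed_date"])
    return max(previous, highest - LAG)


def unseen(objects, seen_ids):
    # The lag makes consecutive calls overlap, so drop objects that were
    # already handled. "dl_id" is a hypothetical identifier field name.
    fresh = [o for o in objects if o["dl_id"] not in seen_ids]
    seen_ids.update(o["dl_id"] for o in fresh)
    return fresh
```

A caller would pass the returned watermark as the lower bound of the indexed timestamp filter in the next retrieval call and process only the objects that unseen returns.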

Interface and consumption methods are exposed through the Data Lake API Service registered within the Data Fabric Suite in API Gateway. For more information on how to use API Gateway and how to interact with Swagger documentation for the API methods, see ION documentation.

By default, content in Data Lake is stored, and streamed to clients, in a compressed state. For exceptionally large content retrievals, especially through the /dataobjects/byfilter API, this compression helps keep the performance of the gateway and the requesting clients at normal levels.

Authorized API applications and RESTful API clients that are used for API testing can advertise their supported content encoding to the server. To stream and persist data in a compressed format, the requesting party can send this request header: Accept-Encoding: deflate. To stream the requested content with no encoding applied, clients can send the identity value instead: Accept-Encoding: identity. Not all clients support the identity value. See the documentation for your API application or client to determine whether these request header values are supported.
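
As an illustration, the two header values could be set from a Python client as sketched below. The sketch uses only the standard library and does not send a request. The URL passed to build_request is a placeholder supplied by the caller, authorization headers are omitted, and the helper names are hypothetical. The decoding helper assumes that a deflate response carries zlib-wrapped data and falls back to a raw deflate stream, because servers differ on this point.

```python
import urllib.request
import zlib


def encoding_headers(compressed=True):
    # "deflate" asks for compressed content; "identity" asks for no encoding.
    return {"Accept-Encoding": "deflate" if compressed else "identity"}


def build_request(url, compressed=True):
    # Builds the request without sending it. The caller supplies the URL
    # and would add authorization headers as required by API Gateway.
    return urllib.request.Request(url, headers=encoding_headers(compressed))


def decode_body(body, content_encoding):
    # Decode a response body according to its Content-Encoding header.
    if content_encoding != "deflate":
        return body  # identity or absent: content is already plain
    try:
        return zlib.decompress(body)  # zlib-wrapped deflate
    except zlib.error:
        return zlib.decompress(body, -zlib.MAX_WBITS)  # raw deflate stream
```

A client that persists the data in compressed form would skip decode_body and write the response bytes to storage as received.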

Note:  The Data Lake JDBC driver is available only for multi-tenant Cloud customers who use Infor Birst.