Data streaming
With the Streaming Ingestion method, you send JSON messages in real time, and the service returns an acknowledgment response for each message that indicates whether the message was successfully received or failed.
This table shows the Streaming Ingestion properties to include in a JSON message:
Property | Description |
---|---|
objectName | A data object name from the Data Catalog. The string can contain a maximum of 250 characters. Cannot be a null or empty string. |
correlationId | GUID of a streamed JSON message that can be correlated in a response message. The string can contain a maximum of 250 characters. |
fromLogicalId | An identifier of a source application in this format: lid://<provider>.<part2>.<part3>. Do not use infor in the logical ID fields. |
payload | A single, complete record that is encoded in Base64. Cannot be a null or empty string. |
Streaming Ingestion has these limitations:
- A payload record must be in the Newline-delimited JSON (NDJSON) format.
- The payload size cannot exceed 4.5 megabytes.
- The maximum ingestion throughput rate is restricted to 100 messages per second. This is a limitation for the API Gateway streaming ingestion endpoint.
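The properties and limitations above can be sketched as a small helper that builds one request message. This is a minimal illustration, not the service's client library: the function name build_message and its parameters are hypothetical, and it assumes that the size limit applies to the Base64-encoded payload string, which is the stricter interpretation.

```python
import base64
import json
import uuid

MAX_PAYLOAD_BYTES = 4718592  # 4.5 megabytes, as reported by the PayloadTooLarge error
MAX_FIELD_CHARS = 250        # limit for objectName and correlationId


def build_message(object_name, from_logical_id, record, correlation_id=None):
    """Build one Streaming Ingestion request as a dict (hypothetical helper)."""
    if not object_name or len(object_name) > MAX_FIELD_CHARS:
        raise ValueError("objectName must contain 1 to 250 characters")
    # Generate a GUID when the caller does not supply a correlation ID.
    correlation_id = correlation_id or str(uuid.uuid4())
    if len(correlation_id) > MAX_FIELD_CHARS:
        raise ValueError("correlationId must contain at most 250 characters")
    # Serialize the record as a single compact JSON line (one NDJSON record).
    line = json.dumps(record, separators=(",", ":"))
    payload = base64.b64encode(line.encode("utf-8")).decode("ascii")
    # Assumption: the limit is checked against the encoded payload.
    if len(payload) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds the maximum size of 4718592 bytes")
    return {
        "objectName": object_name,
        "correlationId": correlation_id,
        "fromLogicalId": from_logical_id,
        "payload": payload,
    }
```

Calling json.dumps on the returned dict produces the message body to send. Validating on the client side avoids spending part of the throughput limit on messages that the service would reject.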
After you send a message to the Streaming Ingestion service, an acknowledgment response is expected. Only a positive acknowledgment response that contains an OK message guarantees that the message is eventually stored in Data Lake. You can use the correlationId property to match requests with their corresponding responses.
An acknowledgment response is sent for every message that was successfully received. If an acknowledgment response is not received within 5 minutes, we recommend that you resend the message.
The Streaming Ingestion service responds with an error result and error code if a message was not successfully received.
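The resend recommendation can be sketched as a check over the messages that are still waiting for an acknowledgment. This is only an illustration under stated assumptions: the function find_overdue and the sent_at dictionary, which maps each correlationId to the time the message was sent, are hypothetical names and not part of the service.

```python
import time

ACK_TIMEOUT_SECONDS = 5 * 60  # recommended wait before resending


def find_overdue(sent_at, now=None, timeout=ACK_TIMEOUT_SECONDS):
    """Return the correlation IDs whose acknowledgment is overdue.

    sent_at maps correlationId -> send time taken from time.monotonic().
    """
    if now is None:
        now = time.monotonic()
    return [cid for cid, sent in sent_at.items() if now - sent >= timeout]
```

A sender could run this check periodically, resend each overdue message, and update its entry in sent_at with the new send time.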
This example shows a JSON message request to send to Streaming Ingestion:
{
"objectName": "streamTest",
"correlationId": "1625044086",
"fromLogicalId": "lid://provider.myapplication.client1",
"payload":"eyJuYW1lIjoiSm9obiIsICJhZ2UiOjMwLCAidmFyYXRpb24iOiAxfQ=="
}
This example shows a successful acknowledgment response from the Streaming Ingestion service:
{"result":"ok","correlationId":"1625044086"}
This example shows an error response from the Streaming Ingestion service:
{"result":"error","code":"InvalidMessageFormat","message":"Property payload is not base64 encoded.", "correlationId":"1625044086"}
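The two response shapes above can be handled with a short routine that matches each acknowledgment to its request through correlationId. This is a minimal sketch: handle_ack and the pending dictionary, which maps each correlationId to the request that is still awaiting a response, are hypothetical names chosen for illustration.

```python
import json


def handle_ack(raw_response, pending):
    """Parse one acknowledgment and match it to its pending request."""
    ack = json.loads(raw_response)
    correlation_id = ack.get("correlationId")
    # Remove the matching request; None if the ID is unknown.
    request = pending.pop(correlation_id, None)
    outcome = {
        "ok": ack.get("result") == "ok",
        "correlationId": correlation_id,
        "request": request,
    }
    if not outcome["ok"]:
        # Error responses carry an error code and a descriptive message.
        outcome["code"] = ack.get("code")
        outcome["message"] = ack.get("message")
    return outcome
```

The returned request lets the caller decide what to do next, for example to resend the message after a TooManyRequests error or to log and drop it after an InvalidMessageFormat error.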
This table shows the possible error codes and messages that the Streaming Ingestion service returns if an exception occurs:
Error | Message |
---|---|
InvalidMessageFormat | Several messages are possible, for example: Property payload is not base64 encoded. |
JSONDeserializationError | Invalid JSON message format. |
TooManyRequests | The throughput rate limit was exceeded. |
AcknowledgmentTimeout | Your message cannot be acknowledged within the 55 seconds timeframe. |
PayloadTooLarge | Payload size exceeds the maximum size of 4718592 bytes |
UnknownError | Unknown Error |
For optimal performance, we recommend that you implement a flow control mechanism to regulate the rate of streaming messages. With a flow control mechanism, you can maximize the number of messages that are processed per second and avoid receiving the TooManyRequests error. The goal is to maximize message throughput while maintaining system stability.
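One simple form of flow control is fixed-interval pacing, where the sender waits long enough between messages to stay at or below the rate limit. The sketch below is one possible approach, not a prescribed implementation: the Pacer class is a hypothetical name, and the clock and sleep parameters exist only so that the behavior can be tested without real delays.

```python
import time


class Pacer:
    """Space out sends so that at most `rate` messages go out per second."""

    def __init__(self, rate=100.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_time = clock()  # earliest time the next send is allowed

    def wait(self):
        """Block until the next send is allowed, then reserve the slot."""
        now = self.clock()
        if now < self.next_time:
            self.sleep(self.next_time - now)
        # Schedule the following slot one interval after this one.
        self.next_time = max(now, self.next_time) + self.interval
```

Call wait() immediately before each send. A sender that still receives a TooManyRequests error, for example because several clients share the same limit, could lower the rate value and resend the rejected message.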