Data Selection Features
Birst Connect uses the JDBC driver to select data objects from Data Lake. You can use Query-based objects to enter a query that retrieves data, sourced from JSON or DSV payload objects, as rows of a result set.
For DSV data objects that are queried through Birst, omit lines that contain nothing but whitespace and null values. For example, a line that contains only separator characters between empty values results in an empty row of data when the object is queried. DSV payloads with blank lines, such as lines with only line feed or carriage return characters, are not supported. A query on such a data object fails because the blank row does not contain the required number of fields.
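The cleanup described above can be sketched as a pre-processing step that runs before a DSV payload is sent to Data Lake. This is a minimal sketch, not part of Birst or the JDBC driver: the function name drop_blank_dsv_lines and the default separator are assumptions for illustration. It treats a line as empty when nothing remains after separators and whitespace are removed. It does not detect literal null tokens, and it does not handle quoted fields that contain line breaks.

```python
def drop_blank_dsv_lines(payload: str, separator: str = ",") -> str:
    """Return the payload without blank or separator-only lines.

    Hypothetical helper for illustration; not a Birst or Data Lake API.
    """
    kept = []
    # splitlines() splits on line feed, carriage return, and CR+LF alike.
    for line in payload.splitlines():
        # A line with only separators and whitespace carries no values.
        if line.replace(separator, "").strip():
            kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    raw = "id,name\r\n1,Ada\r\n\r\n,\r\n   \r\n2,Bob\r\n"
    print(drop_blank_dsv_lines(raw))
```

Lines that hold at least one value, including rows with some empty fields, are kept unchanged.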
Two query methods are supported:
- A general object query is a select statement to return all properties and data from an object. The Birst query is not visible. A general object query returns the data in the latest object loaded into Data Lake.
- A query-based object selection is a select statement to return all or specific properties from an object. If a WHERE clause condition does not base the query on the lastModified timestamp, the driver returns data from the latest data object in Data Lake. If a WHERE clause condition bases the query on a lastModified timestamp, the query returns data from objects loaded within the lastModified limits. The lastModified value corresponds to the datetime at which the data object was loaded into Data Lake. The value is formatted as an ISO 8601 string: 'YYYY-MM-DDTHH:MM:SS.mmmZ'
For example:
'2020-12-28T07:00:00.000Z'
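The lastModified string format can be produced programmatically. The following is a minimal sketch that formats a timezone-aware datetime as 'YYYY-MM-DDTHH:MM:SS.mmmZ' and places two such values into a WHERE clause. The helper names, the object name MyObject, and the exact form of the lastModified comparison are assumptions for illustration. Check the select syntax that the driver accepts before you use a query like this.

```python
from datetime import datetime, timezone


def to_last_modified(dt: datetime) -> str:
    """Format a timezone-aware datetime as 'YYYY-MM-DDTHH:MM:SS.mmmZ' in UTC."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    utc = dt.astimezone(timezone.utc)
    # strftime has no millisecond directive, so derive it from microseconds.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def build_last_modified_query(object_name: str, start: datetime, end: datetime) -> str:
    """Build a select statement limited to a lastModified range.

    Hypothetical query shape; the comparison syntax is an assumption.
    """
    return (
        f"SELECT * FROM {object_name} "
        f"WHERE lastModified >= '{to_last_modified(start)}' "
        f"AND lastModified < '{to_last_modified(end)}'"
    )


if __name__ == "__main__":
    start = datetime(2020, 12, 28, 7, 0, 0, tzinfo=timezone.utc)
    end = datetime(2020, 12, 29, 7, 0, 0, tzinfo=timezone.utc)
    print(build_last_modified_query("MyObject", start, end))
```

Converting to UTC before formatting keeps the trailing Z accurate when the input datetime carries a different offset.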
Through the Birst connection, use the Edit objects option to display a list of the objects available in Data Lake. The list comprises the JSON and DSV object types defined in the Data Catalog.
The driver uses the Data Catalog metadata for the object to identify object names as table names and the object properties as table column names.
Data Lake objects marked as corrupt are automatically excluded from query results.