Data Selection for Localized String Values

The localization functionality supports queries for localized string values that are stored in the data objects in Data Lake.

Use the object metadata to specify that a property is localized, and the driver returns the localized values for that property. The localized values are stored in data stores, where they can be used in reports and dashboards.

For details on defining localized string data and metadata, see Data object definitions for localized string values.

Specifically, for Birst, the object query includes the locale keyword to select localized string values. Additionally, a separate table in the Birst data store holds the Data Catalog locale selections. For example, the source application supports these locales:

  • en_US
  • fr_CA
  • es_ES

The Data Catalog locale selections are position 1 for en_US, position 2 for fr_CA, and position 3 for es_ES.

The locale in the first position is used as the default locale. When a query selects a localized value, the driver retrieves the en_US value for table column 1, the fr_CA value for table column 2, and the es_ES value for table column 3. The column names do not include the locale code. Because the columns are numbered by position, you can specify which locales are used in Birst.
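The mapping from locale selections to numbered columns can be pictured with a minimal Python sketch. It is only an illustration: the function name, the `description` property, and the `<property>_<position>` column naming are assumptions made for this example, not the driver's actual naming scheme.

```python
# Hypothetical locale selections, in position order (position 1 is the default).
LOCALE_SELECTIONS = ["en_US", "fr_CA", "es_ES"]


def to_position_columns(property_name, values_by_locale, selections=LOCALE_SELECTIONS):
    """Build numbered columns for one localized property.

    The column name carries the position number, not the locale code.
    The "<property>_<position>" pattern is an assumed naming scheme.
    """
    row = {}
    for position, locale_code in enumerate(selections, start=1):
        row[f"{property_name}_{position}"] = values_by_locale.get(locale_code)
    return row


row = to_position_columns(
    "description",
    {"en_US": "Chair", "fr_CA": "Chaise", "es_ES": "Silla"},
)
print(row)
```

In this sketch, changing the order of `LOCALE_SELECTIONS` changes which locale lands in each numbered column, while the column names stay the same.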

To complete the localization for Birst, the locale selections are set up as a separate table in the data store. The locale selections are used to match the table data to the report consumer’s language. When a report is rendered, if the report consumer’s language is en_US, the report displays the data in position 1. If the report language is fr_CA, then the report displays the data in position 2. If the report language is not available in any position, then the report displays the value in position 1, because that position holds the default locale.
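The report-time lookup described above can be sketched in Python as follows. This is a simplified illustration, not Birst's implementation: the position table, the `description_<position>` column names, and the function name are all assumed for the example.

```python
# Hypothetical copy of the locale selections table: locale code -> position.
LOCALE_POSITIONS = {"en_US": 1, "fr_CA": 2, "es_ES": 3}
DEFAULT_POSITION = 1  # the first position is the default locale


def value_for_report(row, property_name, report_locale):
    """Pick the column that matches the report consumer's language.

    Falls back to position 1 when the language has no position.
    """
    position = LOCALE_POSITIONS.get(report_locale, DEFAULT_POSITION)
    return row[f"{property_name}_{position}"]


# Assumed row layout with one numbered column per position.
sample_row = {
    "description_1": "Chair",
    "description_2": "Chaise",
    "description_3": "Silla",
}

print(value_for_report(sample_row, "description", "fr_CA"))  # position 2
print(value_for_report(sample_row, "description", "de_DE"))  # no position, default
```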

The driver uses two methods to determine if string properties are localized. Both methods use the document’s property metadata, which is stored in Data Catalog Object Schemas.

The driver reads the source objects for a match to the Data Catalog’s Locale Selection Locale Code Search List. The search list must include the locale code first and can be followed by one or more locale codes to use as substitute locales. For example, if a value for en_US is not found in the data, and the next code in the list is en, the driver returns the en value.
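The substitute-locale behavior of the search list can be illustrated with a short Python sketch. The function name and the dictionary shape of the source data are assumptions for this example. The sketch shows only the fallback order, not how the driver actually reads Data Lake objects.

```python
def resolve_localized_value(values_by_locale, search_list):
    """Return the first value found while walking the search list in order.

    search_list starts with the locale code itself, followed by any
    substitute locale codes. Returns None if no listed code has a value.
    """
    for locale_code in search_list:
        value = values_by_locale.get(locale_code)
        if value is not None:
            return value
    return None


# The data has no en_US value, so the substitute locale "en" is used.
found = resolve_localized_value({"en": "Chair", "fr_CA": "Chaise"}, ["en_US", "en"])
print(found)
```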

The matching process among the driver, the Locale Selections, and the Data Lake data happens each time the query runs.

If the Data Catalog Locale Selections change after data is loaded into the target tables, the data that is already loaded keeps the values from the earlier selections. For example, if position 2 is French, French values are loaded into the target tables. If position 2 then changes to Spanish, subsequent queries populate position 2 with Spanish values. A report run for the Spanish language then shows a combination of both: Spanish for current data and French for historic data, because the current locale selection defines position 2 as Spanish while the historic rows were loaded when position 2 was French. To avoid this combination, the target tables can be cleared and repopulated.
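The effect of changing a locale selection between loads can be reproduced with a small Python sketch. The `load_row` function and the `position_<n>` column names are hypothetical and stand in for whatever load process populates the target tables.

```python
def load_row(values_by_locale, selections):
    """Simulate one load: store the value for each position's locale."""
    return {
        f"position_{position}": values_by_locale.get(locale_code)
        for position, locale_code in enumerate(selections, start=1)
    }


source_values = {"en_US": "Chair", "fr_CA": "Chaise", "es_ES": "Silla"}

# Historic row, loaded while position 2 was French.
historic_row = load_row(source_values, ["en_US", "fr_CA"])

# Current row, loaded after position 2 changed to Spanish.
current_row = load_row(source_values, ["en_US", "es_ES"])

# A report that reads position 2 now sees French and Spanish side by side.
position_2_values = [historic_row["position_2"], current_row["position_2"]]
print(position_2_values)

# Clearing and repopulating with the current selections removes the mix.
reloaded_row = load_row(source_values, ["en_US", "es_ES"])
```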