Set Data Characteristics

You can configure the data characteristics of a source.

Note: Data characteristics can also be set using the Command Window with the setdatacharacteristics command. This is necessary for SFTP files. For more information on the setdatacharacteristics command, see Administrative Commands.

To configure the data characteristics of a source, click the More Options icon and then Data Characteristics.

From the Data Characteristics menu, indicate your selections for the following items. Note: Some data characteristics are automatically set based on the file type.

  • Only Recognize Quotes at the Start and End of Fields: If this box is checked, quotes embedded in fields are not treated as the start/end of quoted strings. For example, abc|red"red |green"c would be translated as abc, red"red, green"c rather than abc, red"red, green", c.
  • First row contains column names: If this box is checked, the columns in the first row of the source will be treated as column headers.
  • Force number of rows to match header count: If this box is checked, Birst adjusts the number of columns in the source to match the number of headers.
  • Parse Formatted Numbers as Numeric: Enabled by default. This setting parses all formatted numbers as numeric values that can be used in measures. If you clear the checkbox, all columns in that source that contain formatted numbers are converted to Varchar, which could impact measures derived from those columns in the space.
  • Column Separator: Enter the character that is used as the column separator in the data source here. By default, the column separator is | (pipe).
  • Quote character: By default, the quote character is set to ". If strings are quoted with a different character in this data source, enter that quote character, for example, '. In some cases, CSV files are badly formatted and quotes are not properly closed throughout the file. This can result in a malformed CSV exception. For such cases, there is a provision to parse a CSV without considering any quote characters.
  • Character Encoding: Select the type of encoding to use for this data source here. By default, the encoding is set to UTF-8. The selected encoding must match the encoding of the incoming data.
  • Number of rows to skip at beginning of file: The number of rows at the start of a file to exclude from the import.
  • Number of rows to skip at end of file: The number of rows at the end of a file to exclude from the import.
  • Parser Type: Birst can parse CSV data using two different parsers: Default and RFC4180. Typically, you will want to use the Default parser type. In some cases, you may run into one of the following error messages when uploading a file into Birst: 1) Malformed file: Unterminated quoted field at end of CSV line. 2) Too many line breaks within a single data row. This may occur because your file follows the RFC4180 CSV format in a way that Birst's Default parser is unable to parse. In that case, consider the RFC4180 parser type.
Note: The maximum number of line breaks allowed in a single row is 1,000. Birst will not be able to parse files that exceed this limit.
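
To see how several of these settings interact, it can help to try them on a small sample outside Birst. The following is a minimal Python sketch, not Birst's implementation. The read_source helper and its parameter names are hypothetical, and it assumes that skipped rows are counted after parsing, which may differ from how Birst counts them. It uses Python's standard csv module to apply a column separator, a quote character, a header row, and rows to skip at the start and end of a file. The sample also includes a quoted field that contains a line break, which is the kind of row RFC4180-style files can contain.

```python
import csv
import io


def read_source(text, separator="|", quote='"', skip_start=0, skip_end=0,
                has_header=True):
    """Hypothetical helper that mimics a few data characteristics.

    Illustration only: this is not how Birst parses files.
    """
    # newline="" leaves line endings untouched so the csv module can
    # handle line breaks that appear inside quoted fields.
    stream = io.StringIO(text, newline="")
    rows = list(csv.reader(stream, delimiter=separator, quotechar=quote))

    # Exclude rows at the beginning and end of the file.
    rows = rows[skip_start:len(rows) - skip_end]

    # Treat the first remaining row as the column names if requested.
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


sample = (
    "exported by nightly job\n"       # leading row to skip
    "id|name|note\n"                  # header row
    '1|"Smith|Jones"|ok\n'            # separator inside a quoted field
    '2|Lee|"line one\nline two"\n'    # line break inside a quoted field
    "END OF FILE\n"                   # trailing row to skip
)

header, data = read_source(sample, skip_start=1, skip_end=1)
print(header)
for row in data:
    print(row)
```

If the sample parses as expected here but the same file fails to upload, comparing the separator, quote character, and skip counts used in the sketch against the values in the Data Characteristics menu is a reasonable first check.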