Configuring Data Characteristics
You can configure the data characteristics of a source.
Note: You can also set data characteristics from the
Command Window with the setdatacharacteristics command. Using the
command is required for SFTP files.
To configure the data characteristics of a source:
Note: The maximum number of line breaks allowed within a single row is
1,000. Birst cannot parse files that exceed this limit.
- Click the Actions icon of the connector, then click Data Characteristics.
Note: Some data characteristics are automatically set based on the file type.
- From the Data Characteristics menu, select the characteristics to use:
- Only Recognize Quotes at the Start and End of Fields
- If this box is checked, quotes embedded in fields are not treated as the start/end of quoted strings. For example, abc|red"red |green"c would be translated as abc, red"red, green"c rather than abc, red"red, green", c.
- First row contains column names
- If this box is checked, the columns in the first row of the source will be treated as column headers.
- Force number of rows to match header count
- If this box is checked, Birst adjusts the number of columns in the source to match the number of headers.
- Parse Formatted Numbers as Numeric
- Enabled by default. This setting parses all formatted numbers as numeric values that can be used in measures. If you disable it, all columns in that source that contain formatted numbers are converted to Varchar, which could affect measures derived from those columns in the space.
- Column Separator
- Enter the character used as the column separator in the data source. By default, the column separator is | (pipe).
- Quote character
- By default, the quote character is set to ". If strings in this data source are quoted with a different character, enter that character, for example, '. Some CSV files are badly formatted, with quotes that are not properly closed throughout the file, which can result in a malformed CSV exception. For such cases, there is a provision to parse a CSV file without considering any quote characters.
- Character Encoding
- Select the encoding to use for this data source. By default, the encoding is set to UTF-8. The setting must match the encoding of the incoming data.
- Number of rows to skip at beginning of file
- The number of rows at the start of a file to exclude from the import.
- Number of rows to skip at end of file
- The number of rows at the end of a file to exclude from the import.
- Parser Type
- Birst can parse CSV data using two different parsers: Default and RFC4180. Typically, you can use the Default parser type. In some cases, you may see one of the following error messages when uploading a file into Birst: 1) "Malformed file: Unterminated quoted field at end of CSV line." 2) "Too many line breaks within a single data row." These errors may occur because the file follows the RFC4180 CSV format in a way that Birst's Default parser cannot handle. In that case, try the RFC4180 parser type.
- Click Save.
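Before uploading, it can help to preview locally how a file splits into rows and columns under a given set of characteristics. The following is a minimal sketch that uses Python's standard csv module, not Birst's parser, so results may differ from what Birst produces. The function names, the sample data, and the assumption that skipped rows are removed before the header row is read are all illustrative and do not come from Birst.

```python
import csv
import io


def preview_rows(text, separator="|", quote_char='"',
                 skip_start=0, skip_end=0, has_header=True):
    """Parse delimited text with settings that mirror the options above.

    Pass quote_char=None to parse without treating any character as a quote.
    """
    if quote_char is None:
        options = {"delimiter": separator, "quoting": csv.QUOTE_NONE}
    else:
        options = {"delimiter": separator, "quotechar": quote_char}
    # newline="" keeps line breaks inside quoted fields intact.
    rows = list(csv.reader(io.StringIO(text, newline=""), **options))
    # Assumption: leading and trailing rows are dropped before the header is read.
    rows = rows[skip_start:len(rows) - skip_end]
    if has_header and rows:
        return rows[0], rows[1:]
    return None, rows


def line_breaks_per_row(rows):
    """Count the line breaks embedded in the fields of each parsed row."""
    return [sum(field.count("\n") for field in row) for row in rows]


# Hypothetical sample: one junk row at each end, a header, and a quoted
# field that contains a line break.
sample = (
    "exported by a hypothetical tool\n"
    "id|name|note\n"
    '1|"Ann"|"line one\nline two"\n'
    "2|Bob|plain\n"
    "end of export\n"
)
header, data = preview_rows(sample, skip_start=1, skip_end=1)
print(header)
print(data)
print(line_breaks_per_row(data))
```

Comparing the largest value returned by line_breaks_per_row against the 1,000 line break limit noted earlier gives a rough indication of whether a file is likely to be rejected. Calling preview_rows with quote_char=None shows how the same file splits when no quote character is considered.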
Related topics
- Administrative Commands
- Using the Command Window