File template properties
When configuring the file template, the properties that are shown in this table must be defined:
File Type | Format Type | Property | Description |
---|---|---|---|
All Types | All Types | Name | Name to be shown as the label for the File template configuration. This field is required and must be unique across all templates that are defined. |
All Types | All Types | Description | Short description of the template configuration. |
All Types | All Types | File Type | Specifies the type of content in the file. Supported file types: Text and Binary. |
All Types | All - except Full BOD | Use Attributes for File Name | Determines whether the original file name and extension are stored in the created documents when files are read. This option is not selected by default. If selected, two properties are enabled to specify the names of the attributes that carry the file name and file extension: File Name Attribute and File Extension Attribute. |
All Types | All - except Full BOD | File Name Attribute | This property is available if Use Attributes for File Name is selected. It specifies the name of the attribute that is added to the noun tag of the created BOD when files are read. The attribute holds the name of the file that was read. The name that is specified in this field must follow the naming rules for XML attributes. The File Name Attribute and File Extension Attribute properties must be different; otherwise, BOD generation fails. |
All Types | All - except Full BOD | File Extension Attribute | This property is available if Use Attributes for File Name is selected. It specifies the name of the attribute that is added to the noun tag of the created BOD when files are read. The attribute holds the extension of the file that was read. The name that is specified in this field must follow the naming rules for XML attributes. The File Name Attribute and File Extension Attribute properties must be different; otherwise, BOD generation fails. |
All Types | All - except Full BOD | File Path Attribute | This property is available if Use File Attributes is switched on. It carries the folder path as defined in Read Location. It is added as an attribute to the noun tag of the created BOD when files are read. |
Text Only | All Types | File Encoding | The character encoding for the text file content. Supported encodings are UTF-8 and ISO8859-1. |
All Types | All Types | Format Type | The type of formatting for the file content. Text files can be one of these types: Delimited, Fixed-Length, Fixed-Length & Delimited, Full BOD, or XML. Binary files are of type Raw Data. |
Text Only | Delimited / Fixed-Length & Delimited | Field Separator | This specifies a separator character between fields when Delimited or Fixed-Length & Delimited is specified in the Format Type property. When rendering text, each element in the input BOD data schema is separated by the Field Separator in the output text string. When parsing text, each field becomes an element in the output BOD data schema. Only one character can be specified. The Field Separator cannot be the same as or a subset of Line Separator or Optional Value Indication. |
Text Only | Delimited / Fixed-Length / Fixed-Length & Delimited | Line Separator | This specifies the character or characters that determine the end of each line. When parsing text, each line is treated as a new record in the output BOD data schema. When rendering text, each data record is separated by the Line Separator in the output text string. Supported escaped characters: The Line Separator cannot be the same as or a subset of Field Separator or Optional Value Indication. Limitation: Currently you cannot generate a BOD for files that use literal |
Text Only | Delimited | Field Enclosing Character | This specifies the characters used to enclose a field. Each field can have a start enclosing character and an end enclosing character. The start and end character can be the same or different. When parsing the field, all data within properly enclosed characters is treated as valid content. The enclosing characters are not considered part of the data. See Enclosing Character. |
Text Only | Fixed-Length / Fixed-Length & Delimited | Fill Character | This is the character that is used to fill the blank space in a field and between fields. Only one character or \t can be specified. |
All Types | Delimited / Fixed-Length / Fixed-Length & Delimited / Raw Data | Data Fields | This specifies the data schema for the input or output text. All data fields that are defined are on the same level; hierarchical structures are not supported. For each data field, a name must be defined. Field names must be unique. For Fixed-Length formats, the field length (in characters) must also be specified. The field sequence is displayed; to change the order of the fields, use the up and down arrows. Each data field has an Optional flag. Selecting this flag makes the field optional. All fields for which the flag is cleared are treated as required. See Optional Fields. For Raw Data, only two data fields are defined: one for the Document ID and one for the Raw Data content. You can set the labels for these two fields, but you cannot delete them or add more fields. |
Text Only | Delimited | Optional Value Indication | The value that can be substituted to indicate that an optional field has no value. Only one indicator is allowed per file. Any field that is not present in a row must be represented by this value. These values are allowed for the Optional value indicator: Note that the application is still responsible for representing the optional field with the appropriate indicator and its delimiter. This is the behavior of the ION file parser when a field is marked as optional and the optional value indicator is left blank: In all cases, an appropriate delimiter corresponding to that field position is expected. A record with fewer delimiters than the defined number of fields requires is treated as an invalid file. The optional value indicator cannot be: |
Text Only | XML | Sample XML | The |
Text Only | Full BOD | Document | Select one of the available BODs in ION Registry. Any BOD, standard or custom, can be selected. |
All Types | All - except Full BOD | BOD Noun | This specifies the noun of the BOD that is generated based on the specified file template configuration. The BOD noun cannot be similar to one of the existing ION standard BODs. In addition, if the specified BOD Noun matches an existing custom BOD, a warning message is displayed. The user can choose to overwrite the existing BOD. |
Text Only | Delimited / Fixed-Length / Fixed-Length & Delimited | Generated BOD | This specifies how records in text are used to create BOD instances or how the BOD data area is written to text: |
Text Only | XML | Include XML header during writing the files | The XML header specifies the version number and, optionally, the character encoding. It is part of the document's XML declaration on the first line of the document. If present, this header must appear on the first line of the XML document. Selecting this property enables ION to automatically add the XML header to XML files when they are written. Note: This setting is only applicable when ION writes an XML file to a destination. When ION reads an XML file, it automatically removes the XML header line if encountered. This is necessary because during the 'read file' operation, the XML file is embedded into the Noun section of the data area of the BOD, and retaining the XML header inside a section of another XML document would make the resulting BOD invalid XML. |
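
Several of the rules in the table can be checked before a template is saved. The following is a minimal sketch in Python, not part of ION: the function name `check_template` and its parameters are hypothetical, and "the same as or a subset of" is interpreted here as "equal to or a substring of", which is an assumption.

```python
def check_template(field_sep, line_sep, optional_indicator="",
                   name_attr=None, ext_attr=None):
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical helper that mirrors the separator and attribute rules
    described in the table. It is not an ION API.
    """
    problems = []

    # The Field Separator is limited to one character (a tab counts as one).
    if len(field_sep) != 1:
        problems.append("Field Separator must be a single character")

    # Assumption: "same as or a subset of" means "equal to or a substring of".
    for label, other in (("Line Separator", line_sep),
                         ("Optional Value Indication", optional_indicator)):
        if field_sep and other and field_sep in other:
            problems.append(f"Field Separator clashes with {label}")

    for label, other in (("Field Separator", field_sep),
                         ("Optional Value Indication", optional_indicator)):
        if line_sep and other and line_sep in other:
            problems.append(f"Line Separator clashes with {label}")

    # The two attribute names must differ, otherwise BOD generation fails.
    if name_attr is not None and name_attr == ext_attr:
        problems.append("File Name Attribute and File Extension Attribute "
                        "must be different")

    return problems


if __name__ == "__main__":
    print(check_template(",", "\r\n", "NULL", "fileName", "fileExt"))
    print(check_template("\n", "\r\n"))
```

The second call reports a clash because the newline used as Field Separator is contained in the Line Separator.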
After the configuration of a format template is completed and saved, click
to start the BOD generation steps. BOD generation applies to the Delimited, Fixed-Length, Fixed-Length & Delimited, XML, and Raw Data format types. For Full BOD, no BOD is generated because it already exists in the registry.
- If all configurations are valid, a window is displayed in which you browse to and select a sample file whose content matches the defined file template configuration. The sample is used to validate the configuration against real content. This applies to the Delimited, Fixed-Length, and Fixed-Length & Delimited format types.
- After the sample is validated, another window is displayed that lists the BOD schema elements and offers the option to change the data type of each element. A Document ID element must be specified to complete the BOD generation steps. For Raw Data files, the Document ID is already selected. If Use Attributes for File Name is selected, the attributes for File Name and File Extension are listed as well. When you click , information is displayed about the structure of the generated BODs.
After these steps are completed, a BOD is generated, including its noun and verb schemas. This BOD is stored in ION Data Catalog as a custom BOD and linked to the file template. This BOD represents the schema to use for rendering and parsing text content.
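
To illustrate what parsing a Delimited record against the defined data fields involves, here is a minimal sketch in Python. It is not the ION file parser: `parse_record` and its parameters are hypothetical, enclosing characters are ignored, and it assumes that a template with n fields needs at least n - 1 delimiters per record. How extra values are handled is not stated in this section, so the sketch simply ignores them.

```python
def parse_record(line, fields, field_sep=",", optional_indicator="NULL"):
    """Split one delimited record into a dict keyed by field name.

    fields: list of (name, is_optional) tuples in template order.
    Hypothetical helper for illustration only; not an ION API.
    """
    values = line.split(field_sep)

    # A record with fewer delimiters than the fields require is invalid.
    if len(values) < len(fields):
        raise ValueError("record has fewer delimiters than the defined fields require")

    record = {}
    # zip stops at the last defined field, so extra values are ignored (assumption).
    for (name, is_optional), value in zip(fields, values):
        if value == optional_indicator:
            # Only optional fields may carry the optional value indicator.
            if not is_optional:
                raise ValueError(f"required field {name!r} has no value")
            record[name] = None
        else:
            record[name] = value
    return record


if __name__ == "__main__":
    template = [("id", False), ("note", True), ("qty", False)]
    print(parse_record("A1,NULL,42", template))
```

Each position still carries its delimiter even when the optional field has no value, which matches the requirement that the application writes the indicator and its delimiter.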
The data types that are defined in the generated schema are stricter than shown in the generation UI. Define your example file with values in the appropriate range. These are the value ranges:
- Between -128 and 127. This range is generated as "byte".
- Between -32768 and 32767. This range is generated as "short".
- Between -2147483648 and 2147483647. This range is generated as "int".
- Between -9223372036854775808 and 9223372036854775807. This range is generated as "long".
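
The ranges above can be expressed as a small lookup. This is a sketch in Python under the assumption that the narrowest matching range determines the generated type; the function name `narrowest_integer_type` is hypothetical and not part of ION.

```python
# Value ranges as listed above, ordered from narrowest to widest.
INTEGER_RANGES = [
    ("byte", -128, 127),
    ("short", -32768, 32767),
    ("int", -2147483648, 2147483647),
    ("long", -9223372036854775808, 9223372036854775807),
]


def narrowest_integer_type(value):
    """Return the name of the narrowest listed range containing value, or None."""
    for name, low, high in INTEGER_RANGES:
        if low <= value <= high:
            return name
    return None  # outside every listed range


if __name__ == "__main__":
    for sample in (100, 200, 70000, 5000000000):
        print(sample, narrowest_integer_type(sample))
```

Under that assumption, a sample file that only contains a value such as 100 would lead to a "byte" element, so choose sample values that fall in the range you need at runtime.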
Custom BODs that are generated from File Format Templates can be managed through Custom Documents on the menu in ION Desk.
The contents of the noun in the DataArea section must match the file format template that is defined when the type of files is:
- Delimited
- Fixed-Length
- Fixed-Length & Delimited
Otherwise, a Confirm BOD is generated.
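
As an illustration of what "matching the file format template" can mean for a Fixed-Length file, the following Python sketch renders one record and rejects content that does not fit. It is not ION's implementation: `render_fixed_length` is a hypothetical name, values are assumed to be left-aligned and padded on the right with the Fill Character, and the `ValueError` stands in for the situation in which ION would generate a Confirm BOD.

```python
def render_fixed_length(record, fields, fill_char=" "):
    """Render a dict as one fixed-length line.

    fields: list of (name, length) tuples in template order.
    Hypothetical helper for illustration only; not an ION API.
    """
    parts = []
    for name, length in fields:
        if name not in record:
            # The noun content lacks a field that the template defines.
            raise ValueError(f"field {name!r} is missing")
        value = str(record[name])
        if len(value) > length:
            # Assumption: values longer than the field length are rejected.
            raise ValueError(f"field {name!r} exceeds {length} characters")
        # Assumption: left-align and pad on the right with the fill character.
        parts.append(value.ljust(length, fill_char))
    return "".join(parts)


if __name__ == "__main__":
    template = [("id", 4), ("qty", 4)]
    print(render_fixed_length({"id": "A1", "qty": 42}, template, fill_char="*"))
```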