Changing the text analyzer
The text analyzer determines how data is indexed and searched.
Two analyzers are available: standard and whitespace. The standard analyzer is the default in M3 Function Search. Depending on the type of M3BE data and the searches you need to run, switching to the whitespace analyzer might be preferable.
The standard analyzer uses the standard tokenizer, which provides grammar-based tokenization based on the Unicode Text Segmentation algorithm. The algorithm works well for most languages and is described in Unicode Standard Annex #29.
For example: The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.
The statement produces these terms or indexed data: [ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ]. Brown-Foxes is divided and indexed as two terms, all terms are converted to lowercase, and the trailing period is dropped.
The whitespace analyzer breaks text into terms whenever it encounters a whitespace character.
For example: The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.
The statement produces these terms or indexed data: [ The, 2, QUICK, Brown-Foxes, jumped, over, the, lazy, dog's, bone. ]. Brown-Foxes is indexed as one term, and case and punctuation are kept as written.
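The difference between the two analyzers can be reproduced with a minimal Python sketch. This is not the code that M3 Function Search runs: `standard_like` is a hypothetical, ASCII-only approximation of the standard analyzer (a simple regular expression plus lowercasing instead of the full Unicode Text Segmentation rules), and `whitespace_like` is a hypothetical stand-in for the whitespace analyzer.

```python
import re

# Assumption: a rough ASCII-only approximation of the standard analyzer.
# It keeps runs of letters and digits, allows an apostrophe inside a word,
# and lowercases every term. The real analyzer follows Unicode Standard Annex #29.
_WORD = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def standard_like(text):
    """Split on punctuation and whitespace, then lowercase each term."""
    return [match.group(0).lower() for match in _WORD.finditer(text)]


def whitespace_like(text):
    """Split on whitespace only; case and punctuation are preserved."""
    return text.split()


sentence = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
print(standard_like(sentence))
print(whitespace_like(sentence))
```

With the example sentence, the first list contains brown and foxes as separate lowercase terms, while the second list keeps Brown-Foxes and bone. exactly as typed.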
Another example is “3/4x14" Tube”. To get correct results when searching for the slash and the quotation mark, you must use the whitespace analyzer, because the standard analyzer splits the text at special characters and removes them from the indexed terms.
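Why the special characters matter for searching can be illustrated with a short sketch. It assumes the same kind of simplified stand-ins for the two analyzers (hypothetical helper names, not the product's implementation) and a made-up second item, 3 4x14 Tube, that differs from the first only in the slash and the quotation mark.

```python
import re


def standard_like(text):
    # Assumption: simplified stand-in for the standard analyzer.
    # Keeps runs of ASCII letters and digits, drops everything else, lowercases.
    return [term.lower() for term in re.findall(r"[A-Za-z0-9]+", text)]


def whitespace_like(text):
    # Stand-in for the whitespace analyzer: split on whitespace only.
    return text.split()


item_a = '3/4x14" Tube'   # example from the text
item_b = '3 4x14 Tube'    # hypothetical item without the slash and quotation mark

# Under the standard-like analyzer both items yield identical terms,
# so a search cannot tell them apart by the slash or the quotation mark.
same_under_standard = standard_like(item_a) == standard_like(item_b)

# Under the whitespace analyzer the special characters stay in the terms.
same_under_whitespace = whitespace_like(item_a) == whitespace_like(item_b)

print(standard_like(item_a), same_under_standard)
print(whitespace_like(item_a), same_under_whitespace)
```

In this sketch the two items are indistinguishable after standard-style analysis, while the whitespace analyzer keeps 3/4x14" as a single searchable term.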
Use this procedure to change the text analyzer.
- Select .
- Click the button.
- Select the text analyzer in the corresponding dropdown menu and click .