Configuring transforms
As part of that process, you can apply data transformations to ensure your data is in the correct format. With transformations you can handle null or missing values, and scale, categorize, and convert data. Because each field is mapped individually, you can map the same input value to multiple fields with different transformation pipelines. For example, you can map the raw value for monthly sales to one field and map that same value to a sales category in another field. You can then write rules based on both the actual sales values and the categorized sales values: you can have a different decision for each sales category and, within each decision, rules based on the actual value.
A transformation pipeline consists of one or more transform functions that are executed in order. The output of one transform function is the input to the next function. The transform is a functional composition.
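The composition described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: `make_pipeline`, `default_if_missing`, and `categorize_sales` are hypothetical names, and the sales thresholds are made up for the example.

```python
from functools import reduce
from typing import Any, Callable, Sequence

Transform = Callable[[Any], Any]


def make_pipeline(transforms: Sequence[Transform]) -> Transform:
    """Compose transforms in order: the output of each is the input to the next."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, fn: fn(acc), transforms, value)
    return run


# Hypothetical transform functions, for illustration only.
def default_if_missing(value: Any, default: float = 0.0) -> Any:
    """Replace a missing (None) input with a default value."""
    return default if value is None else value


def categorize_sales(value: float) -> str:
    """Map a monthly sales figure to a category (thresholds are assumptions)."""
    if value < 1000:
        return "low"
    if value < 10000:
        return "medium"
    return "high"


# The same input value mapped to two fields with different pipelines.
raw_pipeline = make_pipeline([default_if_missing])
category_pipeline = make_pipeline([default_if_missing, categorize_sales])
```

Here `raw_pipeline` would feed a field holding the actual sales value and `category_pipeline` a field holding its category, so rules can refer to either.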
Each transform function has zero or more user-configurable parameters. Required parameters are indicated as such; if a required parameter is missing, the transform pipeline is marked as invalid. For example, if you select Linear, you can convert degrees Fahrenheit to degrees Celsius using the formula C = 5/9(F - 32). Converting this to the linear form y = a*x + b, you have y = 0.555556*x - 17.77778. Specify 0.555556 for the Slope and -17.77778 for the Intercept.
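As a quick check of the arithmetic, the following sketch derives the slope and intercept directly from C = 5/9(F - 32) and applies them in the form y = a*x + b. The function and constant names are hypothetical and are not part of the product.

```python
def linear(x: float, slope: float, intercept: float) -> float:
    """Apply the linear form y = slope * x + intercept."""
    return slope * x + intercept


# C = 5/9 * (F - 32) expands to C = (5/9) * F - (5/9) * 32.
SLOPE = 5 / 9            # approximately 0.555556
INTERCEPT = -32 * 5 / 9  # approximately -17.77778


def fahrenheit_to_celsius(f: float) -> float:
    """Convert degrees Fahrenheit to degrees Celsius with the linear form."""
    return linear(f, SLOPE, INTERCEPT)
```

With these values, 32 degrees Fahrenheit maps to 0 degrees Celsius and 212 degrees Fahrenheit maps to 100 degrees Celsius, up to floating-point rounding.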
Several basic data converters are provided, which you can use to specify a value if the input is missing or not of the correct type. For example, the To Integer function attempts to convert the input into an integer value. If the input cannot be converted, you can specify the value to use. Suppose you have a status code that ranges from zero to five. Due to encoding variations, that value is sometimes missing, or represented as a string ("3"), an integer (3), or a double (3.0). The To Integer function maps "3", 3, and 3.0 to 3, and a missing/null value to a user-provided value.
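The behavior described for the status code can be approximated as follows. This is a minimal sketch under stated assumptions, not the actual To Integer function: the name `to_integer` is hypothetical, and treating non-integral values such as 3.7 as not convertible is a choice made here that the description above does not specify.

```python
from typing import Any


def to_integer(value: Any, default: int) -> int:
    """Convert value to an int, or return default if it is missing or not convertible."""
    if value is None:
        return default
    try:
        number = float(value)  # accepts "3", 3, and 3.0
    except (TypeError, ValueError):
        return default  # for example "abc" or a list
    # Assumption: values that are not whole numbers (3.7, NaN, infinity)
    # are treated as not convertible and fall back to the default.
    if not number.is_integer():
        return default
    return int(number)
```

For the status code example, `to_integer("3", -1)`, `to_integer(3, -1)`, and `to_integer(3.0, -1)` all give 3, while `to_integer(None, -1)` gives the user-provided value -1.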