Anomaly Detection
Anomaly Detection includes grouping features that let you organize data based on specific attributes. This makes it easier to detect unusual patterns within particular groups or categories. It is especially handy when analyzing trends across different segments.
On top of that, the explainability features help you understand what’s causing these anomalies, making the results clearer and easier to trust.
You can define these hyperparameters:
- Input Keys
- Specifies a comma-separated list of columns in the input data that the model uses to index and retrieve results. These columns must be present in the input data, and their names must match your dataset exactly.
Example: Date
- Target Variable
- Defines the name of the column in the original input data representing the target
variable. This is the value you want to detect outliers for. Make sure the target
variable is numeric and properly preprocessed to avoid any inconsistencies.
Example: Price
- Detection Method
- Specifies the algorithm used to identify anomalies in the dataset. Choose the method that best
suits the characteristics of your data and the type of anomalies you expect to
detect.
The available options are: two_sided_moving_median, one_sided_moving_median, distribution_based.
Example: two_sided_moving_median
- User Tagged
- Defines the name of the column that flags data points the user has tagged as non-outliers. A value of 1 means the data point is not an outlier. Ensure that this column is binary (0 or 1) and accurately reflects the tagging.
Example: user_tagged
- Group By
- Specifies a comma-separated list of fields used for grouping within the dataset. It
determines the level at which anomalies are detected: each group is analyzed separately, which helps identify
anomalies within specific segments or categories of the data.
Example: LocationId
- Date Column
- Specifies the name of the column in the input data that contains date information.
Dates must be in YYYY-MM-DD format. Ensure
that the date column is correctly formatted and free of missing values.
Example: Date
- Handling Method
- Specifies how detected anomalies are treated in the dataset.
Choose the method that aligns with your data handling strategy
and the impact of anomalies on your analysis.
The available options are: smooth, median, remove.
Example: smooth
- Advanced
- Provides additional parameters for advanced usage of the engine, such as window size
settings tailored to specific use cases or scenarios. Collecting these settings here keeps the
main interface simple, so you do not need to think about parameters that can
be set internally. This parameter is used by only two of the three detection methods,
so it is not always required.
Example: window_size:'14'
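
To make the interplay of these hyperparameters more concrete, the following is a minimal Python sketch, not the engine's actual implementation. The source does not describe how `two_sided_moving_median` works internally, so the rule used here (a centered window, with a point flagged when it deviates from the window median by more than a multiple of the window's median absolute deviation) is an assumption. The `config` key names, the `detect_anomalies` function, and the `threshold` argument are also hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical configuration mirroring the hyperparameters above.
# The key names are illustrative, not the product's actual schema.
config = {
    "input_keys": "Date",
    "target_variable": "Price",
    "detection_method": "two_sided_moving_median",
    "user_tagged": "user_tagged",
    "group_by": "LocationId",
    "date_column": "Date",
    "advanced": {"window_size": "5"},
}


def detect_anomalies(rows, cfg, threshold=3.0):
    """Flag rows whose target deviates strongly from a centered moving median.

    Assumed rule (not taken from the product): within each group, a point is
    an anomaly when |value - window median| > threshold * window MAD.
    """
    target = cfg["target_variable"]
    date_col = cfg["date_column"]
    tagged_col = cfg["user_tagged"]
    group_cols = [c.strip() for c in cfg["group_by"].split(",")]
    half = int(cfg["advanced"]["window_size"]) // 2

    # Bucket rows by their group key so each segment is scored on its own.
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in group_cols)].append(row)

    flagged = []
    for members in groups.values():
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        members.sort(key=lambda r: r[date_col])
        values = [r[target] for r in members]
        for i, row in enumerate(members):
            if row.get(tagged_col) == 1:
                continue  # the user marked this point as a non-outlier
            window = values[max(0, i - half): i + half + 1]
            center = median(window)
            mad = median(abs(v - center) for v in window)
            # Small floor so a perfectly flat window can still flag a spike.
            if abs(row[target] - center) > threshold * max(mad, 1e-9):
                flagged.append(row)
    return flagged


# Made-up sample data: group A has an untagged spike, group B a tagged one.
rows = [
    {"Date": f"2024-01-0{d}", "LocationId": "A", "Price": p, "user_tagged": 0}
    for d, p in enumerate([10, 11, 10, 100, 11, 10, 11], start=1)
] + [
    {"Date": f"2024-01-0{d}", "LocationId": "B", "Price": p,
     "user_tagged": 1 if p == 50 else 0}
    for d, p in enumerate([5, 5, 50, 5, 5], start=1)
]

flagged = detect_anomalies(rows, config)
# One possible reading of the "remove" handling method: drop flagged rows.
cleaned = [r for r in rows if r not in flagged]
print(flagged)
```

In this sample, only the price of 100 in group A is flagged. The spike of 50 in group B is skipped because its `user_tagged` value is 1, which shows how grouping and user tagging narrow what counts as an anomaly under the assumed rule.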