Validating Your Data Model

Inconsistent, incorrect, or missing metadata can lead to problems when processing data or creating reports. Running the model scan validates your data model and alerts you to any issues. After creating a data model, a Space Administrator can run a model scan from the command line using the validatemetadata command.

Example metadata validation rules include:

Hierarchy names should be unique and level names should be unique within the hierarchy.

The expression for a Custom Attribute or Custom Measure cannot exceed 10 nested IF/ELSE statements as this creates more than 10 levels of nested case statements which is not supported by SQL Server.

To validate your data model against all rules, use the following syntax: validatemetadata all

The model validation scans the repository against a set of validation rules and provides a detailed report that describes any issues that are detected. You can run the scan multiple times against your data model. You can select the areas to check such as data sources, hierarchies, aggregates, and custom attributes or select specific rules to validate against your data model.

Rule Groups

validatemetadata rulegroup ruleGroupName

validatemetadata rulegroup "Data Sources"

    • Aggregates
    • Custom Attributes
    • Custom Measures
    • Data Sources
      • This validates the settings for data sources on the Manage Sources page.
    • Data Sources - Data Flow 
      • This validates settings for data sources that are specific to Data Flow, such as Exclude From Warehouse Model.
    • Hierarchies

Rules

validatemetadata rule ruleName

validatemetadata rule IncrementalSnapshotFact

The following tables detail the validation rules sorted by Rule Group:

Rule Name

Rule

Description

Data Sources
IncrementalSnapshotFact Incremental snapshot fact Delete key(s) should be marked when “Bulk Insert and Delete Measure Records” is enabled. And marked delete key should be present as either level key or measure on that stagingtable.
LevelKeyDataTypeLengthCheckRule Level keys exist and corresponding staging column should be marked as natural key Level Key(s) of level as well as parent level(s) up to hierarchy should be marked as Natural key on StagingTable at level.
LevelKeyNaturalKeyFlagChecksOnStagingTable Level key column is present against a hierarchy level with valid Birst data type and length Level key data type and length (in case of varchar) on corresponding staging table and on dimension table should be same. It is required because joins apply on that column.
StagingColumnSourceColumnMappingChecks SourceFileColumn referred in StagingColumn should exist in SourceFile. SourceFileColumn referred in StagingColumn should exist in corresponding SourceFile.
MeasureGrainCreationRule Measure Grain isn't created as a result of internal naming collision issue More than one source should not be at the same levelName(s) in different hierarchies. If two sources are on the same levelName(s) then only one fact table gets created instead of two due to name collision.
LevelKeyColSourceColMapping Level Keys are not removed from Repository after the column is removed from the Source

All the level keys associated with staging table should be existed in corresponding source file associated with staging table.

If the above condition is not met then list out those columns.

ValidateETLScript ValidateETLScript Session variable is restricted from being used in ETL script. This rule will list down all the scripted sources for which session variable is being used in ETL script.
ValidateDuplicateMeasureNameDatatype Validate column (marked as measure) having same name but have different data types Birst tries to create base facts from all the sources during building application. If the same name column is present under different sources with different data types, and marked as a measure (e.g. Integer and DateTime), Birst will store the fact as Varchar. So as best practice, measures with same name should be of same data type across all the sources, or else, measure name needs to be different. This rule will list out those entire same name columns with different data types and marked as a measure.
HierarchyDimensionKeyLevel Hierarchy Dimension Key Level Dimension key level is the lowest level in the hierarchy
HigherLevelLevelKeyNaturalKeyFlagChecksOnStagingTable Day ID, if exists then it should be marked as natural key Day ID, if exists then it should be marked as natural key
ValidateDayIDColumn SourceFileColumn field should not exist on "Day ID" staging colun, if transformation is set to V{LoadDate} SourceFileColumn field should not exist on "Day ID" staging colun, if transformation is set to V{LoadDate}
ColumnNameValidationForSourcesHavingSameGrain Column_Name Validation For Sources Having Same Grain Two data sources have the same grain, and have columns whose names differ only by underscores being used in one column name whereas spaces are used in the other column name shouldn't be allowed.
StagingColValidationHavingSameNameAndMarkedOnSameHierarchy StagingColumn validation having same name and marked on same hierarchy Data type and width of source column having same name and marked on same hierarchy should be the same
StagingColLevelKeyValidationHavingSameNameAndMarkedOnSameHierarchy StagingColumn level key validation having same name and marked on same hierarchy Data type and width of source column level key having same name and marked on same hierarchy should be the same
OutOfRangeKeyIndicesOnSourceFile Key Indices Validation on Source File Key Indices mapped on a source file should be less than the number of columns in the source file
Data Source - Data Flow
SourceFieldsValidation Validate source against ExcludFromModel and Disabled fields List out sources which are excluded from model and disabled for loading
Aggregates
QueryAggregateNavigationRule Query aggregate navigation Birst should be able to navigate the query used in aggregate generation. If not then those aggregate will be listed out by this rule.
Hierarchies
CardinalityCheck Level cardinality is non-zero positive integer Cardinality set on each level should be non-zero positive integer
HierarchyLevelRule Hierarchy name is unique and hierarchy level name is unique within hierarchy Hierarchy name should be unique and Level name should be unique within hierarchy.
UniqueSurrogateKeyCheck Availability of unique surrogate key column name Surrogate key column name should be unique within hierarchy.
Custom Attributes
CustomAttributeRule Custom Attribute derived from other custom attributes fails on improperly constructed CASE statement in SQL Custom Attribute cannot be used to derive other custom attribute at the same level.
CustomAttributeExprValidationRule Custom Attribute should validate and warn me if I exceed the 10 nested IF statement restriction. Expression of Custom Attribute should not be having more than 10 else if statement. It causes more than 10 level of nested case statement which is not supported by SQL server.
Custom Measures
CustomMeasureExprValidationRule Custom Measure should validate and warn me if I exceed the 10 nested IF statement restriction. Expression of Custom Measure should not be having more than 10 else if statement. It causes more than 10 level of nested case statement which is not supported by SQL server.
Model Performance
PerformanceWarnings Warning rule for model performance, triggering a warning if the data model design is impacting performance. Performance checks are designed to be run against Enterprise spaces. Data Model Properties are not accessible for Professional spaces. Provides warning messages when a model properties setting may affect performance in the space. The rule throws an error warning message if one of these criteria are met:
All Time Series are enabled in a space by default, including both Default Time Series and Extended Time Series
The Analyze By Date is enabled for all columns.
The rule does not check the properties of any prepared pipeline sources, ignored sources, hidden sources, or Live Access sources in the space.