Validating Your Data Model

Inconsistent, incorrect, or missing metadata can lead to problems when processing data or creating reports. Running the model scan validates your data model and alerts you to any issues. After creating a data model, a Space Administrator can run a model scan from the command line using the validatemetadata command.

Example metadata validation rules include:

Hierarchy names should be unique and level names should be unique within the hierarchy.

The expression for a Custom Attribute or Custom Measure cannot exceed 10 nested IF/ELSE statements as this creates more than 10 levels of nested case statements which is not supported by SQL Server.

To validate your data model against all rules, use the following syntax: validatemetadata all

The model validation scans the repository against a set of validation rules and provides a detailed report that describes any issues that are detected. You can run the scan multiple times against your data model. You can select the areas to check such as data sources, hierarchies, aggregates, and custom attributes or select specific rules to validate against your data model.

Rule Groups

validatemetadata rulegroup ruleGroupName

validatemetadata rulegroup "Data Sources"

Aggregates
Custom Attributes
Custom Measures
Data Sources
- This validates the settings for data sources on the Manage Sources page.
Data Sources - Data Flow
- This validates settings for data sources that are specific to Data Flow, such as Exclude From Warehouse Model.
Hierarchies

Rules

validatemetadata rule ruleName

validatemetadata rule IncrementalSnapshotFact

The following tables detail the validation rules sorted by Rule Group:

Rule Name	Rule	Description
Data Sources
IncrementalSnapshotFact	Incremental snapshot fact	Delete key(s) should be marked when “Bulk Insert and Delete Measure Records” is enabled. And marked delete key should be present as either level key or measure on that stagingtable.
LevelKeyDataTypeLengthCheckRule	Level keys exist and corresponding staging column should be marked as natural key	Level Key(s) of level as well as parent level(s) up to hierarchy should be marked as Natural key on StagingTable at level.
LevelKeyNaturalKeyFlagChecksOnStagingTable	Level key column is present against a hierarchy level with valid Birst data type and length	Level key data type and length (in case of varchar) on corresponding staging table and on dimension table should be same. It is required because joins apply on that column.
StagingColumnSourceColumnMappingChecks	SourceFileColumn referred in StagingColumn should exist in SourceFile.	SourceFileColumn referred in StagingColumn should exist in corresponding SourceFile.
MeasureGrainCreationRule	Measure Grain isn't created as a result of internal naming collision issue	More than one source should not be at the same levelName(s) in different hierarchies. If two sources are on the same levelName(s) then only one fact table gets created instead of two due to name collision.
LevelKeyColSourceColMapping	Level Keys are not removed from Repository after the column is removed from the Source	All the level keys associated with staging table should be existed in corresponding source file associated with staging table. If the above condition is not met then list out those columns.
ValidateETLScript	ValidateETLScript	Session variable is restricted from being used in ETL script. This rule will list down all the scripted sources for which session variable is being used in ETL script.
ValidateDuplicateMeasureNameDatatype	Validate column (marked as measure) having same name but have different data types	Birst tries to create base facts from all the sources during building application. If the same name column is present under different sources with different data types, and marked as a measure (e.g. Integer and DateTime), Birst will store the fact as Varchar. So as best practice, measures with same name should be of same data type across all the sources, or else, measure name needs to be different. This rule will list out those entire same name columns with different data types and marked as a measure.
HierarchyDimensionKeyLevel	Hierarchy Dimension Key Level	Dimension key level is the lowest level in the hierarchy
HigherLevelLevelKeyNaturalKeyFlagChecksOnStagingTable	Day ID, if exists then it should be marked as natural key	Day ID, if exists then it should be marked as natural key
ValidateDayIDColumn	SourceFileColumn field should not exist on "Day ID" staging colun, if transformation is set to V{LoadDate}	SourceFileColumn field should not exist on "Day ID" staging colun, if transformation is set to V{LoadDate}
ColumnNameValidationForSourcesHavingSameGrain	Column_Name Validation For Sources Having Same Grain	Two data sources have the same grain, and have columns whose names differ only by underscores being used in one column name whereas spaces are used in the other column name shouldn't be allowed.
StagingColValidationHavingSameNameAndMarkedOnSameHierarchy	StagingColumn validation having same name and marked on same hierarchy	Data type and width of source column having same name and marked on same hierarchy should be the same
StagingColLevelKeyValidationHavingSameNameAndMarkedOnSameHierarchy	StagingColumn level key validation having same name and marked on same hierarchy	Data type and width of source column level key having same name and marked on same hierarchy should be the same
OutOfRangeKeyIndicesOnSourceFile	Key Indices Validation on Source File	Key Indices mapped on a source file should be less than the number of columns in the source file
Data Source - Data Flow
SourceFieldsValidation	Validate source against ExcludFromModel and Disabled fields	List out sources which are excluded from model and disabled for loading
Aggregates
QueryAggregateNavigationRule	Query aggregate navigation	Birst should be able to navigate the query used in aggregate generation. If not then those aggregate will be listed out by this rule.
Hierarchies
CardinalityCheck	Level cardinality is non-zero positive integer	Cardinality set on each level should be non-zero positive integer
HierarchyLevelRule	Hierarchy name is unique and hierarchy level name is unique within hierarchy	Hierarchy name should be unique and Level name should be unique within hierarchy.
UniqueSurrogateKeyCheck	Availability of unique surrogate key column name	Surrogate key column name should be unique within hierarchy.
Custom Attributes
CustomAttributeRule	Custom Attribute derived from other custom attributes fails on improperly constructed CASE statement in SQL	Custom Attribute cannot be used to derive other custom attribute at the same level.
CustomAttributeExprValidationRule	Custom Attribute should validate and warn me if I exceed the 10 nested IF statement restriction.	Expression of Custom Attribute should not be having more than 10 else if statement. It causes more than 10 level of nested case statement which is not supported by SQL server.
Custom Measures
CustomMeasureExprValidationRule	Custom Measure should validate and warn me if I exceed the 10 nested IF statement restriction.	Expression of Custom Measure should not be having more than 10 else if statement. It causes more than 10 level of nested case statement which is not supported by SQL server.
Model Performance
PerformanceWarnings	Warning rule for model performance, triggering a warning if the data model design is impacting performance. Performance checks are designed to be run against Enterprise spaces. Data Model Properties are not accessible for Professional spaces.	Provides warning messages when a model properties setting may affect performance in the space. The rule throws an error warning message if one of these criteria are met: All Time Series are enabled in a space by default, including both Default Time Series and Extended Time Series The Analyze By Date is enabled for all columns. The rule does not check the properties of any prepared pipeline sources, ignored sources, hidden sources, or Live Access sources in the space.