Databases in DWD
Staging database
The Staging database is the first stop for data extracted from the data sources. It contains tables that correspond to the structure in the original data source. In addition, the staging database contains a full set of source data or an incremental subset consisting of data changed since the most recent update.
This database acts like a buffer between the operational systems and the data warehouse. It allows all necessary data to be extracted with simple SQL statements without sorting and grouping. The whole work of transformation is done at the data warehouse level.
Installing metadata creates the staging database structure.
Catalog database
The Catalog database, located on the data warehouse server, is a view of all data sources, independent of source system. Most transformations from the original database structures are made in the catalog.
You can define more than one catalog. This allows you to create additional DWD solutions in the same repository. Composers, models and model groups are catalog oriented. A source/operational system and a data mart exist outside and are independent of the catalog.
Datamart database
A data mart is a subset of the data warehouse. A data mart contains data specific to a particular business area and is usually designed to satisfy the information requirements of a limited audience of users. DWD can manage any number of data marts each populated by one or more compiled DWD solutions.
A data mart is typically a star schema. See Star schema.
A data warehouse is defined as the physical data structure of all data marts. In small DWD installations the entire data warehouse could be located in one physical database. In a still smaller DWD installation, one specialized data mart might constitute the entire data warehouse.
Repository database
The DWD Repository stores all metadata related to one or more DWD solutions.
The DWD Repository is an SQL database and resides on the Microsoft SQL Server.
The Repository contains:
-
Technical metadata: Tables, views, columns, and relationships descriptions.
-
Business rule metadata: Descriptions of business rules and always mapped to technical metadata. That is, technical metadata implements business rule metadata.
The repository is built on three independent layers of abstraction. The independence of the layers ensures that with de-coupled layers, any layer can be modified without having to modify the other layers. These are the abstraction repository layers:
-
Source layer: Contains metadata that describes the operational data sources. This metadata describes tables and columns of the data source. This metadata is used to automatically create an Operational Data Store (ODS) in a staging database. All rows are extracted and loaded into the staging database for further transformation for the data warehouse.
-
Catalog layer: Contains all the modeling objects used to design a measurement models for the data warehouse. This layer also compiles, organizes, and creates the code that allows for ETL jobs to run to populate the data warehouse solution.
-
Model layer: Contains the data warehouse metadata organized as models (star schemas) and model groups.
DWD metadata describes the data structure of the sources.
Repository items
This table shows objects stored in the DWD Repository.
Item |
Description |
---|---|
System |
These are data source systems, operational systems, linked systems, catalog systems, and data marts. |
Objects |
An object is a collection of columns. Tables, files, and views are objects. Objects are further typed as source objects, Splitters, Decoders, Mergers, Composers, dimension tables and fact tables. |
Columns |
A column is an attribute of an object |
Member |
A member is an attribute of a multidimensional table or a cell in an OLAP cube. A member is connected to levels in an OLAP hierarchy. |
Process |
These include ETL processes (jobs) and code processes. A code process denotes activities such as when a piece of code runs to do a specific task in DWD. |
Job |
A job is one or more processes executed in a specific order. A background job is a job set up from the Designer and executed automatically in the background, when end users are not working with the solution. |
Relation |
Relations define how objects in the data warehouse are related. Relations exist between fact tables and their dimension tables. For example, an Order Head table and an Order Line table are related for data entry and query purposes. These relations are already set in the data source system, but you may change them during the DWD design process. |
Business measurement model (BMM) |
When compiled into the data warehouse, a BMM is represented by a star schema. Star schema |
Data Mart |
A data mart is a subject-oriented or departmental subset (division) of the data warehouse solution and is compiled as a database on the Microsoft SQL server. |
Internal Technical Metadata |
Metadata used by DWD when you design, compile, execute, and administer a data warehouse solution. These are examples:
|