SQL Canuck: Fact Table Column Types

I promised last week to continue the discussion of the various column types that are found in dimension and fact tables.

I explained the types of columns found in dimension tables and that blog can be found here: (http://mssqlcanuck.blogspot.ca/2015/10/dimension-table-column-types.html).

I also have a previous blog about the basics of dimensional modelling and it helps to explain what dimension and fact tables are. (http://mssqlcanuck.blogspot.ca/2015/08/dimensional-modeling-concepts.html)

Fact tables are really what make the dimensional model so efficient and quick for reporting and data analytics because it contains granular measurements of important business processes. The fact table combined with the dimension tables create an ideal environment for fast and accurate data analysis.

I mentioned in the blog about dimension column types that it can be over whelming to hear all the names of the data warehousing concepts such as lineage, slowly changing dimensions, surrogate keys, natural keys, composite keys, fact-less fact tables etc. I’ve tried to break down these concepts into smaller blogs so they are easier to digest.

Fact Table Column Types

These are the types of columns you will find in a fact table.

Measure – This is the essence of a fact table and they store measurements associate with a specific business process. They are usually numeric and can be aggregated. Examples are sales amount, order quantity, discount amount. These values can often be additive or semi-additive and when associated with relevant dimensions can be rolled up to represent aggregations at various granularity.

Foreign keys – These keys create a relationship between the dimension tables and the fact tables. Without the relationship to dimension tables the measures would be meaningless. I often use the example of 28kg of rice. 28 by itself is ambiguous but when you add kg (unit of measure dimension) and rice (product dimension) it becomes much more meaningful. The fact table is on the many side of the relationship. All the foreign keys in a fact table usually create a unique row in the fact table and can be considered a composite primary key.

Surrogate key – Sometimes you will include a surrogate key to act as the primary key. This key is usually the business key from the source table. For performance reasons is best if the key is short and consists of only 1 or 2 columns and preferably numeric. Often you could just use a collection of foreign keys that point to dimension tables to act as a composite primary key but this can be confusing and difficult to work with so sometimes it makes more sense to assign a surrogate key either from an identity or sequence database object or adopt one from the source system table.

Lineage columns – These columns are used for auditing purposes. Typically in a data warehouse environment the ETL loads are run nightly. You may want to keep track of which loads are successful or failed and also the load date so you know how current your data is in the warehouse or even which source system provided the data. This is known as lineage data and is typically not exposed to the report consumer but is very helpful for the DBA or ETL developers.

Example: FactSalesQuota fact table

Thanks for reading,

SQL Canuck

Friday, 30 October 2015

Fact Table Column Types

No comments:

Post a Comment

Paul Thompson, MCITP, OCA