Dimensional Modeling Concepts
I have been working for my current employer for nearly 10
years now. I started out as a report
developer but always intended on climbing the ladder as quickly as I could to
get promoted into a more senior technical role.
The opportunity to work as a Senior Data Warehouse architect
came to me but first I had to pass a technical test in order to be
considered. Long story short, I passed
the test and got the job.
Recently, a new report developer mentioned to me that they
wanted to eventually do the same thing.
My biggest piece of advice was to start studying on data warehouse
concepts now so when the opportunity comes up they are ready.
An important concept to comprehend to be a data warehouse
architect is to understand the concepts of dimensional modelling.
Any student of any reputable computer science or CIS program
learns all about the relational modelling but very few seem to graduate from
school and know much about dimensional modelling.
It is the purpose of this blog entry to touch on the very
broad main concepts of dimensional modelling.
In a nut shell, the main purpose of dimensional modelling is
provide users with a database model that is optimal for reading and summarizing
data in order to come to sound decisions.
This is achieved by creating a database model that resembles
a star shape with a fact table in the middle and more or more dimension tables
that surround the fact table. This is
called a ‘Star Schema’.
Once a business purpose is defined for the star schema, then
the grain of the fact table is established.
For example, a grain for a business purpose could be the
cost of each individual items that a surgeon uses to perform a given
surgery. From this grain, the total cost
of each surgery could easily be calculated as could the cost of all the
surgeries that a given surgeon performs or even the total cost of all the
surgeries that all the surgeons perform for a given medical procedure. Now you can start to see where a star schema
can become very powerful.
The main benefits to modelling the data in this way is that
reading the data from a star schema is much faster than a relational model and
often when the data is structured in this way it becomes much easier for the
consumer to understand the nature of the structure compared to complex
relational models .
The star schema for a cost of surgery situation could
resemble something like this.
This provides a very simple to understand schema that is
optimal to read from because of the simplicity of the relationships and
joins. This isn’t as efficient as 3rd
normal form relational modeling in some aspects but those aspects aren’t considered
as important in a OLAP environment where reading from a data structure is
considered more vital than simply storing data or entering data as in a OLTP
environment.
Thanks for reading,
MSSQL Canuck
MSSQL Canuck
No comments:
Post a Comment