Thomas.I Microsoft BI & Analytics

A Site Dedicated to General BI Issues and MS BI Issues: Architecture and Technology

Slowly changing dimensions or avoid that problem?

Posted by thomasivarssonmalmo on April 24, 2007

This is not my idea but actually is in a chapter in Kimballs "Data Warehouse Life Cycle Toolkit". It was also Tim Peterson that make me aware bout this solution in his book: Microsoft SQL Server 2000 Data Transformation Services.
 
And this design is relevant for all SSAS2005 designers. If someone make a mistake in the ETL-processs or in the design of the data warehouse, you will pay for this in the cubes.
 
Slowly changing dimensions is about how you handle historical changes in you data warehouse and your SSAS2005 cubes.
  • Type one: No history, overwrite all attributes in the dimension, except the technical key, with the latest changes each time you update your data warehouse.
  • Type two: For attributes/columns that are important for keeping historical changes you simply add a new version of an existing customer or product. The old version and the new version have time stamps that tells if it valid or not.
  • Type three: Add a new attribute/column for an attribute that is important for historical changes.

Why is history important? If you have a key account or salesperson that change district that person/key account will also change sales history for districts. Salespeople and districts are in the same dimension. Sales can go up or down for districts depending on that salespeople change district. With type one you will face this problem but with type two you will add a new version of the same salesperson with a new technical key (surrogate key) and time stamps that tell when this combination of salesperson and district was relevant.

The problem with type two is that you data and cubes will be fragmented over time and it will be harder to follow each salespersons performance.  

Is there a better approach? For history,yes. Simply never have attributes/columns in the same dimension if there is a dynamic relationsship over time between the attributes. Make a Salesperson or key account separate dimensions and also one separate district dimension. Let them have their own foreign keys with the fact table. The downside of this approach is that you will not to be able to drill-down from district to salesperson. And you will not have autoexist between the attributes in the same dimension(Check Books On Line about autoexist).

SSIS have a SCD(slowly changing dimension wizard) that can be interesting to test but avoid it in production. It is as bad as the old data driven query task in DTS2000 because it use row by row inserts. This means that it is slow and useless except for education.

Advertisements

2 Responses to “Slowly changing dimensions or avoid that problem?”

  1. Tim said

    Thank you for referencing my book! I gave a presentation on this topic at the PASS 2002 conference in Denver. I’ve still got that presentation available on my web site – http://www.cubeslice.com/pass.htm
     
    I’m surprised that the 3 types of Slowly Changing Dimensions are still the industry standard way of talking about this problem. None of the three are good solutions – 1. Throw away history. 2. Destroy continuity across time. 3. Too impractical to really implement.
     
    Anyway – I still think it’s an important topic to consider – and I’m glad to see there’s still someone who agrees with me.
     
    Thanks for your blog. I just started one myself – biperformance.spaces.live.com
     
    Tim Peterson
     

  2. Thomas said

    Good comment Tim. I will use your description of Type 1, 2 and 3.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

 
%d bloggers like this: