- product prices changing over time
- people changing their name
- country/state names may change over time
A slowly changing dimension is generally categorised into one of three possible types; Type 1, Type 2 and Type 3. In the examples that follow, I explain each type.
This is our initial example data:
| Product ID (PK) | Year | Product Name | Product Price |
| 1 | 2004 | Product 1 | £150 |
Type 1: Overwriting the old values
In the year 2005, if the price of the product changes to £250, then the old values of the columns "Year" and "Product Price" have to be updated and replaced with the new values. In this type, there is no way to find out the old value of the product "Product 1" in the year 2004 since the table now contains only the new price and year information.
| Product ID (PK) | Year | Product Name | Product Price |
| 1 | 2005 | Product 1 | £250 |
Type 2: Creating another additional record
In this type, the old values will not be replaced but a new row containing the new values will be added to the product table. So at any point of time, the difference between the old values and the new values can be retrieved and easily be compared. This would be very useful for reporting purposes.
| Product ID (PK) | Year | Product Name | Product Price |
| 1 | 2004 | Product 1 | £150 |
| 1 | 2005 | Product 1 | £250 |
The problem with this approach is "Product ID" cannot store duplicate values of Product 1 since "Product ID" is the primary key. Also, the current data structure doesn't clearly specify the effective date and expiry date of Product 1 (that is, when the change to it's price happened). So, it would be better to change the current data structure to overcome the above primary key violation.
| Product ID (PK) | Effective DateTime (PK) | Year | Product Name | Product Price | Expiry DateTime |
| 1 | 01-01-2004 12.00 AM | 2004 | Product 1 | £150 | 12-31-2004 11.59 PM |
| 1 | 01-01-2005 12.00 AM | 2005 | Product 1 | £250 |
In the changed table structure, "Product ID" and "Effective DateTime" are composite primary keys. So there would be no violation of primary key constraint. Addition of new columns, "Effective DateTime" and "Expiry DateTime" provides the information about the product's effective date and expiry date which adds more clarity and enhances the scope of the table. This type 2 approach will require additional space in the database due to the additional record being stored, but this should not be an issue as dimensions are not that big in the real world.
Type 3: Creating new fields
In this type, the latest update to the changed values can be seen. New columns are added which allow us to keep track of the changes.
| Product ID (PK) | Current Year | Product Name | Current Product Price | Old Product Price | Old Year |
| 1 | 2005 | Product 1 | £250 | £150 | 2004 |
The problem with Type 3 is that the complete history will not be stored if there are several continuous changes over the years (it retains a limited history). For example, in 2006, if the price changes to £350, we would not be able to see the complete history of 2004 prices, since the old values would have been updated with the 2005 product information.
| Product ID (PK) | Current Year | Product Name | Current Product Price | Old Product Price | Old Year |
| 1 | 2006 | Product 1 | £350 | £250 | 2005 |
No comments:
Post a Comment