Home » Developer & Programmer » Data Integration » Data Mart
Data Mart [message #93133] Tue, 25 February 2003 11:39 Go to next message
Messages: 7
Registered: May 2002
Junior Member
I am trying to design a data mart for reporting on financial assesment for trainees. The report is based on assesment effective date. The trainee is allocated a sum of $9000 for a three quarterly payments for a year. But if during the term the finacial circumstances of trainee change and new assesment is carried out and he is given extra money for the next two terms, then we end up with two records for the same person and the assesment effective date remains unchanged as assesment is carried from the start of academic year. One for the orignal assesment and one with new assesment. How can I design a data mart where duplicates do not exist and I can still query by assesment effective date.
Re: Data Mart [message #93134 is a reply to message #93133] Thu, 27 February 2003 00:39 Go to previous message
Suresh Raja
Messages: 4
Registered: February 2003
Junior Member
I guess you would a dimension as Trainee, If so as per my understanding from your query this can be treated as Changing / Dynamic dimension.
the best way to handel this is to issue another trainee record. Just gothru the below given example.
Suppose you work in a manufacturing company and one of your main data warehouse schemas is the company's shipments. The product dimension is one of the most important dimensions in this dimensional schema. (See Figure 1.) A typical product dimension in a shipments schema would have several thousand detailed records, each representing a distinguishable product capable of being shipped. A good product dimension table would have at least 50 attributes describing the products, including hierarchical attributes such as brand and category, as well as nonhierarchical attributes such as flavor and package type. An important attribute provided by manufacturing operations is the SKU number assigned to the product. You should start by using the SKU number as the key to the product dimension table.
Suppose that manufacturing operations makes a slight change in packaging of SKU #38, and the packaging description changes from "glued box" to "pasted box." Along with this change, manufacturing operations decides not to change the SKU number of the product, or the bar code (UPC) that is printed on the box. If the data warehouse team decides to track this change, the best way to do this is to issue another product record, as if the pasted box version were a brand new
product. The only difference between the two product records is the packaging description. Even the SKU numbers are the same. The only way you can issue another record is if you generalize the key to the product dimension table to be something more than the SKU number. A simple technique is to use the SKU number plus two or three version digits. Thus the first instance of the product key
for a given SKU might be SKU# + 01. When, and if, another version is needed, it becomes SKU#+ 02, and so on. Notice that you should probably also park the SKU number in a separate dimension attribute (field) because you never want an application to be parsing the key to extract the underlying SKU number.

This method for tracking changing dimensions is very powerful because new dimension records automatically partition history in the fact table. The old version of the dimension record points to all history in the fact table prior to the change. The new version of the dimension record points to all history after the change. There is no need for a timestamp in the product table to record the change. In fact, a timestamp in the dimension record may be meaningless because the event of interest is the actual use of the new product type in a shipment. This is best recorded by a fact table record with the correct new product key.

Another advantage of this technique is that you can gracefully track as many changes to a dimensional item as you wish. Each change generates a new dimension record, and each record partitions history perfectly. The main drawbacks of the technique are the requirement to generalize the dimension key, and the growth of the dimension table itself.
Previous Topic: Difference between OLAP and DSS
Next Topic: Oralce 9i
Goto Forum:

Current Time: Sat May 18 10:51:18 CDT 2024