Technical definition of Role Playing Dimensions, my own and not from any purists, “If a dimension is joined to a particular fact table in more than one ways, by nature of design it is a Role Playing dimension”. Or defining in an OOPs way, if a relationship of dimension with fact table or a measure group is polymorphic in nature then we can call that dimension a “Role Playing Dimension”.
And as per “Ralph Kimball” team, Role Playing Dimensions “The situation where a single physical dimension table appears several times in a single fact table. Each of the dimension roles is represented as a separate logical table with unique column names through views.” from “The Data Warehouse Toolkit, The Complete Guide to Dimensional Modeling, 2nd Edition”.
An easily understood analogy is same person acting as Son, Brother, Husband, Father. Underlying person is same but interactions with different people vary. Similarly a Role Playing Dimension, underlying Dimension is same but would have different logical representations when joined with fact tables. There are many use cases for Role Playing Dimensions. SQL Server Analysis Services supports Role Playing Dimensions out of Box but with some limitations which we would detail at end of post.
Date Dimension as a Role Playing
In any transaction there would be multiple days, take for instance Retail outlets like EBay or Amazon. Some of dates could be
Order Entry Date, Accept Order Date, Procure Date, Assemble Package Date, Ship Date, Payment Date blah blah… But underlying all of these are dates. One thing that needs to be ensured is granularity of Date Dimension. If all dates dimensions are of same granularity then one Single Date Dimension could be used to answer all dates. If dates are at different granularity then it would warrant a different Date Dimension. An couple of points to note here is:
1. If single underlying table is joined to fact table, SQL assumes there are multiple joins between tables. But in business it would be very rare that an Oder Date, Accept Order Date, Procurement Date, Ship Date etc are all one and same date. Implying each of these join to different Date Keys. And in such cases assuming that all of relations between Fact and Dimensions are equal would lead to data inconsistencies (Data may not come at all)
2. If in reporting, each date column should have a different name for analytical purposes. If a generic name is used like “Date” it leads to ambiguity in terms of what date such a column would reflect.
For administrative purposes it is recommended to have core table and for each of logical relationships it is recommended to create a view with different column names. Each view would then be treated as independent Date / Time dimension even though underlying Date / Time dimensions are only Single tables.
Date alone is not a single example of Date Dimension but Role Playing dimensions could be any dimensions where relationship between dimension and facts are multi roles. Or in modeling terms if an actor (UML terminology) acts in multiple roles in a transaction then it becomes a potential for Role Playing Dimension.
Trading Industries (Where a Trader can be either a buyer or both not in a single transaction but across transactions). In these cases for administrative purposes it would not makes sense to have two trader tables.
Different administrative problems could be
If two trader tables are created (Seller and Buyer), data redundancy and data needs to be updated in both tables (Insert, Update, Delete nightmares),
If dimensions are large database sizes would increase due to clone of tables.
Any backup jobs, maintenance jobs need to be performed for both both tables.
Different Development problems could be:
If two trader tables are created (Seller and Buyer), multiple ETL packages need to be developed. Development effort is unnecessarily increased though increase may not be linear.
In Reporting Layer, as columns names are same for both roles, a mechanism should be put in place to distinct columns of both roles.
To avoid such problems recommendation is to create a single table and create logical views on top table to be introduced to Data Source View of SSAS or Universe of competing platforms (Cognos, BO etc).
In Telecom industry again we can see Role Playing Dimension in terms of Called and Caller. Caller calls Called for one call and role reversal may happen for next call. Again both Called and Caller are created in one core dimension with 2 distinct Views created on top for Caller and Called with column names different. When clubbing two such roles and creating a single dimension, attribute in core dimension should be a union of attributes for each individual role. And view created should only select columns attributed to each role. If columns are same across both roles, ensure proper naming conventions are followed to distinctly identify each role. Till now we have seen cases where a Dimension plays different role with respect to Fact.
Roles with Dimensions in Snowflake:
In some situations a specific dimension could be linked to multiple dimension across data mart. For example take Address or more generically Location. This though may not directly linked to a fact but is used with multiple unrelated Dimension. Such dimensions are Location dimension may be used with Customer, Employee. Customer and Employee Dimensions are snow flaked to include Location Dimension. A copy of location dimension could be created (Views) and each independent dimensions should be linked with location dimension. Below is an example of schema with Location dimension linked to multiple independent dimensions.
Analysis Services is “Role Playing Dimension Aware”, ie. a single Time Dimension can be joined to same fact (measure group) but different fact attributes. And when browsing cube one can see single Time Dimension as different dimensions based on number relations between Date Dimension and Fact Table. Say for example our cube has structure as below
And if cube is browsed we would see three dates (highlighted) in blue. Though there was only 1 DimTime, it has assumed three polymorphic forms “Order Date”, “Ship Date”, “Delivery Date”, though underlying Date Dimension is Single. This transformation is done by SQL Server Analysis Services itself.
“Order Date”, “Ship Date”, “Delivery Date” names are customizable and in below screenshot it is detailed how renaming dimensions (logical though physically they are single dimensions) are done. In Cube structure tab, go to cube dimension and click on Properties. For example click on Delivery Date and go to properties. There in properties screen one can define polymorphic name of time Dimension by modifying Name Property.
So out of box analysis services is aware of Role Playing dimension but there are some gotchas that one need to be aware of. That would be covered in a different post.
Is that not so simple and obvious, well, seemingly I had a chance to look at and in process learn that in some places Multiple dimensions (like time) are created which may not always be best practice.
Again I am not saying go ahead and always create only 1 dimension,
If there is a business case, to create many dimensions of similar in nature do so but I feel atleast there should be a business case for it.
Typically if there dimensions with multiple granularity then we need to have in different dimension (similar to snowflakes) and different fact.
Until next time.
-Consult Guru Group