Ensuring the uniqueness of a business key in a data vault is absolutely paramount for a data vault modeler. Especially when using multiple data sources there are some tasks to be done to make the business key of an entity unique. Thats where composite keys come into play.
This blog post is a continuation of Designing a multi-source data vault model. It is a part of the blog series The power of model conversion in data warehouse automation.
What’s the problem?
If you are designing a multi-source data vault system you have to make sure your business keys are unique.
Have a look at the following picture. The business keys in the two source systems are both integers. The chances are very high, the same number belongs to another business object (employee in this case).
You need to add a separatory element to single out the data sources, whenever you are dealing with an entity with more than one data source. If this is not taken care of, you will not be able to combine the data in a correct way.
In our previous example, we created a hub with two source mappings, one for each data source. But it turns out the business key is not unique over both these two source systems!
The employee with ID 1 in the Northwind data source is called Nancy Divalio. The employee with ID 1 in the AdventureWorks data source is called Guy Gilbert. They are definitely not the same person!
Implement composite keys on hubs
Now let’s come up with a solution to this problem. Let’s extend our export views on both data sources and add the data source to them.
ALTER VIEW [dbo].[export_Employees_90] AS SELECT [EmployeeID] , CAST('Northwind' as nvarchar(50)) as EmployeeSrc ,[LastName] ,[FirstName] ,[Title] ,[TitleOfCourtesy] ,[BirthDate] ,[HireDate] ,[Address] ,[City] ,[Region] ,[PostalCode] ,[Country] ,[HomePhone] FROM [dbo].[Employees]
After changing the views discover the changes and merge them into the current version of the data source.
Now we can extend our data vault model to incorporate the newly created column EmployeeSrc. First add the attribute for the Northwind data source, then wire up EmployeeSrc in the second source mapping set.
Next assign the attribute type business key to the new attribute EmployeeSrc. This will lead to an extension of the hub hash key of the Employee hub.
This effect can be made visible on the stage table. Therefore let’s generate the data vault and generate the load and stage tables for this model. The stage tables contain an extended property displaying the source for the resulting hub hash key now.
Now let’s have a look at the resulting hub h_employees. We expect an additional column EmployeeSrc and we expect a different hub hash key, even if the business key is the same.
We have a composite key for the hub h_employees now, nice! But we are not done yet, as hubs in a data vault need to be linked to one another. This can be achieved through links. Let’s see now how to implement composite keys on links in the next chapter.
Implement composite keys on links
Firstly, we have to create a link entity. To keep it simple, we will directly use the employee entity from the Northwind data source. The original table contains a column ReportsTo. We will incorporate this in a view.
CREATE VIEW [dbo].[export_Emp2Mgr_90] AS SELECT [EmployeeID] , CAST('Northwind' as nvarchar(50)) as EmployeeSrc ,[ReportsTo] , CAST('Northwind' as nvarchar(50)) as EmployeeSrcMgr FROM [Northwind].[dbo].[Employees]
Next, let’s model the data vault. Assign the attribute type link business key to EmployeeId, EmployeeSrc, ReportsTo, EmployeeSrcMgr. Then wire up the columns EmployeeID and EmployeeSrc to the Employees entity. When prompted for the relationship, add the FK constraint name Emp.
Next wire up the columns ReportsTo and EmployeeSrcMgr to the Employees entity. When prompted for the relationship, add the FK constraint name Mgr.
To check the relationship, consult the relationship screen of the EmpToManager entity.
Looks good! Lastly let’s export this model to WhereScape RED and see it in effect.
That’s it. Our model allows to display the relationship between employees and their managers in the data vault. We created a multi-source data vault and we incorporated composite keys in hubs and links into our system in a breeze.