Composite Business Keys in a Data Vault System

Ensuring the uniqueness of a business key in a data vault is absolutely paramount for a data vault modeler. Especially when using multiple data sources there are some tasks to be done to make the business key of an entity unique. Thats where composite keys come into play.

This blog post is a continuation of Designing a multi-source data vault model. It is a part of the blog series The power of model conversion in data warehouse automation.

The solution presented in this post was realized with WhereScape 3D and WhereScape RED on a Microsoft SQL server database.

What’s the problem?

If you are designing a multi-source data vault system you have to make sure your business keys are unique.

Have a look at the following picture. The business keys in the two source systems are both integers. The chances are very high, the same number belongs to another business object (employee in this case).

The two data sources show different persons for the same id.
Screenshot: the employee entities in two data sources

You need to add a separatory element to single out the data sources, whenever you are dealing with an entity with more than one data source. If this is not taken care of, you will not be able to combine the data in a correct way.

In our previous example, we created a hub with two source mappings, one for each data source. But it turns out the business key is not unique over both these two source systems!

The employee with ID 1 in the Northwind data source is called Nancy Divalio. The employee with ID 1 in the AdventureWorks data source is called Guy Gilbert. They are definitely not the same person!

Implement composite keys on hubs

Now let’s come up with a solution to this problem. Let’s extend our export views on both data sources and add the data source to them.

ALTER VIEW [dbo].[export_Employees_90]
AS
SELECT
      [EmployeeID]
      , CAST('Northwind' as nvarchar(50)) as EmployeeSrc
      ,[LastName]
      ,[FirstName]
      ,[Title]
      ,[TitleOfCourtesy]
      ,[BirthDate]
      ,[HireDate]
      ,[Address]
      ,[City]
      ,[Region]
      ,[PostalCode]
      ,[Country]
      ,[HomePhone]
  FROM [dbo].[Employees]

After changing the views discover the changes and merge them into the current version of the data source.

Discover the data source and merge the changes into the current version of the data source.
Screenshot: discover changes and merge into current source

Now we can extend our data vault model to incorporate the newly created column EmployeeSrc. First add the attribute for the Northwind data source, then wire up EmployeeSrc in the second source mapping set.

add the new attribute EmployeeSrc to the Employees entity.
Screenshot: add attribute

Next assign the attribute type business key to the new attribute EmployeeSrc. This will lead to an extension of the hub hash key of the Employee hub.

Screenshot: assign business key to EmployeeSrc

This effect can be made visible on the stage table. Therefore let’s generate the data vault and generate the load and stage tables for this model. The stage tables contain an extended property displaying the source for the resulting hub hash key now.

Screenshot: Hub hash key on stage table

Now let’s have a look at the resulting hub h_employees. We expect an additional column EmployeeSrc and we expect a different hub hash key, even if the business key is the same.

Screenshot: resulting hub and its hub hash key

We have a composite key for the hub h_employees now, nice! But we are not done yet, as hubs in a data vault need to be linked to one another. This can be achieved through links. Let’s see now how to implement composite keys on links in the next chapter.

Implement composite keys on links

Firstly, we have to create a link entity. To keep it simple, we will directly use the employee entity from the Northwind data source. The original table contains a column ReportsTo. We will incorporate this in a view.

CREATE VIEW [dbo].[export_Emp2Mgr_90]
AS
SELECT [EmployeeID]
	  , CAST('Northwind' as nvarchar(50)) as EmployeeSrc
      ,[ReportsTo]
	  , CAST('Northwind' as nvarchar(50)) as EmployeeSrcMgr
  FROM [Northwind].[dbo].[Employees]

Next, let’s model the data vault. Assign the attribute type link business key to EmployeeId, EmployeeSrc, ReportsTo, EmployeeSrcMgr. Then wire up the columns EmployeeID and EmployeeSrc to the Employees entity. When prompted for the relationship, add the FK constraint name Emp.
Next wire up the columns ReportsTo and EmployeeSrcMgr to the Employees entity. When prompted for the relationship, add the FK constraint name Mgr.

Screenshot: assign a FK constraint name to the relationship

To check the relationship, consult the relationship screen of the EmpToManager entity.

Screenshot: check the realationships
Screenshot: Resulting model

Looks good! Lastly let’s export this model to WhereScape RED and see it in effect.

Screenshot: Result after loading the data

That’s it. Our model allows to display the relationship between employees and their managers in the data vault. We created a multi-source data vault and we incorporated composite keys in hubs and links into our system in a breeze.

Schreibe einen Kommentar

Deine E-Mail-Adresse wird nicht veröffentlicht. Erforderliche Felder sind mit * markiert