In this blog post we will dive into implementing a data vault with more than one source system with the help of WhereScape 3D and WhereScape RED. This blog post is a part of the series: The power of model conversion in data warehouse automation.
If you start to become a more and more proficient data vault modeler you will face the challenge of designing a data vault with more than one data source.
Primary data source
For this example we will use the Northwind database and the AdventureWorks database as they both contain an employee entity which will act as our example. Let’s first design the employee entity with the Northwind data source and take it from there.
Make sure to name the source mapping set according to the data source (Northwind).
Adding the second data source to the multi-source data vault model
First, add an additional source mapping set with the name of the second data source (AdventureWorks).
Now add the source entity from the AdventureWorks data source to this second source mapping.
All elements with a similar name in the target entity and the source entity will be mapped automatically to the existing model. Nevertheless make sure your target entity matches the data type, pay special attention to length and precision and expand the data type of your target entity where needed.
Now map all the attributes which do not match. In this example, the business key of the employee table in the AdventureWorks database is called EmployeeKey. In the Northwind database the business key of the employee table is called EmployeeID. Don’t forget to align the data type length/precision.
Now let’s add the attributes which only occur in the AdventureWorks data source. Therefore drag MiddleName, LoginID and EmailAddress to the „Add attribute“ placeholder. This action will add these attributes to our target entity.
What does the result look like?
After making sure to assign a attribute type to the added attributes, let’s have a look at what we have accomlished so far. Therefore generate the data vault and display the resulting diagram. The result should look somewhat similar to the following screenshot.
What do we see here? The Hub h_employees contains two source columns for the Employeeid business key. For every source mapping set we added, we get a separate satellite from the respective source. The number of attributes are different for each source. The next step will be to export this model to WhereScape RED to see everything in action.
In WhereScape RED the Hub h_employees has so-called source mappings assigned to it, which will collect data from both data sources. Awesome!
But what about composite keys? Have a look at the sequel of this post: Composite business keys in a data vault system.