POWER BI - Modeling

Joining Tables

The Basics:

When modeling in Power BI we don't need to have just one big "flat" table with all our data. In Power BI, it is very easy to create relationships between tables, and Power BI will automatically try to detect relationships between tables if any exists. Though it's best to check the relationships or create them yourself so that you understand which columns are connecting two tables together and that you don't have any loops. See image below showing the adventure works sample database:

Simple example of joining tables together in Power BI.

This is under the "Relationships View" in Power BI. To get here click the icon the red arrow is pointing to. This model is referred to as a star-like data model. We have Our Sales Table right in the middle. The Sales table is connected to the Store table, the Customer table, the Product and the Date table. The reason it is called a star is because you have these branches (other tables) that come out from a central table (in this case the Sales table). It is good practice to have your data arranged this way to keep thing more manageable and organized.

To connect these tables together, we simply take the common columns in each table and join them together by selecting, dragging and dropping (similar to playing connect the dots).

The productKey column in the Product table is connected to the productKey column in the Sales Table. Similarly, the Sales table and Date table are being joined together by the orderdateKey and the datekey columns. Columns that join tables together in this way are generally referred to as foreign keys and primary keys respectively.

Notice that the tables are joined by a * on one side and a 1 on the other. This is called a many to one relationship. For instance, in the product table we have 2517 unique products and 2517 rows (all identified by a specific productkey). In the sales table, we have over 12.5 million rows. In the sales table, we also have a product Key column but it is not unique. (i.e we have multiple rows with the same productKey). This is what is meant by a many to one relationship. The product Key column in the Productstable is called the primary key, while the product key column in the Sales table is referred to as the foreign key.

An Example:
More realistic example of relationships between databases.

In this case the Sales table is still the center dataset, but other tables further join additional tables. Notice the three tables:

  • Product Category
  • Product Subcategory
  • Product

Together they form what is called a hierarchy. Each product category has several or sometimes many subcategories. In turn, each subcategory has many products. It is the Product table that is being joined with the Sales table since a customer will be purchasing specific products. Having data structured this way makes things much more organized, and Power BI can readily exploit data structured in this way with it's powerful data drill feature.

Up Next: Creating calculated columns and measures.

Dash-Intel is a Power BI and Tableau resource site for data visualization and building BI dashboards.

Data Analyst & Consultant

Copyright 2015-2023 Dash-Intel.com Terms