A concise comparison between two types of Inner Joins and where to use them.
Considering a real-world scenario, In any relational database, data is typically distributed in more than one table. In order to generate any useful data we often need to query data from multiple tables.
So, the first and most important thing to do while writing any SQL queries is to get a better idea of how the data is structured and stored. Once we understand how the data is stored among multiple tables, it will be much easier to write SQL queries to retrieve any kind of information from that data set.
In this tutorial, we’ll briefly explore how to combine data from two or more tables using Inner Join based on a relation between them and also have a better understanding of the different types of inner joins and when to use them effectively.
What is an Inner Join?
In a relational database, an Inner Join is simply the joining of multiple tables in which the resulting table will only contain those records that satisfy all the join conditions. We can perform Inner Join on any number of tables with a minimum of two.
There are 2 syntactical ways to perform Inner Join.
- Explicit Join
- Implicit Join
In this case, the
INNER JOIN / JOIN keyword is used to specify the tables to join, and the
ON keyword ****is used to specify the conditions for joining the tables. It is easier to understand and less prone to errors.
SELECT columns FROM table1 INNER JOIN table2 ON table1.column_name = table2.column_name;
In this case, all the joining tables are listed in the FROM clause (each table separated by a comma) and are later connected in the WHERE clause. It basically performs a CROSS JOIN based on some conditions. Even though it is difficult to understand and more prone to errors, it is the most common way to connect two tables.
SELECT columns FROM table1, table2 WHERE table1.column_name=table2.column_name;
In the join clause, it is not necessary that we always use the “
=" sign. It can be >=, <=, < or !=.
We will discuss more on this in the coming blogs.
Analysis of Explicit and Implicit Inner Joins
Let's jump into the analysis of the Explicit and ImplicitINNER JOINs with the help of some practical examples.
Firstly we need a sample database, for this, we are going to use Microsoft’s sample database named Northwind.
This database contains the sales data for a fictional store called Northwind Traders, which imports and exports specialty foods from around the world. You can get the sample database from here.
💡 Note: The SQL code mentioned on this entire blog is written with PostgreSQL dialect in mind.
Assuming that you have already dumped the Northwind Database to your PostgreSQL's local database, If not don't worry the below code shows how to dump the data to your existing empty database
$: psql -d your_database_name -f northwind_database.sql
Let’s list out the tables in the database. we can use the below query to get a more formatted output.
SELECT table_name FROM information_schema.tables WHERE table_schema='public' ORDER BY table_name;
💡 Alternatively, you can also use the command
\\dt to list the databases,
You should get an output similar to this.
Performing INNER JOIN with three tables
Let’s consider the following tables (categories, suppliers, products).
Table products ( Displaying only first 10 records )
Let's Consider a Sample Exercise and generate the output with both Explicit Inner join and with Implicit Inner Join and analyze.
We need to write an SQL query to display the id, name, category name, and supplier name of all the products that belong to the categories Beverages and Sea Food.
Performing Explicit Inner Join Here to achieve the above scenario, we need to join the three tables’ categories, products, and suppliers with two INNER JOINS.
SELECT p.product_id, p.product_name, c.category_name, s.company_name FROM products AS p INNER JOIN categories AS c ON p.category_id=c.category_id INNER JOIN suppliers AS s ON p.supplier_id=s.supplier_id WHERE c.category_name in ('Beverages', 'Seafood') ORDER BY p.product_id;
Here, the above SQL query does two specific actions listed below:
- joins ****the products table and categories table based on category_id
- and joins the products table and suppliers table based on supplier_id
The command returns only those records where there is a match between column values in both the join conditions and also the conditions given in the where clause.
Performing Implicit Inner Join
Here to achieve the above scenario, we need to list the three tables and add the join and filter conditions inside the WHERE clause.
SELECT p.product_id, p.product_name, c.category_name, s.company_name FROM products p, categories c, suppliers s WHERE p.category_id=c.category_id AND p.supplier_id=s.supplier_id AND c.category_name in ('Beverages', 'Seafood') ORDER BY p.product_id;
Here, the above SQL query does the following:
- All the join conditions and the conditions to filter products based on categories are given in ****the WHERE clause, where each condition is separated by an AND operator.
TheOutput is the same data that we saw in the explicit query output.
We have replaced the following explicit query segment
FROM products AS p INNER JOIN categories AS c ON p.category_id=c.category_id INNER JOIN suppliers AS s ON p.supplier_id=s.supplier_id WHERE c.category_name in ('Beverages', 'Seafood')
with the implicit query segment as given below.
FROM products p, categories c, suppliers s WHERE p.category_id=c.category_id AND p.supplier_id=s.supplier_id AND c.category_name in ('Beverages', 'Seafood')
From the above analysis of the sample exercise, we were able to see the different ways that we could query the data using the two types of Inner joins.
We were also able to understand that,
Explicit join queries are easier to understand and less prone to errors, they are easy to debug but their query size will be larger as compared to implicit join queries. They are more suitable for situations where we use complex SQL queries with very less frequent access and for those people who are not much familiar with the database tables and their relations.
On the other hand, the implicit join queries are difficult to understand and more prone to errors, they are difficult to debug but their query size will be smaller as compared to explicit join queries. They are more suitable for those who frequently access databases having a clear idea about the table relations and also for those who use SQL queries in their daily work. These people can write complex and concise implicit join queries very easily.
Thanks for reading!