Mastering SQL Subqueries

Anton Haugen
3 min readDec 21, 2020

--

As part of my post-graduation effort, I have been reviewing key concepts programming languages relevant to Data Science. Lately, I’ve been brushing up on SQL by taking Stephen Grinder’s SQL and PostgreSQL: The Complete Developer’s Guide on Udemy to integrate SQL queries and database management in the Flask web app I made called urStory, a recommender for public domain short stories based around word vectorization.

One key concept to understanding relational databases is how tables can relate to one another. There are three primary relationships: One-To-Many, Many-To-One, and Many-to-Many. Take a photo-sharing app like Instagram; one user can have many photographs, many comments can belong to one photograph, and many followers can follow many accounts who in turn can follow many other accounts.

However, the thing that made me realize just how powerful SQL queries can be were subqueries. A subquery is a query nested inside of another query, allowing one to query the results of another query or querying the join of two other queries. Querying the results of another query are so flexible that they can also be helpful if one forgets a more optimal query function, but because of this flexibility, subqueries can feel somewhat mindboggling, so getting a handle of their rules can help you in your SQL journey.

To start off, a subquery can occur in one of three places in your query, in the SELECT portion, the FROM and JOIN portion, or the WHERE portion. However, each of these portions has a distinct set of rules for the shape of data the subquery can return.

A subquery in the SELECT portion must return a single value, like in the query SELECT COUNT(*) FROM table. A subquery in the FROM and JOIN portion can return a source of rows while a subquery in the WHERE portion can return up to one column of data with many rows, like in the query SELECT id FROM table. Also, the syntax for an embedded subquery requires parentheses around the subquery.

One of the coolest things you can do with subqueries is make your query have something resembling the kind of For-loops typical of Python or JavaScript! I will explain using Stephen Grinder’s example. Imagine you have a database for an ecommerce website, and you are trying to find the most expensive product in each department. To return this in a single query, you can make use of the looping functionality of subqueries.

First you would want to query the products table to return the name, department, and price:

SELECT name, department, price
FROM products AS p1;

You’ll notice we created an alias for products. This is because we will be referring to the products table twice in order to find the product where the price is equal to the max price of each department.

Returning to our query we’re going to add a WHERE clause where we’ll have our subquery.

SELECT name, department, price
FROM products AS p1
WHERE p1.price = ( [subquery]);

Remember, our subquery in the WHERE portion can be up to a column of values. However because of the equal operator, we want this subquery to return a single value. So let’s draft our inner query, where products will be aliased as p2:

SELECT MAX(price)
FROM products AS p2

Notice that the subquery does not have a semi-colon afterwards, as semi-colons are only for the main query.

Now, we will be adding the WHERE clause that will allow our query to behave like a FOR-loop.

SELECT MAX(price)
FROM products AS p2
WHERE p2.department = p1.department

The inner query now refers to a value found in the outer query, which allows it to iterate over each value in the table to match the department of the p1 table, which I think is really cool. The finalized query to find the max price of each department using sub-queries would look like this:

SELECT name, department, price
FROM products AS p1
WHERE p1.price = (
SELECT MAX(price)
FROM products AS p2
WHERE p2.department = p1.department
);

I hope this was helpful to you, and I highly recommend checking out Stephen Grinder’s course on Udemy for a very thorough explanation of many SQL concepts. And stay tuned here for more about SQL!

--

--