SQL Interview Questions
SQL interviews test whether you can express data questions correctly and efficiently. Expect to write queries live, often involving joins, aggregation, and window functions.
What SQL interviews cover
Joins & filtering
INNER/LEFT/RIGHT/FULL joins, WHERE vs HAVING, and self-joins.
Aggregation
GROUP BY, aggregate functions, and grouping pitfalls (NULLs, duplicates).
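The NULL pitfalls mentioned above can be seen in a small sketch. It assumes Python's built-in sqlite3 module; the employees table, its columns, and the data are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table: two departments, one of them NULL, with some NULL bonuses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, department TEXT, bonus INTEGER);
    INSERT INTO employees (department, bonus) VALUES
        ('eng', 100), ('eng', NULL), (NULL, 50), (NULL, NULL);
""")

# COUNT(*) counts rows, COUNT(bonus) skips NULL bonuses, and AVG ignores NULLs.
# GROUP BY places all NULL departments into a single group.
rows = conn.execute("""
    SELECT department,
           COUNT(*)     AS n_rows,
           COUNT(bonus) AS n_bonus,
           AVG(bonus)   AS avg_bonus
    FROM employees
    GROUP BY department
    ORDER BY department
""").fetchall()

for row in rows:
    print(row)
```

Each group has two rows but only one non-NULL bonus, so COUNT(*) and COUNT(bonus) disagree, and the average is taken over the non-NULL values only.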
Window functions
ROW_NUMBER, RANK, LAG/LEAD, running totals, and per-group top-N.
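As a rough illustration of LAG, the sketch below computes a day-over-day change. It assumes Python's built-in sqlite3 module with a SQLite build that supports window functions (3.25 or later); the daily_revenue table and its values are made up for the example.

```python
import sqlite3

# Hypothetical table with one revenue row per day.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_revenue (date TEXT PRIMARY KEY, revenue INTEGER);
    INSERT INTO daily_revenue VALUES
        ('2024-01-01', 100), ('2024-01-02', 150), ('2024-01-03', 120);
""")

# LAG(revenue) reads the previous row in date order; the first row has no
# previous row, so its delta is NULL (None in Python).
changes = conn.execute("""
    SELECT date,
           revenue,
           revenue - LAG(revenue) OVER (ORDER BY date) AS delta
    FROM daily_revenue
    ORDER BY date
""").fetchall()

for row in changes:
    print(row)
```

LEAD works the same way but looks at the following row instead of the previous one.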
Performance
Indexes, query plans, sargability, and why a query is slow.
Sample SQL interview questions
- Find the second-highest salary in a table.
What a strong answer covers
- Using MAX with subquery
- LIMIT/OFFSET with DISTINCT
- Window function DENSE_RANK() for ties
- Common pitfall: off-by-one and ignoring duplicates
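Sample answer
To find the second-highest salary, one common method is a subquery: SELECT MAX(salary) FROM employee WHERE salary < (SELECT MAX(salary) FROM employee). The strict < excludes every row tied for the top salary, so duplicates of the maximum are handled, and the query returns NULL when no second-highest value exists. Another approach is LIMIT/OFFSET: SELECT DISTINCT salary FROM employee ORDER BY salary DESC LIMIT 1 OFFSET 1. DISTINCT is required so that repeated copies of the top salary collapse to one row before the offset is applied. Unlike the MAX version, this returns no rows when there is no second value, and the syntax varies by dialect (SQL Server uses OFFSET ... FETCH). A third option uses a window function: SELECT DISTINCT salary FROM (SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM employee) ranked WHERE rnk = 2. DENSE_RANK() leaves no gaps after ties, so the second-highest distinct salary is always rank 2 even when several employees share the top salary; the outer DISTINCT is needed because several employees may share rank 2 as well. This form also generalizes to the Nth-highest salary. A common pitfall is using LIMIT 1 OFFSET 1 without DISTINCT, which returns the top salary again when it is tied.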
Reference solution
```sql
-- Using subquery (returns NULL if there is no second-highest salary)
SELECT MAX(salary) AS second_highest
FROM employee
WHERE salary < (SELECT MAX(salary) FROM employee);

-- Using LIMIT/OFFSET (MySQL, PostgreSQL, SQLite)
SELECT DISTINCT salary
FROM employee
ORDER BY salary DESC
LIMIT 1 OFFSET 1;

-- Using DENSE_RANK(); DISTINCT collapses employees who share rank 2
SELECT DISTINCT salary
FROM (
    SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk
    FROM employee
) ranked
WHERE rnk = 2;
```
- Get the top N rows per group (e.g. top 3 products per category).
What a strong answer covers
- Window function ROW_NUMBER() or RANK()
- Partition by group, order by metric
- Filter with CTE or subquery
- Alternative: correlated subquery that counts higher-ranked rows
- Common pitfall: handling ties incorrectly
Sample answer
A common approach is a window function such as ROW_NUMBER(). For example, to get the top 3 products per category by sales: WITH ranked AS (SELECT product_id, category, sales, ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn FROM products) SELECT * FROM ranked WHERE rn <= 3. ROW_NUMBER() assigns a unique number to every row, so each category returns at most 3 rows, but rows with equal sales are numbered in an arbitrary order unless you add a tiebreaker such as product_id to the ORDER BY. To include ties, use RANK() or DENSE_RANK(), accepting that a category may then return more than 3 rows. An alternative is a correlated subquery: SELECT * FROM products p WHERE (SELECT COUNT(*) FROM products p2 WHERE p2.category = p.category AND p2.sales > p.sales) < 3. This keeps ties, much like RANK(), and is usually slower on large tables because the subquery is evaluated for each outer row. Common pitfalls are forgetting PARTITION BY, which ranks across the whole table, and overlooking NULL ordering, since whether NULLs sort first or last depends on the database.
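The tie behavior described in the answer can be checked with a small sketch. It assumes Python's built-in sqlite3 module with window-function support (SQLite 3.25 or later); the helper top_two and the sample rows are hypothetical, and N is 2 rather than 3 to keep the data small.

```python
import sqlite3

# Hypothetical data: products 2 and 3 tie on sales within one category.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT, sales INTEGER);
    INSERT INTO products VALUES
        (1, 'toys', 90), (2, 'toys', 80), (3, 'toys', 80), (4, 'toys', 70);
""")

def top_two(rank_expr):
    """Return product_ids whose rank, computed by rank_expr, is at most 2.

    rank_expr is interpolated into the SQL text, so pass only hard-coded
    expressions, never user input.
    """
    query = f"""
        WITH ranked AS (
            SELECT product_id, {rank_expr} AS rnk
            FROM products
        )
        SELECT product_id FROM ranked WHERE rnk <= 2 ORDER BY product_id
    """
    return [row[0] for row in conn.execute(query)]

# ROW_NUMBER with a product_id tiebreaker: exactly two rows, chosen deterministically.
strict = top_two("ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC, product_id)")

# RANK without a tiebreaker: both tied products share rank 2, so three rows come back.
with_ties = top_two("RANK() OVER (PARTITION BY category ORDER BY sales DESC)")

print(strict, with_ties)
```

Which behavior is correct depends on the question being asked, so it is worth stating the choice out loud in an interview.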
Reference solution
```sql
-- Using ROW_NUMBER() (recommended)
WITH ranked AS (
    SELECT product_id, category, sales,
           ROW_NUMBER() OVER (PARTITION BY category ORDER BY sales DESC) AS rn
    FROM products
)
SELECT *
FROM ranked
WHERE rn <= 3;

-- Using correlated subquery (alternative)
SELECT *
FROM products p
WHERE (
    SELECT COUNT(*)
    FROM products p2
    WHERE p2.category = p.category
      AND p2.sales > p.sales
) < 3;
```
- Explain the difference between WHERE and HAVING.
What a strong answer covers
- WHERE filters rows before aggregation
- HAVING filters after aggregation
- WHERE cannot use aggregate functions
- HAVING usually accompanies GROUP BY; without it, the whole result is treated as one group
- Common pitfall: using HAVING without GROUP BY
Sample answer
The WHERE clause filters individual rows before any grouping or aggregation occurs, so it cannot contain aggregate functions such as COUNT() or SUM(). The HAVING clause filters groups after aggregation, typically together with GROUP BY, and can use aggregate functions. Some databases (MySQL, for example) also let HAVING reference column aliases from the SELECT list; PostgreSQL and SQL Server do not. For example, you can write 'SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 5'. Putting that condition in WHERE is an error because COUNT(*) does not exist until the rows have been grouped. HAVING without GROUP BY is accepted by many databases, but it treats the whole result as a single group, which is rarely what was intended. Another pitfall is mixing conditions: put non-aggregate conditions in WHERE so that rows are discarded before grouping.
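The ordering of WHERE and HAVING can be illustrated with a minimal sketch. It assumes Python's built-in sqlite3 module; the employees table, the active column, and the threshold of 2 are hypothetical choices for the example.

```python
import sqlite3

# Hypothetical data: 'eng' has two active employees, 'ops' has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, department TEXT, active INTEGER);
    INSERT INTO employees (department, active) VALUES
        ('eng', 1), ('eng', 1), ('eng', 0),
        ('ops', 1), ('ops', 0), ('ops', 0);
""")

# WHERE removes inactive rows first; HAVING then keeps only the groups
# that still contain at least two rows.
result = conn.execute("""
    SELECT department, COUNT(*) AS active_count
    FROM employees
    WHERE active = 1
    GROUP BY department
    HAVING COUNT(*) >= 2
""").fetchall()

# An aggregate inside WHERE is rejected, because no groups exist at that stage.
try:
    conn.execute(
        "SELECT department FROM employees WHERE COUNT(*) >= 2 GROUP BY department"
    )
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

print(result)
print(where_error)
```

Both departments have three rows in total, so moving the active filter into HAVING would not be equivalent: the row-level condition has to run before grouping.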
- Write a query to find duplicate rows and keep only one.
What a strong answer covers
- Use GROUP BY and HAVING COUNT(*) > 1
- Keep one row using MIN(id) or ROW_NUMBER()
- DELETE using a subquery
- Common pitfall: not defining duplicate criteria properly
Sample answer
To find duplicate rows, group by the columns that define uniqueness and use HAVING COUNT(*) > 1. For example, to find duplicate emails: SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1. To keep only one occurrence, retain the row with the smallest id (or another unique column) and delete the rest with a subquery (a self-join also works): DELETE FROM users WHERE id NOT IN (SELECT MIN(id) FROM users GROUP BY email). Two caveats apply: GROUP BY puts all NULL emails into one group, so this also deletes all but one row whose email is NULL, and MySQL rejects a subquery that reads the table being deleted from unless it is wrapped in a derived table. Alternatively, use ROW_NUMBER(): WITH cte AS (SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn FROM users) DELETE FROM users WHERE id IN (SELECT id FROM cte WHERE rn > 1). A common pitfall is assuming a single column defines uniqueness when multiple columns may be needed.
Reference solution
```sql
-- Find duplicates
SELECT email, COUNT(*) AS cnt
FROM users
GROUP BY email
HAVING COUNT(*) > 1;

-- Keep one (lowest id)
DELETE FROM users
WHERE id NOT IN (SELECT MIN(id) FROM users GROUP BY email);

-- Alternative using ROW_NUMBER()
WITH cte AS (
    SELECT id, ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS rn
    FROM users
)
DELETE FROM users
WHERE id IN (SELECT id FROM cte WHERE rn > 1);
```
- Compute a running total of daily revenue.
What a strong answer covers
- Window function SUM() OVER (ORDER BY date)
- Correct ordering and partitioning if needed
- Alternative: correlated subquery or self-join
- Common pitfall: incorrect ordering or missing partition
Sample answer
A running total is usually computed with the SUM window function and an ORDER BY clause. For daily revenue, assuming a table with one row per day, you can write: SELECT date, revenue, SUM(revenue) OVER (ORDER BY date) AS running_total FROM daily_revenue. If there are multiple entries per day, aggregate by date first: SELECT date, SUM(revenue) AS daily_rev, SUM(SUM(revenue)) OVER (ORDER BY date) AS running_total FROM revenue GROUP BY date ORDER BY date. A slower alternative is a correlated subquery: SELECT date, revenue, (SELECT SUM(revenue) FROM daily_revenue d2 WHERE d2.date <= d.date) AS running_total FROM daily_revenue d. Without a supporting index this does O(n^2) work and should be avoided for large data. A common pitfall is omitting ORDER BY in the window function: the frame then covers the whole partition, so every row shows the grand total instead of a running total. Another is ordering by a column with ties: with the default frame, rows that share the same date all receive the same cumulative value.
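How the window frame affects a running total can be seen in a short sketch. It assumes Python's built-in sqlite3 module with window-function support (SQLite 3.25 or later); the revenue table with an id column and two entries on the same date is hypothetical.

```python
import sqlite3

# Hypothetical data: two entries share the date 2024-01-02.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (id INTEGER PRIMARY KEY, date TEXT, revenue INTEGER);
    INSERT INTO revenue (date, revenue) VALUES
        ('2024-01-01', 10), ('2024-01-02', 20), ('2024-01-02', 30), ('2024-01-03', 40);
""")

# No ORDER BY in the window: the frame is the whole table, so every row
# gets the grand total.
no_order = conn.execute("""
    SELECT id, SUM(revenue) OVER () AS running_total
    FROM revenue ORDER BY id
""").fetchall()

# ORDER BY date with the default frame: rows with the same date are peers
# and receive the same cumulative value.
default_frame = conn.execute("""
    SELECT id, SUM(revenue) OVER (ORDER BY date) AS running_total
    FROM revenue ORDER BY id
""").fetchall()

# Explicit ROWS frame plus a tiebreaker: a strict row-by-row running total.
row_frame = conn.execute("""
    SELECT id, SUM(revenue) OVER (
        ORDER BY date, id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
    FROM revenue ORDER BY id
""").fetchall()

print(no_order)
print(default_frame)
print(row_frame)
```

Aggregating to one row per date first, as in the answer above, avoids the question entirely; the explicit ROWS frame is the option when a per-entry running total is wanted.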
Reference solution
```sql
-- Assuming one row per day
SELECT date, revenue,
       SUM(revenue) OVER (ORDER BY date) AS running_total
FROM daily_revenue;

-- With multiple rows per day
WITH daily AS (
    SELECT date, SUM(revenue) AS daily_rev
    FROM revenue
    GROUP BY date
)
SELECT date, daily_rev,
       SUM(daily_rev) OVER (ORDER BY date) AS running_total
FROM daily
ORDER BY date;

-- Correlated subquery (less efficient)
SELECT date, revenue,
       (SELECT SUM(revenue)
        FROM daily_revenue d2
        WHERE d2.date <= d.date) AS running_total
FROM daily_revenue d;
```
- How would you speed up a slow query? What does the query plan tell you?
What a strong answer covers
- Analyze query plan with EXPLAIN
- Look for sequential scans, high cost, missing indexes
- Add indexes on join/where/order by columns
- Rewrite query, use covering indexes
- Common pitfall: adding indexes without understanding workload
Sample answer
To speed up a slow query, start by examining the execution plan with EXPLAIN (or EXPLAIN ANALYZE where available, which actually runs the query and reports real timings and row counts). The plan shows how the database executes the query: which indexes are used, join types, estimated row counts, and costs. Look for sequential scans on large tables, nested loop joins over large inputs where a hash join would be cheaper, and operations with high cost. Typical fixes include adding indexes on columns used in WHERE, JOIN, and ORDER BY clauses, and keeping predicates sargable so that those indexes can be used. Indexes add write overhead, so add them deliberately. For complex queries, consider rewriting them to reduce subqueries or use window functions, or breaking them into temporary tables. Also ensure statistics are up-to-date. A common pitfall is indexing every column without considering selectivity; a well-chosen composite index often helps more. The plan also reveals when the optimizer misestimates cardinality, which leads to poor join orders; you might need to refresh statistics or, as a last resort, use query hints.
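Reading a plan before and after adding an index can be sketched as follows. This assumes Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN, whose output format differs from PostgreSQL's or MySQL's EXPLAIN; the orders table, the index name idx_orders_customer, and the helper plan are hypothetical.

```python
import sqlite3

# Hypothetical orders table with 1000 rows spread over 100 customers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM orders WHERE customer_id = 42"

# Without an index, SQLite has to scan the whole table.
before = plan(query)

# With an index on the filtered column, the plan becomes an index search.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)

# Wrapping the column in an expression makes the predicate non-sargable,
# so the index is not used and the plan falls back to a scan.
non_sargable = plan("SELECT * FROM orders WHERE customer_id + 0 = 42")

print(before)
print(after)
print(non_sargable)
```

The exact wording of each plan line varies between SQLite versions, but the shift from a scan to a search using the index is the signal to look for, and other databases expose the same idea through their own EXPLAIN output.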
How to prepare
- Practice writing queries by hand — many interviews use a shared editor with no autocomplete.
- Master window functions; they unlock a huge class of 'top-N per group' and running-total problems.
- Think about NULL behavior and duplicates explicitly — they're common trick areas.
- Be ready to read a query plan and explain why an index helps.
Frequently asked questions
Which SQL dialect should I study?
Standard SQL covers most interviews. Know that window functions and CTEs are widely supported; some functions differ between PostgreSQL, MySQL, and SQL Server.
Do I need to optimize queries in a SQL interview?
For data/backend roles, yes — be ready to discuss indexes, query plans, and why a query is slow.
Are window functions commonly asked?
Very. ROW_NUMBER/RANK and running totals appear constantly, especially for data engineering and analytics roles.
How can I practice SQL interviews?
Solve query problems against a real schema and explain your approach. Offersly can generate SQL-focused questions and score your reasoning.