Table of Contents
ToggleIntroduction
In today’s data-driven world, SQL (Structured Query Language) is an essential skill for data analysts. Whether you’re preparing for your first job interview or looking to advance your career, mastering SQL can significantly enhance your ability to extract, manipulate, and analyze data. This blog has a comprehensive set of SQL interview questions tailored specifically for data analysts. These questions will help you demonstrate your SQL proficiency and problem-solving abilities. Let’s dive in and explore the key SQL concepts and techniques that every data analyst should know!
Don't miss out on your chance to work with the best
Apply for top global job opportunities today!
Most Common SQL Interview Questions for Data Analysts
1. What is a Database Management System (DBMS)?
A Database Management System (DBMS) is software that facilitates the management, storage, retrieval, and organization of data in a structured format. It provides tools for performing CRUD (Create, Read, Update, Delete) operations on a database, acting as a bridge between users and the database, thus simplifying data handling without needing to understand the complexities of data storage and retrieval.
Read in detail: 50 Basic MYSQL Interview Questions
2. What are the fundamental SQL CRUD operations?
The fundamental SQL CRUD operations include CREATE, READ (SELECT), UPDATE, and DELETE. These operations, part of Data Manipulation Language (DML) statements, allow for inserting new records, retrieving data, modifying existing records, and removing records from a database table. Here are basic examples of each operation:
- CREATE: Used to insert new data into a table.
INSERT INTO employees (first_name, last_name, salary)
VALUES ('Pawan', 'Gunjan', 50000);
- READ: Used to retrieve data from a table.
SELECT * FROM employees;
UPDATE: Used to modify existing records.
UPDATE employees
SET salary = 55000
WHERE last_name = 'Gunjan';
- DELETE: Used to remove records from a table.
DELETE FROM employees
WHERE first_name = 'Pawan';
3. How do you insert new records into a table using SQL?
To insert new records into a table, the INSERT INTO statement is used. This statement adds new rows to a table.
Syntax:
INSERT INTO table_name (column1, column2, column3, ...)
VALUES (value1, value2, value3, ...);
Example:
INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Shivang', 'Noida', 'India');
4. How can you filter records in SQL using the WHERE clause?
The WHERE clause in SQL is used to filter records based on specified conditions. It is included in the SELECTstatement to specify the criteria that records must meet to be included in the result set.
Syntax:
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Example:
SELECT * FROM employees
WHERE job_title = 'Developer';
5. How do you sort records in SQL?
Records can be sorted in ascending or descending order using the ORDER BY clause in the SELECTstatement. This clause allows specifying one or more columns by which to sort the result set.
Syntax for ascending order:
SELECT column1, column2, ...
FROM table_name
ORDER BY column_to_sort1 ASC, column_to_sort2 ASC, ...;
Example:
SELECT * FROM Customers
ORDER BY Country ASC;
Syntax for descending order:
SELECT column1, column2, ...
FROM table_name
ORDER BY column_to_sort1 DESC, column_to_sort2 DESC, ...;
Example:
SELECT * FROM Customers
ORDER BY Country DESC;
Read in detail: Introduction to SQL
6. What is the purpose of the GROUP BY clause in SQL?
SELECT column1, function_name(column2)
FROM table_name
GROUP BY column_name(s);
Example:
SELECT AGE, COUNT(Name)
FROM CUSTOMERS
GROUP BY AGE;
7. How do you perform aggregate functions like SUM, COUNT, AVG, and MAX/MIN in SQL?
Aggregate functions in SQL perform calculations on a set of values and return a single value. Common aggregate functions include:
SUM: Calculates the total sum of a numeric column.
SELECT SUM(Cost)
FROM Products;
COUNT: Counts the number of rows or non-null values.
SELECT COUNT(*)
FROM Orders;
AVG: Calculates the average value of a numeric column.
SELECT AVG(Price)
FROM Products;
MAX: Returns the maximum value in a column.
SELECT MAX(Price)
FROM Orders;
MIN: Returns the minimum value in a column.
SELECT MIN(Price)
FROM Products;
8. What is an SQL join operation? Explain different types of joins (INNER, LEFT, RIGHT, FULL).
An SQL join operation combines rows from two or more tables based on a related column. The main types of joins are:
INNER JOIN: Selects rows with matching values in both tables.
SELECT customers.customer_id, orders.order_id
FROM customers
INNER JOIN orders
ON customers.customer_id = orders.customer_id;
LEFT JOIN: Returns all rows from the left table and matched rows from the right table.
SELECT departments.department_name, employees.first_name
FROM departments
LEFT JOIN employees
ON departments.department_id = employees.department_id;
RIGHT JOIN: Returns all rows from the right table and matched rows from the left table.
SELECT employees.first_name, orders.order_id
FROM employees
RIGHT JOIN orders
ON employees.employee_id = orders.employee_id;
FULL JOIN: Combines the results of both LEFT JOIN and RIGHT JOIN.
SELECT customers.customer_id, orders.order_id
FROM customers
FULL JOIN orders
ON customers.customer_id = orders.customer_id;
9. How can you retrieve data from multiple related tables using SQL?
To retrieve data from multiple related tables, use the SELECT statement along with JOIN operations. Joins are used to fetch records from multiple tables based on common fields.
10. What is a subquery in SQL? How can you use it to retrieve specific data?
A subquery is a query nested within another query. It is typically embedded in the WHERE clause of an SQL query but can also be used in the HAVING and FROM clauses. Subqueries are utilized with SELECT, INSERT, DELETE, and UPDATE statements, often involving comparison or equality operators like >=, =, <=, and the LIKE operator.
Example 1: Subquery in the SELECT Clause
SELECT customer_name,
(SELECT COUNT(*) FROM orders WHERE orders.customer_id = customers.customer_id) AS order_count
FROM customers;
Example 2: Subquery in the WHERE Clause
SELECT employee_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
Example 3: Subquery in the FROM Clause (Derived Tables)
SELECT category, SUM(sales) AS total_sales
FROM (SELECT product_id, category, sales FROM products) AS derived_table
GROUP BY category;
11. Can you give an example of using a subquery in combination with an IN or EXISTS condition?
Subqueries can be combined with IN or EXISTS conditions. Here’s an example using a subquery with IN to find data from the geeks_data table for those in the computer science department, using the geeks_dept table.
Using a Subquery with IN
SELECT f_name, l_name
FROM geeks_data
WHERE dept IN
(SELECT dep_name FROM geeks_dept WHERE dept_id = 1);
Using a Subquery with EXISTS
SELECT DISTINCT store_t
FROM store
WHERE EXISTS (SELECT * FROM city_store WHERE city_store.store_t = store.store_t);
12. What is the purpose of the HAVING clause in SQL? How is it different from the WHERE clause?
The HAVING clause in SQL is used to filter the results of a GROUP BY query based on aggregate functions applied to grouped columns. It allows filtering of groups of rows that meet specific conditions after grouping has been performed. The HAVING clause is typically used with aggregate functions like SUM, COUNT, AVG, MAX, or MIN.
Differences between HAVING and WHERE clauses:
HAVING Clause:
- Filters groups of rows after grouping.
- Operates on the results of aggregate functions applied to grouped columns.
- Typically used with GROUP BY queries.
WHERE Clause:
- Filters rows before grouping.
- Operates on individual rows in the table and is applied before grouping and aggregation.
- Can be used with any SQL query, whether it involves grouping or not.
13. How do you use the UNION and UNION ALL operators in SQL?
The UNION and UNION ALL operators in SQL are used to combine the result sets of multiple SELECT statements into a single result set.
UNION Operator:
- Returns only distinct rows from the combined result sets, eliminating duplicates.
- Used when you want to combine result sets and remove duplicate rows.
Syntax:
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name, roll_number
FROM student
UNION
SELECT name, roll_number
FROM marks;
UNION ALL Operator:
- Returns all rows from the combined result sets, including duplicates.
- Used when you want to combine result sets and include duplicate rows.
Syntax:
SELECT column1, column2, ...
FROM table1
UNION ALL
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name, roll_number
FROM student
UNION ALL
SELECT name, roll_number
FROM marks;
14. Explain the concept of database normalization and its importance.
Database normalization is the process of organizing data in a database to reduce redundancy and improve data integrity. It involves structuring the columns and tables to ensure dependencies are correctly implemented using database constraints.
Importance of normalization:
- Eliminates redundant data.
- Reduces the chances of data errors.
- Saves disk space.
- Enhances performance.
- Improves data integrity and consistency.
15. Can you list and briefly describe the normal forms (1NF, 2NF, 3NF) in SQL?
Normalization in databases involves organizing data to minimize redundancy and improve data integrity. The most common normal forms are 1NF, 2NF, and 3NF.
Here’s a brief overview of each:
First Normal Form (1NF): Ensures that each table cell contains only a single value and each column has a unique name. This form eliminates duplicate data and simplifies queries by ensuring data is stored at its most granular level.
Second Normal Form (2NF): Builds on 1NF by eliminating partial dependencies, meaning that non-key attributes must be fully dependent on the entire primary key. This reduces redundancy and anomalies further.
Third Normal Form (3NF): Extends 2NF by ensuring that all non-key attributes are independent of each other and only dependent on the primary key. This means that each column should be directly related to the primary key, not to other columns.
16. Explain window functions in SQL.
Window functions in SQL allow for complex calculations across a set of table rows related to the current row without the need for self-joins or subqueries.
Syntax:
SELECT col_name1,
window_function(col_name2) OVER([PARTITION BY col_name1] [ORDER BY col_name3]) AS new_col
FROM table_name;
Example:
SELECT department,
AVG(salary) OVER(PARTITION BY department ORDER BY employee_id) AS avg_salary
FROM employees;
Also Read: Why Are NoSQL Databases Useful?
17. What is the difference between Window Functions and Aggregate Functions?
Aspect | Window Functions | Aggregate Functions |
---|---|---|
Scope of Operation | Operate within a specific “window” or subset of rows defined by the OVER() clause. | Operate on the entire result set. |
Result per Row | Return a result for each row in the result set based on its specific window. | Return a single result for the entire set of rows. |
Detail Retention | Provide both an aggregate result and retain details of individual rows. | Provide a summary of the entire dataset, often losing detail about individual rows. |
Use of OVER() Clause | Require the use of the OVER() clause to specify the window’s characteristics. | Do not use the OVER() clause. |
Customization | Can be customized based on specific criteria, such as partitioning and ordering. | Cannot be customized in the same way; apply to the entire dataset uniformly. |
Example | AVG(salary) OVER(PARTITION BY department ORDER BY employee_id) | AVG(salary) |
18. What are primary keys and foreign keys in SQL? Why are they important?
Primary Key: A column or set of columns that uniquely identifies each row in a table. It ensures data uniqueness and cannot contain NULL values.
Importance:
- Ensures uniqueness
- Optimizes queries
- Maintains data integrity
- Establishes relationships
- Facilitates data retrieval
Foreign Key: A column or set of columns in one table that references the primary key columns of another table, establishing a link between the two tables.
Importance:
- Establishes relationships
- Ensures data consistency
- Enhances query efficiency
- Maintains referential integrity
- Supports cascade actions
19. Describe the concept of a database transaction. Why is it important to maintain data integrity?
A database transaction is a sequence of operations performed as a single logical unit of work. Transactions ensure that the database remains in a consistent state even in the presence of system failures or concurrent operations.
Importance:
- Maintains data integrity by ensuring that all operations within a transaction are completed successfully before committing changes.
- Enforces ACID properties (Atomicity, Consistency, Isolation, Durability), ensuring reliable and consistent data management.
20. Explain how NULL values are handled in SQL queries, and how you can use functions like IS NULL and IS NOT NULL.
NULL represents the absence of a value in a database column. SQL provides operators like IS NULL and IS NOT NULL to handle NULL values effectively.
IS NULL: Checks if a column contains a NULL value. Syntax:
SELECT column_name(s) FROM table_name WHERE column_name IS NULL;
Example:
SELECT * FROM employees WHERE mid_name IS NULL;
IS NOT NULL: Checks if a column does not contain a NULL value.
Syntax:
SELECT column_name(s) FROM table_name WHERE column_name IS NOT NULL;
Example:
SELECT * FROM employees WHERE first_name IS NOT NULL;
21. What is the difference between normalization and denormalization in database design.
Aspect | Normalization | Denormalization |
---|---|---|
Purpose | Reduce data redundancy and improve data integrity. | Improve query performance by adding redundancy. |
Data Redundancy | Eliminates or reduces redundancy. | Adds redundancy. |
Data Integrity | Maintains high data integrity. | May compromise data integrity. |
Number of Tables | Increases the number of tables. | Decreases the number of tables. |
Query Performance | May lead to more complex queries and joins. | Simplifies queries and improves performance. |
Disk Space Usage | Optimizes disk space usage. | Optimizes disk space usage. |
Data Consistency | Ensures data consistency. | May lead to data inconsistency. |
Use Case | Suitable for OLTP (Online Transaction Processing) systems. | Suitable for OLAP (Online Analytical Processing) systems. |
22. How do you subset or filter data in SQL?
To subset or filter data in SQL, you typically use the WHERE clause. This clause allows you to specify conditions that the data must meet to be included in the result set. Here are some common ways to filter data:
Basic Filtering with WHERE
Example:
SELECT * FROM employees
WHERE department = 'Sales';
This query retrieves all rows from the employees table where the department is ‘Sales’.
Using Comparison Operators
Example:
SELECT * FROM products
WHERE price > 100;
This query selects products with a price greater than 100.
Using Logical Operators (AND, OR, NOT)
Example:
SELECT * FROM orders
WHERE order_date >= '2024-01-01' AND status = 'Shipped';
This query retrieves orders placed on or after January 1, 2024, that have been shipped.
Using IN Oprator
Example:
SELECT * FROM customers
WHERE country IN ('USA', 'Canada', 'UK');
This query selects customers from the USA, Canada, or the UK.
Using BETWEEN Operator
Example:
SELECT * FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31';
This query retrieves sales made between January 1, 2024, and December 31, 2024.
Using LIKE Operator for Pattern Matching
Example:
SELECT * FROM employees
WHERE first_name LIKE 'J%';
This query selects employees whose first name starts with ‘J’.
Using IS NULL and IS NOT NULL
Example:
SELECT * FROM employees
WHERE middle_name IS NULL;
This query retrieves employees who do not have a middle name.
By using these various operators and clauses, you can effectively filter and subset your data in SQL to meet specific criteria
23. How are Union, Intersect, and Except used in SQL?
In SQL, the UNION, INTERSECT, and EXCEPT operators are used to combine the results of two or more SELECT queries. Each operator serves a different purpose:
UNION
The UNION operator combines the result sets of two or more SELECT queries and returns only distinct rows. It removes duplicate rows from the combined result set.
Syntax:
SELECT column1, column2, ...
FROM table1
UNION
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name, roll_number
FROM students
UNION
SELECT name, roll_number
FROM alumni;
This query combines the names and roll numbers from the students and alumni tables, removing any duplicates.
UNION ALL
The UNION ALL operator also combines the result sets of two or more SELECT queries but includes all duplicates.
Syntax:
SELECT column1, column2, ...
FROM table1
UNION ALL
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name, roll_number
FROM students
UNION ALL
SELECT name, roll_number
FROM alumni;
This query combines the names and roll numbers from the students and alumni tables, including duplicates.
INTERSECT
The INTERSECT operator returns only the rows that are common to the result sets of two or more SELECT queries.
Syntax:
SELECT column1, column2, ...
FROM table1
INTERSECT
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name
FROM students
INTERSECT
SELECT name
FROM scholarship_recipients;
This query returns the names of students who are also scholarship recipients.
EXCEPT
The EXCEPT operator returns the rows from the first SELECT query that are not present in the result set of the second SELECT query.
Syntax:
SELECT column1, column2, ...
FROM table1
EXCEPT
SELECT column1, column2, ...
FROM table2;
Example:
SELECT name
FROM students
EXCEPT
SELECT name
FROM graduates;
This query returns the names of students who have not yet graduated.
These operators are powerful tools for combining and filtering data from multiple queries, allowing you to perform complex data retrieval tasks efficiently.
24. Using the product_price table, write an SQL query to find the record with the fourth-highest market price.
To find the record with the fourth-highest market price in the product_price table, you can use the ORDER BY clause along with the LIMIT clause (or its equivalent) and a subquery. Here’s how you can do it:
Using LIMIT and OFFSET (MySQL, PostgreSQL)
SELECT *
FROM product_price
ORDER BY market_price DESC
LIMIT 1 OFFSET 3;
Using a Subquery (SQL Server)
SELECT TOP 1 market_price
FROM (
SELECT market_price,
ROW_NUMBER() OVER (ORDER BY market_price DESC) AS row_num
FROM product_price
) AS subquery
WHERE row_num = 4;
Using a Subquery (Oracle)
SELECT market_price
FROM (
SELECT market_price,
RANK() OVER (ORDER BY market_price DESC) AS rank
FROM product_price
)
WHERE rank = 4;
These queries will help you retrieve the record with the fourth-highest market price from the product_pricetable. The specific syntax may vary slightly depending on the SQL database you are using.
25. How do you write a stored procedure in SQL?
Writing a stored procedure in SQL involves creating a reusable set of SQL statements that can be executed as a single unit. Stored procedures can accept parameters, perform complex operations, and return results. Here’s a general guide on how to create a stored procedure in SQL:
Basic Syntax
The syntax for creating a stored procedure can vary slightly depending on the SQL database system you are using (e.g., MySQL, SQL Server, PostgreSQL, Oracle). Here’s a basic example for each:
MySQL
DELIMITER //
CREATE PROCEDURE procedure_name (IN parameter1 datatype, OUT parameter2 datatype)
BEGIN
-- SQL statements
SELECT column1, column2 INTO parameter2
FROM table_name
WHERE condition;
END //
DELIMITER ;
SQL Server
CREATE PROCEDURE procedure_name
@parameter1 datatype,
@parameter2 datatype OUTPUT
AS
BEGIN
-- SQL statements
SELECT column1, column2
FROM table_name
WHERE condition;
-- Set output parameter
SET @parameter2 = (SELECT column1 FROM table_name WHERE condition);
END;
PostgreSQL
CREATE OR REPLACE PROCEDURE procedure_name (IN parameter1 datatype, OUT parameter2 datatype)
LANGUAGE plpgsql
AS $$
BEGIN
-- SQL statements
SELECT column1, column2 INTO parameter2
FROM table_name
WHERE condition;
END;
$$;
Oracle
CREATE OR REPLACE PROCEDURE procedure_name (parameter1 IN datatype, parameter2 OUT datatype) AS
BEGIN
-- SQL statements
SELECT column1, column2 INTO parameter2
FROM table_name
WHERE condition;
END;
Also Read: Top 50 Advanced MySQL Interview Questions
Conclusion
Preparing for an SQL interview can be very easy with the right set of questions. A solid understanding of SQL concepts can give you the confidence to showcase your skills and knowledge. This blog has covered the most common SQL interview questions that are crucial for data analysts. By practicing these questions and understanding the underlying principles, you’ll be well-equipped to tackle any SQL-related challenge that comes your way.
Take control of your career and land your dream job
Sign up with us now and start applying for the best opportunities!