Top 10 SQL Query Optimization Tips

1. Use Proper Indexing 🚀

Bad Practice:

SELECT * FROM orders 
WHERE customer_email = 'john@email.com';  -- No index on customer_email

Good Practice:

-- Create index on frequently searched column
CREATE INDEX idx_customer_email ON orders(customer_email);

-- Composite index for multiple columns often queried together
CREATE INDEX idx_order_date_status ON orders(order_date, status);

Key Points:

Index columns used in WHERE, JOIN, and ORDER BY clauses
Don't over-index (indexes slow down INSERT/UPDATE operations)
Consider column selectivity when creating indexes

2. Avoid SELECT * 🎯

Bad Practice:

SELECT * FROM users 
JOIN orders ON users.id = orders.user_id;

Good Practice:

SELECT 
    u.id,
    u.name,
    o.order_date,
    o.total_amount
FROM users u
JOIN orders o ON u.id = o.user_id;

Benefits:

Reduces network traffic
Improves query execution time
Makes code maintenance easier

3. Use EXPLAIN/EXPLAIN ANALYZE 📊

EXPLAIN ANALYZE
SELECT customer_id, COUNT(*) 
FROM orders 
WHERE order_date >= '2024-01-01'
GROUP BY customer_id;

What to Look For:

Seq Scan vs. Index Scan
Number of rows examined
Time spent on each operation
Join strategies used

4. Optimize JOINs 🔄

Bad Practice:

SELECT *
FROM orders o
LEFT JOIN order_items oi ON o.id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.id
LEFT JOIN categories c ON p.category_id = c.id;

Good Practice:

SELECT 
    o.id,
    o.order_date,
    p.name,
    c.category_name
FROM orders o
INNER JOIN order_items oi ON o.id = oi.order_id
INNER JOIN products p ON oi.product_id = p.id
INNER JOIN categories c ON p.category_id = c.id
WHERE o.order_date >= '2024-01-01';

Tips:

Use appropriate JOIN types (INNER vs LEFT)
Join order matters - start with the most filtered table
Include WHERE clauses before JOINs when possible

5. Use EXISTS Instead of IN for Large Datasets 🎯

Bad Practice:

SELECT * FROM orders 
WHERE customer_id IN (
    SELECT customer_id 
    FROM customers 
    WHERE country = 'USA'
);

Good Practice:

SELECT * FROM orders o
WHERE EXISTS (
    SELECT 1 
    FROM customers c
    WHERE c.customer_id = o.customer_id 
    AND c.country = 'USA'
);

6. Optimize GROUP BY Operations 📊

Bad Practice:

SELECT 
    DATE(created_at),
    COUNT(*),
    SUM(amount)
FROM transactions
GROUP BY DATE(created_at);

Good Practice:

SELECT 
    DATE_TRUNC('day', created_at),
    COUNT(*),
    SUM(amount)
FROM transactions
GROUP BY DATE_TRUNC('day', created_at);

7. Use LIMIT with OFFSET Carefully 📑

Bad Practice:

-- Page 1000 of 10 records each
SELECT * FROM products
ORDER BY name
OFFSET 10000 LIMIT 10;

Good Practice:

-- Use keyset pagination
SELECT * FROM products
WHERE id > (
    SELECT id FROM products
    ORDER BY id
    LIMIT 1 OFFSET 9990
)
ORDER BY id
LIMIT 10;

8. Avoid Functions in WHERE Clauses ⚡

Bad Practice:

SELECT * FROM users
WHERE LOWER(email) = 'john@email.com';

Good Practice:

SELECT * FROM users
WHERE email = LOWER('john@email.com');

9. Use UNION ALL Instead of UNION When Possible 🔄

Bad Practice:

SELECT amount FROM orders_2023
UNION
SELECT amount FROM orders_2024;

Good Practice:

SELECT amount FROM orders_2023
UNION ALL
SELECT amount FROM orders_2024;

Why:

UNION ALL skips the duplicate removal process
Much faster when you know there are no duplicates
Or when duplicates are acceptable

10. Optimize Subqueries 📦

Bad Practice:

SELECT *,
    (SELECT COUNT(*) 
     FROM order_items oi 
     WHERE oi.order_id = o.id) as items_count
FROM orders o;

Good Practice:

SELECT 
    o.*,
    COALESCE(oi.items_count, 0) as items_count
FROM orders o
LEFT JOIN (
    SELECT order_id, COUNT(*) as items_count
    FROM order_items
    GROUP BY order_id
) oi ON o.id = oi.order_id;

Bonus: Performance Monitoring Tips 📊

Track Query Execution Time

SELECT 
    query,
    calls,
    total_exec_time / calls as avg_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

Monitor Table Statistics

SELECT 
    schemaname,
    relname,
    seq_scan,
    seq_tup_read,
    idx_scan,
    n_tup_ins,
    n_tup_upd,
    n_tup_del
FROM pg_stat_user_tables;

Check Index Usage

SELECT 
    schemaname,
    relname,
    indexrelname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes;

Remember:

Always test with real-world data volumes
Monitor query performance in production
Regular maintenance of statistics and indexes
Document optimization decisions