Database Normalization Example: Online Bookstore
Original Unnormalized Table (Raw Data)
| order_id | book_title | book_price | customer_name | customer_email | customer_phone | category_name | category_description |
|---|---|---|---|---|---|---|---|
| 1 | The Great Gatsby | 29.99 | John Doe | john@email.com | 123-456-7890 | Fiction | Novels and literary works |
| 1 | To Kill a Mockingbird | 24.99 | John Doe | john@email.com | 123-456-7890 | Fiction | Novels and literary works |
| 2 | Data Science Basics | 49.99 | Jane Smith | jane@email.com | 098-765-4321 | Technical | Educational and reference books |
After Normalization
First Normal Form (1NF)
Customers Table:
| customer_id | customer_name | customer_email | customer_phone |
|---|---|---|---|
| 1 | John Doe | john@email.com | 123-456-7890 |
| 2 | Jane Smith | jane@email.com | 098-765-4321 |
Orders Table:
| order_id | book_title | book_price | customer_id | category_name | category_description |
|---|---|---|---|---|---|
| 1 | The Great Gatsby | 29.99 | 1 | Fiction | Novels and literary works |
| 1 | To Kill a Mockingbird | 24.99 | 1 | Fiction | Novels and literary works |
| 2 | Data Science Basics | 49.99 | 2 | Technical | Educational and reference books |
Second Normal Form (2NF)
Books Table:
| book_id | book_title | book_price | category_name | category_description |
|---|---|---|---|---|
| 1 | The Great Gatsby | 29.99 | Fiction | Novels and literary works |
| 2 | To Kill a Mockingbird | 24.99 | Fiction | Novels and literary works |
| 3 | Data Science Basics | 49.99 | Technical | Educational and reference books |
Orders Table:
| order_id | book_id | customer_id |
|---|---|---|
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 2 | 3 | 2 |
Third Normal Form (3NF)
Categories Table:
| category_id | category_name | category_description |
|---|---|---|
| 1 | Fiction | Novels and literary works |
| 2 | Technical | Educational and reference books |
Books Table:
| book_id | book_title | book_price | category_id |
|---|---|---|---|
| 1 | The Great Gatsby | 29.99 | 1 |
| 2 | To Kill a Mockingbird | 24.99 | 1 |
| 3 | Data Science Basics | 49.99 | 2 |
Normalization Process Summary:
-
Unnormalized → 1NF:
- Separated customer data into its own table
- Created customer_id as a reference
-
1NF → 2NF:
- Separated book data into its own table
- Created book_id as a reference
- Simplified orders table to just relationships
-
2NF → 3NF:
- Separated category data into its own table
- Removed transitive dependency of category description on book
Unnormalized Table
CREATE TABLE book_orders (
order_id INT,
book_title VARCHAR(100),
book_price DECIMAL(10,2),
customer_name VARCHAR(100),
customer_email VARCHAR(100),
customer_phone VARCHAR(20),
category_name VARCHAR(50),
category_description TEXT
);
INSERT INTO book_orders VALUES
(1, 'The Great Gatsby', 29.99, 'John Doe', 'john@email.com', '123-456-7890', 'Fiction', 'Novels and literary works'),
(1, 'To Kill a Mockingbird', 24.99, 'John Doe', 'john@email.com', '123-456-7890', 'Fiction', 'Novels and literary works'),
(2, 'Data Science Basics', 49.99, 'Jane Smith', 'jane@email.com', '098-765-4321', 'Technical', 'Educational and reference books');
First Normal Form (1NF)
Removing repeating groups and ensuring atomic values
-- Customers Table
CREATE TABLE customers_1nf (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
customer_email VARCHAR(100),
customer_phone VARCHAR(20)
);
INSERT INTO customers_1nf VALUES
(1, 'John Doe', 'john@email.com', '123-456-7890'),
(2, 'Jane Smith', 'jane@email.com', '098-765-4321');
-- Orders Table
CREATE TABLE orders_1nf (
order_id INT,
book_title VARCHAR(100),
book_price DECIMAL(10,2),
customer_id INT,
category_name VARCHAR(50),
category_description TEXT,
PRIMARY KEY (order_id, book_title)
);
INSERT INTO orders_1nf VALUES
(1, 'The Great Gatsby', 29.99, 1, 'Fiction', 'Novels and literary works'),
(1, 'To Kill a Mockingbird', 24.99, 1, 'Fiction', 'Novels and literary works'),
(2, 'Data Science Basics', 49.99, 2, 'Technical', 'Educational and reference books');
Second Normal Form (2NF)
Removing partial dependencies
-- Books Table
CREATE TABLE books_2nf (
book_id INT PRIMARY KEY,
book_title VARCHAR(100),
book_price DECIMAL(10,2),
category_name VARCHAR(50),
category_description TEXT
);
INSERT INTO books_2nf VALUES
(1, 'The Great Gatsby', 29.99, 'Fiction', 'Novels and literary works'),
(2, 'To Kill a Mockingbird', 24.99, 'Fiction', 'Novels and literary works'),
(3, 'Data Science Basics', 49.99, 'Technical', 'Educational and reference books');
-- Orders Table
CREATE TABLE orders_2nf (
order_id INT,
book_id INT,
customer_id INT,
PRIMARY KEY (order_id, book_id)
);
INSERT INTO orders_2nf VALUES
(1, 1, 1),
(1, 2, 1),
(2, 3, 2);
Third Normal Form (3NF)
Removing transitive dependencies
-- Categories Table
CREATE TABLE categories_3nf (
category_id INT PRIMARY KEY,
category_name VARCHAR(50),
category_description TEXT
);
INSERT INTO categories_3nf VALUES
(1, 'Fiction', 'Novels and literary works'),
(2, 'Technical', 'Educational and reference books');
-- Books Table
CREATE TABLE books_3nf (
book_id INT PRIMARY KEY,
book_title VARCHAR(100),
book_price DECIMAL(10,2),
category_id INT,
FOREIGN KEY (category_id) REFERENCES categories_3nf(category_id)
);
INSERT INTO books_3nf VALUES
(1, 'The Great Gatsby', 29.99, 1),
(2, 'To Kill a Mockingbird', 24.99, 1),
(3, 'Data Science Basics', 49.99, 2);
-- Orders Table remains the same as 2NF
Final Database Schema
This shows how the data is properly organized with minimal redundancy:
- Customers Table: Stores customer information
- Categories Table: Stores book categories
- Books Table: Stores book information with category reference
- Orders Table: Links customers to their book orders
To retrieve a customer's order information:
SELECT
o.order_id,
c.customer_name,
b.book_title,
b.book_price,
cat.category_name
FROM orders_2nf o
JOIN customers_1nf c ON o.customer_id = c.customer_id
JOIN books_3nf b ON o.book_id = b.book_id
JOIN categories_3nf cat ON b.category_id = cat.category_id
ORDER BY o.order_id;