Skip to main content

Database Normalization Example: Online Bookstore

Original Unnormalized Table (Raw Data)

order_id book_title book_price customer_name customer_email customer_phone category_name category_description
1 The Great Gatsby 29.99 John Doe john@email.com 123-456-7890 Fiction Novels and literary works
1 To Kill a Mockingbird 24.99 John Doe john@email.com 123-456-7890 Fiction Novels and literary works
2 Data Science Basics 49.99 Jane Smith jane@email.com 098-765-4321 Technical Educational and reference books

After Normalization

First Normal Form (1NF)

Customers Table:

customer_id customer_name customer_email customer_phone
1 John Doe john@email.com 123-456-7890
2 Jane Smith jane@email.com 098-765-4321

Orders Table:

order_id book_title book_price customer_id category_name category_description
1 The Great Gatsby 29.99 1 Fiction Novels and literary works
1 To Kill a Mockingbird 24.99 1 Fiction Novels and literary works
2 Data Science Basics 49.99 2 Technical Educational and reference books

Second Normal Form (2NF)

Books Table:

book_id book_title book_price category_name category_description
1 The Great Gatsby 29.99 Fiction Novels and literary works
2 To Kill a Mockingbird 24.99 Fiction Novels and literary works
3 Data Science Basics 49.99 Technical Educational and reference books

Orders Table:

order_id book_id customer_id
1 1 1
1 2 1
2 3 2

Third Normal Form (3NF)

Categories Table:

category_id category_name category_description
1 Fiction Novels and literary works
2 Technical Educational and reference books

Books Table:

book_id book_title book_price category_id
1 The Great Gatsby 29.99 1
2 To Kill a Mockingbird 24.99 1
3 Data Science Basics 49.99 2

Normalization Process Summary:

  1. Unnormalized → 1NF:

    • Separated customer data into its own table
    • Created customer_id as a reference
  2. 1NF → 2NF:

    • Separated book data into its own table
    • Created book_id as a reference
    • Simplified orders table to just relationships
  3. 2NF → 3NF:

    • Separated category data into its own table
    • Removed transitive dependency of category description on book

Unnormalized Table

CREATE TABLE book_orders (
    order_id INT,
    book_title VARCHAR(100),
    book_price DECIMAL(10,2),
    customer_name VARCHAR(100),
    customer_email VARCHAR(100),
    customer_phone VARCHAR(20),
    category_name VARCHAR(50),
    category_description TEXT
);

INSERT INTO book_orders VALUES
(1, 'The Great Gatsby', 29.99, 'John Doe', 'john@email.com', '123-456-7890', 'Fiction', 'Novels and literary works'),
(1, 'To Kill a Mockingbird', 24.99, 'John Doe', 'john@email.com', '123-456-7890', 'Fiction', 'Novels and literary works'),
(2, 'Data Science Basics', 49.99, 'Jane Smith', 'jane@email.com', '098-765-4321', 'Technical', 'Educational and reference books');

First Normal Form (1NF)

Removing repeating groups and ensuring atomic values

-- Customers Table
CREATE TABLE customers_1nf (
    customer_id INT PRIMARY KEY,
    customer_name VARCHAR(100),
    customer_email VARCHAR(100),
    customer_phone VARCHAR(20)
);

INSERT INTO customers_1nf VALUES
(1, 'John Doe', 'john@email.com', '123-456-7890'),
(2, 'Jane Smith', 'jane@email.com', '098-765-4321');

-- Orders Table
CREATE TABLE orders_1nf (
    order_id INT,
    book_title VARCHAR(100),
    book_price DECIMAL(10,2),
    customer_id INT,
    category_name VARCHAR(50),
    category_description TEXT,
    PRIMARY KEY (order_id, book_title)
);

INSERT INTO orders_1nf VALUES
(1, 'The Great Gatsby', 29.99, 1, 'Fiction', 'Novels and literary works'),
(1, 'To Kill a Mockingbird', 24.99, 1, 'Fiction', 'Novels and literary works'),
(2, 'Data Science Basics', 49.99, 2, 'Technical', 'Educational and reference books');

Second Normal Form (2NF)

Removing partial dependencies

-- Books Table
CREATE TABLE books_2nf (
    book_id INT PRIMARY KEY,
    book_title VARCHAR(100),
    book_price DECIMAL(10,2),
    category_name VARCHAR(50),
    category_description TEXT
);

INSERT INTO books_2nf VALUES
(1, 'The Great Gatsby', 29.99, 'Fiction', 'Novels and literary works'),
(2, 'To Kill a Mockingbird', 24.99, 'Fiction', 'Novels and literary works'),
(3, 'Data Science Basics', 49.99, 'Technical', 'Educational and reference books');

-- Orders Table
CREATE TABLE orders_2nf (
    order_id INT,
    book_id INT,
    customer_id INT,
    PRIMARY KEY (order_id, book_id)
);

INSERT INTO orders_2nf VALUES
(1, 1, 1),
(1, 2, 1),
(2, 3, 2);

Third Normal Form (3NF)

Removing transitive dependencies

-- Categories Table
CREATE TABLE categories_3nf (
    category_id INT PRIMARY KEY,
    category_name VARCHAR(50),
    category_description TEXT
);

INSERT INTO categories_3nf VALUES
(1, 'Fiction', 'Novels and literary works'),
(2, 'Technical', 'Educational and reference books');

-- Books Table
CREATE TABLE books_3nf (
    book_id INT PRIMARY KEY,
    book_title VARCHAR(100),
    book_price DECIMAL(10,2),
    category_id INT,
    FOREIGN KEY (category_id) REFERENCES categories_3nf(category_id)
);

INSERT INTO books_3nf VALUES
(1, 'The Great Gatsby', 29.99, 1),
(2, 'To Kill a Mockingbird', 24.99, 1),
(3, 'Data Science Basics', 49.99, 2);

-- Orders Table remains the same as 2NF

Final Database Schema

This shows how the data is properly organized with minimal redundancy:

  1. Customers Table: Stores customer information
  2. Categories Table: Stores book categories
  3. Books Table: Stores book information with category reference
  4. Orders Table: Links customers to their book orders

To retrieve a customer's order information:

SELECT 
    o.order_id,
    c.customer_name,
    b.book_title,
    b.book_price,
    cat.category_name
FROM orders_2nf o
JOIN customers_1nf c ON o.customer_id = c.customer_id
JOIN books_3nf b ON o.book_id = b.book_id
JOIN categories_3nf cat ON b.category_id = cat.category_id
ORDER BY o.order_id;