slug
type
status
category
summary
date
tags
password
icon
Complete codes
First Normal Form (1NF)
1NF dictates that each column of a table must contain atomic (indivisible) values, and there should be no repeating groups or arrays. Essentially, every cell must hold a single value, and each record must be unique.
Example:
Consider a table
Order
that records customer orders:OrderID | CustomerName | Products |
101 | Alice | Widget A, Widget B |
102 | Bob | Widget C |
103 | Charlie | Widget A, Widget D |
In this table, the
Products
column contains multiple items in a single cell, violating 1NF because it holds non-atomic values.To achieve 1NF, we need to ensure that each cell contains only one value:
OrderID | CustomerName | Product |
101 | Alice | Widget A |
101 | Alice | Widget B |
102 | Bob | Widget C |
103 | Charlie | Widget A |
103 | Charlie | Widget D |
Now, each cell contains a single value, and the table adheres to 1NF.
Second Normal Form (2NF)
2NF requires that the table is in 1NF and that all non-key attributes are fully functionally dependent on the primary key. There should be no partial dependency of any column on a portion of the primary key.
Example:
Using the 1NF-compliant
Order
table, suppose we have a composite primary key consisting of OrderID
and Product
, and we add a CustomerAddress
column:OrderID | CustomerName | CustomerAddress | Product |
101 | Alice | 123 Maple St | Widget A |
101 | Alice | 123 Maple St | Widget B |
102 | Bob | 456 Oak Ave | Widget C |
103 | Charlie | 789 Pine Rd | Widget A |
103 | Charlie | 789 Pine Rd | Widget D |
In this table,
CustomerName
and CustomerAddress
depend only on OrderID
, not on the entire composite primary key (OrderID
, Product
). This partial dependency violates 2NF.To achieve 2NF, we decompose the table into two tables:
Table 1: Orders
OrderID | CustomerName | CustomerAddress |
101 | Alice | 123 Maple St |
102 | Bob | 456 Oak Ave |
103 | Charlie | 789 Pine Rd |
Table 2: OrderDetails
OrderID | Product |
101 | Widget A |
101 | Widget B |
102 | Widget C |
103 | Widget A |
103 | Widget D |
Now, all non-key attributes in each table fully depend on the primary key, satisfying 2NF.
Third Normal Form (3NF)
3NF requires that the table is in 2NF and that all non-key attributes are not only fully dependent on the primary key but also non-transitively dependent. This means no non-key attribute depends on another non-key attribute.
Example:
Suppose we add a
Region
column to the Orders
table, where Region
depends on CustomerAddress
:Table: Orders
OrderID | CustomerName | CustomerAddress | Region |
101 | Alice | 123 Maple St | North |
102 | Bob | 456 Oak Ave | East |
103 | Charlie | 789 Pine Rd | North |
Here,
Region
is determined by CustomerAddress
, not directly by OrderID
. This transitive dependency violates 3NF.To achieve 3NF, we separate the
Region
information into another table:Table 1: Orders
OrderID | CustomerName | CustomerAddress |
101 | Alice | 123 Maple St |
102 | Bob | 456 Oak Ave |
103 | Charlie | 789 Pine Rd |
Table 2: Addresses
CustomerAddress | Region |
123 Maple St | North |
456 Oak Ave | East |
789 Pine Rd | North |
Now, the
Orders
table is in 3NF because all non-key attributes (CustomerName
, CustomerAddress
) depend only on the primary key (OrderID
), and there are no transitive dependencies.Summary:
- 1NF eliminates repeating groups and ensures atomicity.
- 2NF removes partial dependencies by ensuring all non-key attributes depend on the entire primary key.
- 3NF eliminates transitive dependencies so that non-key attributes depend only on the primary key.
By progressively refining the database structure according to these normal forms, we reduce redundancy and improve data integrity.
第一范式(1NF)
1NF规定,表的每一列必须包含原子(不可再分)的值,且不应存在重复的组或数组。也就是说,每个单元格必须只包含一个值,每条记录必须是唯一的。
示例:
考虑一个记录客户订单的
Order
表:OrderID | CustomerName | Products |
101 | Alice | Widget A, Widget B |
102 | Bob | Widget C |
103 | Charlie | Widget A, Widget D |
在这个表中,
Products
列在单个单元格中包含了多个项目,违反了1NF,因为它包含了非原子值。为实现1NF,我们需要确保每个单元格只包含一个值:
OrderID | CustomerName | Product |
101 | Alice | Widget A |
101 | Alice | Widget B |
102 | Bob | Widget C |
103 | Charlie | Widget A |
103 | Charlie | Widget D |
现在,每个单元格都只包含一个值,表格符合1NF。
第二范式(2NF)
2NF要求表格在1NF的基础上,所有非键属性完全依赖于主键。不存在任何属性对主键的一部分的部分依赖。
示例:
使用符合1NF的
Order
表,假设我们有一个由OrderID
和Product
组成的复合主键,并添加一个CustomerAddress
列:OrderID | CustomerName | CustomerAddress | Product |
101 | Alice | 123 Maple St | Widget A |
101 | Alice | 123 Maple St | Widget B |
102 | Bob | 456 Oak Ave | Widget C |
103 | Charlie | 789 Pine Rd | Widget A |
103 | Charlie | 789 Pine Rd | Widget D |
在这个表中,
CustomerName
和CustomerAddress
只依赖于OrderID
,而不是整个复合主键(OrderID
,Product
)。这种部分依赖违反了2NF。为实现2NF,我们将表分解为两个表:
表1:Orders
OrderID | CustomerName | CustomerAddress |
101 | Alice | 123 Maple St |
102 | Bob | 456 Oak Ave |
103 | Charlie | 789 Pine Rd |
表2:OrderDetails
OrderID | Product |
101 | Widget A |
101 | Widget B |
102 | Widget C |
103 | Widget A |
103 | Widget D |
现在,每个表中的所有非键属性都完全依赖于主键,满足了2NF。
第三范式(3NF)
3NF要求表格在2NF的基础上,所有非键属性不仅完全依赖于主键,而且没有传递依赖。这意味着没有非键属性依赖于另一个非键属性。
示例:
假设我们在
Orders
表中添加一个Region
列,Region
依赖于CustomerAddress
:表:Orders
OrderID | CustomerName | CustomerAddress | Region |
101 | Alice | 123 Maple St | North |
102 | Bob | 456 Oak Ave | East |
103 | Charlie | 789 Pine Rd | North |
这里,
Region
由CustomerAddress
决定,而不是直接由OrderID
决定。这种传递依赖违反了3NF。为实现3NF,我们将
Region
信息分离到另一个表中:表1:Orders
OrderID | CustomerName | CustomerAddress |
101 | Alice | 123 Maple St |
102 | Bob | 456 Oak Ave |
103 | Charlie | 789 Pine Rd |
表2:Addresses
CustomerAddress | Region |
123 Maple St | North |
456 Oak Ave | East |
789 Pine Rd | North |
现在,
Orders
表符合3NF,因为所有非键属性(CustomerName
,CustomerAddress
)都只依赖于主键(OrderID
),并且不存在传递依赖。总结:
- 1NF 消除了重复组,确保了原子性。
- 2NF 通过确保所有非键属性依赖于整个主键,消除了部分依赖。
- 3NF 消除了传递依赖,使非键属性只依赖于主键。
通过按照这些范式逐步优化数据库结构,我们减少了数据冗余,提升了数据完整性。
结合前面的例子,MySQL代码示例
第一范式(1NF)
初始表(存在非原子值,违反1NF):
改进后的表(符合1NF):
第二范式(2NF)
存在部分依赖,违反2NF:
在这个表中,
CustomerName
和 CustomerAddress
只依赖于 OrderID
,而不是复合主键 (OrderID
, Product
),因此违反了2NF。改进后的表(符合2NF):将表分解为 Orders 和 OrderDetails
第三范式(3NF)
存在传递依赖,违反3NF:
在此,
Region
依赖于 CustomerAddress
,而不是直接依赖于主键 OrderID
,因此违反了3NF。改进后的表(符合3NF):将 Region 信息分离到 Addresses 表
如何理解:ALTER TABLE Orders ADD FOREIGN KEY (CustomerAddress) REFERENCES Addresses(CustomerAddress);?
Certainly! I'd be happy to explain that SQL statement in detail, step by step. Since you're new to this topic ("I am green here"), I'll start from the basics and build up to how this statement works and why it's used.
Context of the Statement
Before diving into the specific SQL command, let's briefly understand the context in which it's used:
- You have an
Orders
table that stores information about orders placed by customers.
- You have an
Addresses
table that stores customer addresses along with their associated regions.
- You want to ensure that every
CustomerAddress
in theOrders
table corresponds to a valid address in theAddresses
table.
The SQL Statement
Breaking Down the Statement
Let's dissect this statement piece by piece.
1. ALTER TABLE Orders
- Purpose: This part tells the database that you want to modify the structure of the table named
Orders
.
- Usage: The
ALTER TABLE
command is used to add, delete, or modify columns and constraints in an existing table.
2. ADD FOREIGN KEY (CustomerAddress)
- Purpose: You're adding a foreign key constraint on the
CustomerAddress
column in theOrders
table.
- What is a Foreign Key?
- A foreign key is a column (or a set of columns) in one table that uniquely identifies a row in another table.
- It creates a link between two tables, ensuring data integrity.
- In this case:
- The
CustomerAddress
in theOrders
table will reference theCustomerAddress
in theAddresses
table.
3. REFERENCES Addresses(CustomerAddress)
- Purpose: This specifies which table and column the foreign key references.
- Explanation:
Addresses
is the table being referenced.(CustomerAddress)
specifies the column in theAddresses
table that the foreign key relates to.
- Result:
- This establishes a relationship between the
Orders
table and theAddresses
table based on theCustomerAddress
column.
Why Add a Foreign Key Constraint?
1. Enforce Referential Integrity
- Referential Integrity: Ensures that relationships between tables remain consistent.
- How It Works:
- The database will prevent you from inserting a
CustomerAddress
into theOrders
table if it doesn't exist in theAddresses
table. - It will also prevent you from deleting a
CustomerAddress
from theAddresses
table if it's being used in theOrders
table (unless you handle deletions specifically).
2. Data Consistency and Accuracy
- Avoid Orphan Records:
- Without the foreign key constraint, you might have orders that reference addresses that don't exist, leading to data inconsistency.
- Maintain Accurate Relationships:
- Ensures that every order is associated with a valid address.
3. Simplify Database Maintenance
- Easier Updates:
- When an address changes, you can update it in one place (
Addresses
table), and it will be reflected wherever it's used.
- Prevent Errors:
- Reduces the risk of typos or incorrect data entry in the
Orders
table for addresses.
Detailed Example
Let's go through an example to illustrate how this works.
Before Adding the Foreign Key Constraint
- Orders Table:
OrderID | CustomerName | CustomerAddress |
101 | Alice | 123 Maple St |
102 | Bob | 456 Oak Ave |
103 | Charlie | 789 Pine Rd |
104 | Eve | 999 Unknown Blvd |
- Addresses Table:
CustomerAddress | Region |
123 Maple St | North |
456 Oak Ave | East |
789 Pine Rd | North |
- Issue:
- The
Orders
table contains an address (999 Unknown Blvd
) that doesn't exist in theAddresses
table. - This leads to a data inconsistency because we have an order linked to a non-existent address.
After Adding the Foreign Key Constraint
- Attempt to Insert Invalid Data:
- Result:
- The database will reject this insertion and throw an error because
'999 Unknown Blvd'
doesn't exist in theAddresses
table. - This maintains data integrity by ensuring only valid addresses are used.
Understanding Each Component Further
ALTER TABLE
- General Use: Modify the structure of an existing table.
- Syntax:
ALTER TABLE table_name ...
ADD FOREIGN KEY
- Purpose: Add a foreign key constraint to a table.
- Syntax:
ADD FOREIGN KEY (column_name) REFERENCES other_table(other_column)
REFERENCES
- Purpose: Specify the table and column that the foreign key is referencing.
- Syntax:
REFERENCES referenced_table(referenced_column)
What Happens Internally When You Add a Foreign Key Constraint
1. Validation of Existing Data
- The database checks all existing
CustomerAddress
entries in theOrders
table to ensure they match entries in theAddresses
table.
- Possible Outcomes:
- All Addresses Match: The constraint is added successfully.
- Some Addresses Don't Match: The database will throw an error, and the constraint won't be added until the data is corrected.
2. Enforcing the Constraint for Future Operations
- Insertions:
- Any new order must have a
CustomerAddress
that exists inAddresses
.
- Updates:
- Changing the
CustomerAddress
in an existing order must result in a value that exists inAddresses
.
- Deletions:
- Attempting to delete a
CustomerAddress
fromAddresses
that's used inOrders
will be blocked unless you specify cascading behavior.
Handling Deletions and Updates
Specifying Actions on DELETE and UPDATE
When adding a foreign key, you can specify what should happen when the referenced data is deleted or updated.
Syntax with ON DELETE and ON UPDATE
Options:
- ON DELETE CASCADE:
- If a
CustomerAddress
is deleted fromAddresses
, all related records inOrders
will also be deleted.
- ON DELETE SET NULL:
- If a
CustomerAddress
is deleted, theCustomerAddress
field inOrders
will be set toNULL
.
- ON UPDATE CASCADE:
- If a
CustomerAddress
inAddresses
is updated, the change will automatically reflect in theOrders
table.
Example:
- Deleting an Address:
- With ON DELETE CASCADE:
- This action will delete all orders in
Orders
whereCustomerAddress
is'123 Maple St'
.
- Without Specifying Behavior:
- The database will prevent the deletion and throw an error because there are orders referencing that address.
Why Use CustomerAddress Instead of an ID?
Alternate Approach: Using AddressID
- Common Practice:
- Typically, a unique identifier like
AddressID
(an integer) is used as the primary key. - The
Orders
table would then referenceAddressID
instead ofCustomerAddress
.
Advantages of Using AddressID:
- Performance:
- Integer keys are faster to join on compared to strings.
- Flexibility:
- Allows the
CustomerAddress
fields to change without affecting the foreign key relationships.
Why Use CustomerAddress Here?
- Simplicity:
- If
CustomerAddress
is guaranteed to be unique and not change, it can serve as a primary key.
- Learning Purpose:
- For demonstration, using
CustomerAddress
directly may simplify understanding the relationship.
Ensuring Data Types Match
- The data type of
CustomerAddress
in both theOrders
andAddresses
tables must be the same or compatible.
- Example:
- If
CustomerAddress
is defined asVARCHAR(100)
inAddresses
, it should also beVARCHAR(100)
inOrders
.
Possible Errors and How to Fix Them
Error When Adding Constraint
- Error Message:
- "Cannot add or update a child row: a foreign key constraint fails..."
- Cause:
- There are entries in
Orders
whereCustomerAddress
doesn't exist inAddresses
.
- Solution:
- Identify and correct the invalid
CustomerAddress
entries in theOrders
table. - Or, add the missing addresses to the
Addresses
table.
Error When Deleting Referenced Data
- Error Message:
- "Cannot delete or update a parent row: a foreign key constraint fails..."
- Cause:
- You're trying to delete an address that's still being referenced in
Orders
.
- Solution:
- Decide whether to cascade the delete, set the foreign key to null, or prevent the deletion.
Checking the Constraint
Verify Constraint Exists
- MySQL Command:
- Look for:
- A line that defines the foreign key constraint, indicating it's been successfully added.
Summary of Key Points
- Foreign Key Constraint:
- Ensures that every
CustomerAddress
inOrders
exists inAddresses
. - Maintains referential integrity between the two tables.
- ALTER TABLE Statement:
- Modifies the existing
Orders
table to include the foreign key.
- REFERENCES Clause:
- Specifies the table and column that the foreign key points to.
- Data Integrity:
- Prevents invalid data entry and maintains consistent relationships.
Final Thoughts
- Importance in Database Design:
- Using foreign key constraints is essential for relational databases to ensure data integrity.
- Best Practices:
- Always ensure that the data in the child table (
Orders
) corresponds to data in the parent table (Addresses
) before adding a foreign key constraint.
- Future Operations:
- Be mindful of how deletions and updates in the parent table affect the child table.
- Consider specifying
ON DELETE
andON UPDATE
behaviors according to your application's needs.
Additional Resources
- Books and Tutorials:
- "SQL for Dummies" – A beginner-friendly guide to SQL.
- "Database Systems: The Complete Book" by Garcia-Molina, Ullman, and Widom – Offers in-depth explanations of database concepts.
- Online Courses:
- Coursera and edX offer free courses on databases and SQL.
- Practice:
- Use online SQL editors or install a local database (like MySQL or PostgreSQL) to practice creating tables and adding constraints.
总结
通过上述步骤,我们使用 MySQL 代码演示了如何将数据库表从不规范的状态逐步规范化到第三范式(3NF):
- 1NF:消除了非原子值,将重复的数据拆分,使每个单元格只包含一个值。
- 2NF:消除了部分依赖,将非完全依赖于主键的属性分离到新的表中。
- 3NF:消除了传递依赖,将非键属性间的依赖关系分离,确保所有非键属性只直接依赖于主键。
这样,数据库结构更加合理,减少了数据冗余,增强了数据的一致性和完整性。
How to use the database created above:
You've structured your database by creating and normalizing several tables to efficiently store and manage order-related data. Here's how you can use these tables:
1. Understanding the Tables
a.
Order_Initial
- Purpose: This is your initial unnormalized table containing orders with multiple products listed in a single field.
- Structure:
OrderID
(Primary Key)CustomerName
Products
(Comma-separated list of products)
b.
Order_1NF
- Purpose: This table brings the data into First Normal Form (1NF) by ensuring each field contains only atomic values (one value per field).
- Structure:
OrderID
CustomerName
Product
(Each product is on a separate row)- Composite Primary Key: (
OrderID
,Product
) CustomerAddress
(Added later)
c.
Orders
- Purpose: Stores basic order information and customer details, further normalizing the data.
- Structure:
OrderID
(Primary Key)CustomerName
CustomerAddress
- Foreign Key:
CustomerAddress
referencesAddresses(CustomerAddress)
d.
OrderDetails
- Purpose: Contains the details of each order's products, linking back to the
Orders
table.
- Structure:
OrderID
Product
- Primary Key: (
OrderID
,Product
) - Foreign Key:
OrderID
referencesOrders(OrderID)
e.
Addresses
- Purpose: Holds customer address information and their associated regions, achieving Third Normal Form (3NF) by removing transitive dependencies.
- Structure:
CustomerAddress
(Primary Key)Region
2. How to Use the Tables
a. Inserting Data
- Adding a New Order:
- Insert customer details into
Addresses
(if the address is new). - Insert the order into
Orders
. - Insert products into
OrderDetails
.
b. Querying Data
- Retrieve All Orders with Customer and Product Details:
Hints
This SQL query retrieves a list of orders along with the customer names and the products associated with each order by joining two tables: Orders and OrderDetails.
Detailed Explanation:
- SELECT Clause:
- o.OrderID: Retrieves the
OrderID
from theOrders
table. - o.CustomerName: Retrieves the
CustomerName
from theOrders
table. - od.Product: Retrieves the
Product
from theOrderDetails
table.
- FROM Clause:
- Specifies the
Orders
table as the primary table to select data from. - The table
Orders
is aliased aso
for shorthand reference.
- JOIN Clause:
- Performs an INNER JOIN between the
Orders
table (o
) and theOrderDetails
table (od
). - The join condition
ON o.OrderID = od.OrderID
ensures that only rows with matchingOrderID
in both tables are combined. - The table
OrderDetails
is aliased asod
.
Purpose of the Query:
- To combine order information with corresponding product details.
- It matches each order with its products based on the
OrderID
.
- Useful for generating reports that show which customers ordered which products.
Visual Representation:
Assuming we have the following data:
Orders Table (
o
):OrderID | CustomerName |
1 | Alice |
2 | Bob |
3 | Charlie |
OrderDetails Table (
od
):OrderID | Product |
1 | Laptop |
1 | Mouse |
2 | Keyboard |
3 | Monitor |
3 | HDMI Cable |
3 | USB Hub |
Result of the Query:
OrderID | CustomerName | Product |
1 | Alice | Laptop |
1 | Alice | Mouse |
2 | Bob | Keyboard |
3 | Charlie | Monitor |
3 | Charlie | HDMI Cable |
3 | Charlie | USB Hub |
Key Points:
- INNER JOIN: Only includes rows where there is a match in both tables.
- Aliases (
o
andod
): Simplify references to table names, especially when dealing with multiple tables.
- Join Condition (
o.OrderID = od.OrderID
): Ensures data is accurately combined based on theOrderID
.
Why Use This Query:
- To generate a comprehensive list of orders with customer and product details.
- Ideal for analyzing sales data, customer purchasing behavior, or preparing detailed invoices.
Additional Notes:
- If you want to include orders without matching order details, consider using a LEFT JOIN instead.
- Ensure that the columns used in the join condition (
OrderID
in this case) are properly indexed for optimal performance.
- Get Customer Addresses and Regions:
- Find All Products Ordered by a Specific Customer:
c. Updating Data
- Update a Customer's Address:
- Change a Product in an Order:
d. Deleting Data
- Delete an Order:
- Remove entries from
OrderDetails
due to foreign key constraints. - Delete the order from
Orders
.
- Delete a Customer Address:
- Ensure no orders are linked to that address before deletion.
e. Maintaining Data Integrity
- Foreign Key Constraints: The foreign keys ensure that:
- Every
OrderID
inOrderDetails
must exist inOrders
. - Every
CustomerAddress
inOrders
must exist inAddresses
.
- Preventing Orphan Records: Always delete or update child records (
OrderDetails
) before modifying parent records (Orders
andAddresses
).
3. Understanding Normalization Steps
First Normal Form (1NF)
- Goal: Eliminate repeating groups and ensure each field contains only atomic values.
- Action: Split the
Products
field into separate rows inOrder_1NF
.
Second Normal Form (2NF)
- Goal: Remove partial dependencies; ensure that non-key attributes are fully functionally dependent on the primary key.
- Action: In
Order_1NF
,CustomerAddress
depends only onCustomerName
, leading to potential redundancy.
Third Normal Form (3NF)
- Goal: Eliminate transitive dependencies; non-key attributes should not depend on other non-key attributes.
- Action: Move
Region
fromOrders
to a newAddresses
table to remove the dependency ofRegion
onCustomerAddress
.
4. Best Practices
- Use Transactions: When performing multiple related operations, use transactions to maintain data integrity.
- Backup Data: Regularly back up your database to prevent data loss.
- Indexing: Consider adding indexes on frequently queried fields to improve performance.
- Validation: Implement checks and validation to ensure data consistency (e.g., data types, value ranges).
5. Sample Queries for Practice
- List All Orders with Their Regions:
- Find Customers in the 'North' Region:
- Count the Number of Products per Order:
By understanding the structure and relationships of your normalized tables, you can efficiently store, retrieve, and manage your order data while ensuring data integrity and minimizing redundancy.
- 作者:现代数学启蒙
- 链接:https://www.math1234567.com/normalisationprocess
- 声明:本文采用 CC BY-NC-SA 4.0 许可协议,转载请注明出处。
相关文章