
Functional Dependencies, Armstrong's Rule, & Normalization
What are Functional Dependencies?
Functional dependencies in database management refer to relationships between attributes in a table, where the value of one more attribute determines the value of another attribute(s).
Process of Functional Dependencies
Identification
Identity dependencies between attributes based on how their values relate to each other.
Representation
Represent dependencies using functional dependency notation (X -> Y), where X determines Y.
Normalization
Utilize functional dependencies to normalize the database schema, reducing redundancy and improving data integrity.
Example
- Consider a table "Employees" with attributes (EmployeeID, Name, DepartmentID, DepartmentName).
- Here, DepartmentName is functionally dependent on DepartmentID, as each DepartmentID uniquely determines the DepartmentName.
- The functional dependency can be represented as DepartmentID -> Department Name.
Importance
Data Integrity
Ensures data consistency and accuracy by avoiding inconsistent or redundant data entries.
Database Design
Guides the design process, helping to organize data efficiently and minimize data redundancy.
Normalization
Forms the basis for normalization techniques like 1NF, 2NF, 3NF, etc., ensuring databases adhere to standard forms.
Limitations
- Complexity: Managing functional dependencies can become complex in large databases with numerous attributes and relationships.
- Performance Impact: Excessive normalization based on functional dependencies can impact query performance due to increased join operations.
Armstrong's Inference Rule
- Armstrong's inference rules are a set of rules used in database normalization and dependencies analysis.
- These rules help identify functional dependencies between attributes, aiding in the normalization process to eliminate data redundancy and improve data integrity.
- Armstrong's inference rules are based on logical implications and can be expressed in the form of implications, guiding the analysis of functional dependencies in a database schema.
Rules
Reflexivity (Reflexive Rule)
- If Y is a subset of X, then X -> Y holds.
- Example: If {A,B} -> {A}, then {A,B} -> {A,B} holds.
Augmentation (Augmentation Rule)
- If X -> Y holds, the XZ -> XZ holds for any Z.
- Example: if {A} -> {B}, then {A,C} -> {B,C} holds.
Transitivity (Transitive Rule)
- If X-> Y and Y -> Z holds, then X -> Y holds.
- Example: If {A} -> {B} and {B} -> {C}, then {A} -> {C} holds.
Union (Union Rule)
- If X -> Y and X-> Z hold, then X -> YZ holds.
- Example: If {A} -> {B} and {A} -> {C}, then {A} -> (B,C} holds.
Decomposition (Decomposition Rule)
- If X -> YZ holds, then X -> Y and X -> Z hold.
- Example: If {A} -> {B,C}, then {A} -> {B} and {A} -> {C} hold.
Importance of Armstrong's Inference Rule
Dependency Analysis
Armstrong's rules are fundamental in analyzing functional dependencies and identifying key relationships in a database schema.
Normalization
These rules guide the normalization process, ensuring databases adhere to higher normal forms like 2NF, 3NF, BCNF, etc.
Data Integrity
By reducing data redundancy and anomalies, applying Armstrong's rules improves data integrity and consistency.
Data Normalization in DMBS
- Normalization is the process of organizing data in a database to eliminate redundancy and dependency anomalies.
- It involves dividing large tables into smaller, more manageable ones and defining relationships between them to achieve data integrity and efficiency.
- Reduce data redundancy: Store each piece of information in one place to avoid repetition.
- Minimize data modification anomalies: Ensure that updates or deletions in data do not cause inconsistencies.
- Improve data integrity: Maintain accurate and reliable data across the database.
- Enhance database structure: Organize data logically to improve query performance and maintainability.
Normal Forms
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
First Normal Form (1NF)
- Each table has a primary key.
- All attributes are atomic (indivisible)
- Example: Consider a table "Students" with columns (StudentID, Name, Courses).
- Break "Courses" into a separate table "StudentCourses" with columns (StudentID, Course).
Second Normal Form (2NF)
- Must be in 1NF.
- Eliminate partial dependencies by moving attributes that depend on part of the primary key to separate tables.
- Example: In "Student Courses" if (StudentID, Course) is the composite primary key and "Course: depends only on "Course", create a "Courses" table with columns (Course, CourseName).
Third Normal Form (3NF)
- Must be in 2NF.
- Eliminate transitive dependencies by moving non-key attributes that depend on the other non-key attributes to separate tables.
- Example: If "Course" depends on "Department", create a "Departments" table with columns (DepartmentID, DepartmentName) and link it to the "Courses" table.
Boyce-Codd Normal Form (BCNF)
- Must be in 3NF.
- Every determinant is a candidate key.
- Example: If there are functional dependencies like StudentID -> Name and StudentID, Course -> Grade, ensure StudentID is the candidate key, and Course is functionally dependent on StudentID.
Example
Step 1: Create the Initial Tables
-- Create Employees table
CREATE TABLE Employees (
EmpID INT PRIMARY KEY,
Name VARCHAR(100) NOT NULL,
DeptID INT,
DeptName VARCHAR(100)
);
-- Create Departments table
CREATE TABLE Departments (
DeptID INT PRIMARY KEY,
DeptName VARCHAR(100),
ManagerID INT,
ManagerName VARCHAR(100)
);
Step 2: Normalize the Employees Table
-- Create a new table for Employee Details
CREATE TABLE EmployeeDetails (
EmpID INT PRIMARY KEY,
DeptName VARCHAR(100),
FOREIGN KEY (EmpID) REFERENCES Employees(EmpID)
);
-- Drop redundant column DeptName from Employees table
ALTER TABLE Employees DROP COLUMN DeptName;
Step 3: Normalize the Departments Table
-- Create a new table for Managers
CREATE TABLE Managers (
ManagerID INT PRIMARY KEY,
ManagerName VARCHAR(100),
FOREIGN KEY (ManagerID) REFERENCES Employees(EmpID)
);
-- Drop redundant columns ManagerID and ManagerName from Departments table
ALTER TABLE Departments DROP COLUMN ManagerID;
ALTER TABLE Departments DROP COLUMN ManagerName;
This normalization process ensures that data is organized efficiently, eliminating redundancy and maintaining data integrity.
Conclusion
Now we have a basic understanding of Data Normalization: Functional dependencies, Armstrong's inference rule, & Normalization (up to BCNF).