According to Chapple (2000), normalization is “the process of efficiently organizing data in a database.” He explains that the two goals of normalization are “eliminating redundant data (for example, storing the same data in more than one table)” and “ensuring data dependencies make sense (only storing related data in a table).” To experienced database designers these concepts are familiar and commonplace, but they remain important.
The normal forms are a series of guidelines that direct the creation of normalized databases. Chapple describes five levels, from first normal form (1NF) through fifth normal form (5NF). He notes that many systems adhere to levels 1 through 3, with an occasional 4NF. However, he says the “fifth normal form is very rarely seen.”
Chapple summarizes the first four levels as follows.
First normal form (1NF) sets the very basic rules for an organized database:
- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
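As a minimal sketch of these two rules, the Python example below uses the standard sqlite3 module with an in-memory database. The table names (manager_flat, reports_to) and the sample data are hypothetical and chosen only for illustration. The flat table repeats the same kind of column twice; the 1NF version stores one row per manager and subordinate pair and identifies each row with a composite primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Not in 1NF: duplicative columns holding the same kind of data.
    CREATE TABLE manager_flat (
        manager      TEXT,
        subordinate1 TEXT,
        subordinate2 TEXT
    );
    INSERT INTO manager_flat VALUES ('Bob', 'Jim', 'Mary');
    INSERT INTO manager_flat VALUES ('Alice', 'Sam', NULL);

    -- 1NF: one row per fact, identified by a primary key.
    CREATE TABLE reports_to (
        manager     TEXT NOT NULL,
        subordinate TEXT NOT NULL,
        PRIMARY KEY (manager, subordinate)
    );
""")

# Turn each non-empty duplicative column into its own row.
for column in ("subordinate1", "subordinate2"):
    conn.execute(
        f"INSERT INTO reports_to SELECT manager, {column} "
        f"FROM manager_flat WHERE {column} IS NOT NULL"
    )

rows_1nf = conn.execute(
    "SELECT manager, subordinate FROM reports_to "
    "ORDER BY manager, subordinate"
).fetchall()
print(rows_1nf)
```

A third subordinate now requires only a new row, not a new column.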
Second normal form (2NF) further addresses the concept of removing duplicative data:
- Meet all the requirements of the first normal form.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.
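The 2NF steps can be sketched in the same way. In the hypothetical order_items_flat table below, product_name repeats on every row that mentions the product, so it is moved to a separate products table and linked back with a foreign key. All table names, column names, and data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    -- Before: product_name applies to multiple rows of the table.
    CREATE TABLE order_items_flat (
        order_id     INTEGER,
        product_id   INTEGER,
        product_name TEXT,
        quantity     INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO order_items_flat VALUES
        (1, 10, 'Widget', 2),
        (1, 20, 'Gadget', 1),
        (2, 10, 'Widget', 5);

    -- After: the repeated subset moves to its own table...
    CREATE TABLE products (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT NOT NULL
    );
    -- ...and a foreign key relates the remaining rows to it.
    CREATE TABLE order_items (
        order_id   INTEGER NOT NULL,
        product_id INTEGER NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL,
        PRIMARY KEY (order_id, product_id)
    );

    INSERT INTO products
        SELECT DISTINCT product_id, product_name FROM order_items_flat;
    INSERT INTO order_items
        SELECT order_id, product_id, quantity FROM order_items_flat;
""")

product_count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]

# Joining the two tables reproduces the original rows, so nothing was lost.
rejoined_2nf = conn.execute("""
    SELECT oi.order_id, oi.product_id, p.product_name, oi.quantity
    FROM order_items AS oi
    JOIN products AS p ON p.product_id = oi.product_id
    ORDER BY oi.order_id, oi.product_id
""").fetchall()
original_2nf = conn.execute(
    "SELECT * FROM order_items_flat ORDER BY order_id, product_id"
).fetchall()
```

Each product name is now stored once, and the join shows that the split tables still carry the same information as the flat one.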
Third normal form (3NF) goes one large step further:
- Meet all the requirements of the second normal form.
- Remove columns that are not dependent upon the primary key.
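To illustrate the 3NF rule, the sketch below assumes a hypothetical customers_flat table in which city is determined by the zip code and not by the customer key. The zip codes, names, and cities are invented. Moving city into its own table means a correction is made in one row only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: city describes the zip code, not the customer.
    CREATE TABLE customers_flat (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT,
        zip     TEXT,
        city    TEXT
    );
    INSERT INTO customers_flat VALUES
        (1, 'Ann', '00001', 'Springfield'),
        (2, 'Raj', '00001', 'Springfield'),
        (3, 'Lee', '00002', 'Riverton');

    -- After: every remaining column in customers depends on cust_id.
    CREATE TABLE zip_codes (
        zip  TEXT PRIMARY KEY,
        city TEXT NOT NULL
    );
    CREATE TABLE customers (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        zip     TEXT NOT NULL REFERENCES zip_codes(zip)
    );
    INSERT INTO zip_codes SELECT DISTINCT zip, city FROM customers_flat;
    INSERT INTO customers SELECT cust_id, name, zip FROM customers_flat;

    -- A city name is now corrected in exactly one place.
    UPDATE zip_codes SET city = 'Shelbyville' WHERE zip = '00001';
""")

zip_row_count = conn.execute("SELECT COUNT(*) FROM zip_codes").fetchone()[0]
cities_3nf = conn.execute("""
    SELECT c.name, z.city
    FROM customers AS c
    JOIN zip_codes AS z ON z.zip = c.zip
    ORDER BY c.cust_id
""").fetchall()
```

In the flat table the same correction would have touched two rows, with the risk of updating one and missing the other.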
Finally, fourth normal form (4NF) has one additional requirement:
- Meet all the requirements of the third normal form.
- A relation is in 4NF if it has no multi-valued dependencies.
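A multi-valued dependency is easiest to see with an example. In the hypothetical employee_facts table below, an employee's skills and languages are independent of each other, so the table must hold every combination of the two. Splitting it into employee_skills and employee_languages removes that dependency. The names and data are assumptions for illustration only.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Before: skill and language vary independently for each employee.
    CREATE TABLE employee_facts (
        employee TEXT, skill TEXT, language TEXT,
        PRIMARY KEY (employee, skill, language)
    );
    -- After: one table per independent multi-valued fact.
    CREATE TABLE employee_skills (
        employee TEXT, skill TEXT,
        PRIMARY KEY (employee, skill)
    );
    CREATE TABLE employee_languages (
        employee TEXT, language TEXT,
        PRIMARY KEY (employee, language)
    );
""")

skills = ["Python", "SQL"]
languages = ["English", "French", "Hindi"]

# The flat table needs every skill and language combination: 2 x 3 rows.
conn.executemany(
    "INSERT INTO employee_facts VALUES ('Dana', ?, ?)",
    list(product(skills, languages)),
)
conn.executescript("""
    INSERT INTO employee_skills
        SELECT DISTINCT employee, skill FROM employee_facts;
    INSERT INTO employee_languages
        SELECT DISTINCT employee, language FROM employee_facts;
""")

def count(table):
    # Table names come from the fixed strings below, not from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

counts_4nf = (
    count("employee_facts"),
    count("employee_skills"),
    count("employee_languages"),
)

# Joining the two smaller tables rebuilds the original combinations.
rejoined_4nf = conn.execute("""
    SELECT s.employee, s.skill, l.language
    FROM employee_skills AS s
    JOIN employee_languages AS l ON l.employee = s.employee
    ORDER BY s.skill, l.language
""").fetchall()
original_4nf = conn.execute(
    "SELECT employee, skill, language FROM employee_facts "
    "ORDER BY skill, language"
).fetchall()
```

The flat table grows multiplicatively (skills times languages), while the two 4NF tables grow additively, and the join shows that no information is lost by the split.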
Chapple (2000). Database Normalization Basics. Retrieved May 12, 2009, from http://databases.about.com/od/specificproducts/a/normalization.htm