We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

"Normalize until it hurts; denormalize until it works"

Formal Metadata

Title
"Normalize until it hurts; denormalize until it works"
Title of Series
Number of Parts
50
Author
Contributors
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
There’s a good practice that says “a database is a representer of facts”. If there’s more than one way to extract a single fact from the database, then there’s a redundancy in it. Every redundancy can cause different anomalies in the data, which in turn cause bugs in the application. To avoid that, there’s a process called normalization, which involves following sets of rules to restructure the database to remove redundancies without losing the original facts. The traditional set of normalization rules are the so-called Normal Forms: First Normal Form, Second, Third, etc. Unfortunately, those are frequently overlooked by developers due to their excessive formalism. But in fact, even the Normal Forms aren’t enough to avoid anomalies, since they’re concerned about redundancies only in a single table*. Since cross-table dependencies are very common in modern applications, we must go beyond normal forms to prevent problems. In this talk, we’ll present normalization rules on a friendly language, going beyond normal forms. We’ll understand how the software requirements cause dependencies in database tables, both in-table and cross-tables. We’ll show real examples of non-trivial dependencies that happen on Django models. We’ll discuss how normalization prevents redundancies, inconsistencies, anomalies, and bugs. Knowing that normalization can cause slowdowns in queries, we’ll present how to increase performance with denormalization, which is not the same of not normalizing. Instead, denormalization means being able to represent data in multiple ways to speed up queries without introducing inconsistencies. We’ll discuss Django-related denormalization tools that use cronjobs, indexes, caching, materialized views and triggers, and NoSQL. *It’s common to ignore the fact that normal forms only discuss redundancies inside a single table/record/relval. More about this in this article reviewed by Codd, Fagin and Date, key figures of the relational model.