Share
Explore

The Theory of SQL

Dr. E.F. Codd and the foundations of relational theory. A more rigorous explanation.
Here's an attempt to present the teachings in a manner consistent with Codd's original vision:
How to Construct a Relational Database for Your Company: Dr. E.F. Codd's Guidelines
Understand Your Business Model:
Before delving into database design, a comprehensive understanding of the business domain – encompassing all essential data and processes – is paramount.
Identify Distinct Subjects within the Business Model:
Recognize and label each principal subject or topic. For each of these subjects or entities, you will form relational tables.
Dissect Each Entity:
For every entity discerned, enumerate all its properties or characteristics. These attributes will metamorphose into columns (fields) within your tables.
Comprehend the Structure of a Relational Table:
Tables in a relational database comprise tuples or rows. Every tuple represents a single instance or record from your business domain. A crucial component of each tuple is the Primary Key, a unique identifier ensuring each tuple's distinctness. This primary key maintains a one-to-one relationship with every attribute in the tuple. The unequivocal and exclusive nature of these primary keys in the table embodies the First Normal Form (1NF) in relational theory.
Navigate Relationships with Precision:
Relations among entities in the business domain can range from one-to-one, one-to-many, to many-to-many. Particularly with many-to-many relationships, it's imperative to use what's termed a "junction" or "relational" table. For example:
A student can enroll in numerous classes.
Conversely, a class may accommodate multiple students.
To represent this, one might employ a relational table to record each instance of a student enrolling in a class, ensuring both entities maintain their structural integrity.
By adhering to these principles, you can harness the power of the relational model to craft efficient, robust, and scalable databases tailored to your company's needs.

The mathematical precepts of the Relational Database model in terms of set theory, group theory, and relational algebra and relational calculus.


The underpinnings of the relational database model are rooted in mathematics.
Let's take a journey through the foundational concepts.

1. Set Theory:

In set theory, a set is a collection of distinct elements. In the realm of relational databases:
A relation (or table) is essentially a set of tuples (rows). Since it's a set, it means there should be no duplicate rows in a table.
Operations like union (UNION), difference (MINUS or EXCEPT), and Cartesian product (JOIN) are borrowed from set theory.

2. Group Theory:

Though group theory isn't as directly visible in SQL or database design, it plays a role in understanding structures and operations:
The group theory concepts of closure are evident in SQL operations. For instance, the union of two relations (with the same attributes) results in another relation.
arrow

Group theory in relational database theory.

Group theory and the relational database model are both rooted in mathematical logic and structures, but they serve different domains of application. Group theory is a branch of abstract algebra dealing with algebraic structures known as "groups." Meanwhile, the relational database model is based on set theory, relational algebra, and relational calculus. While group theory isn't a primary pillar of relational databases, it offers some useful analogies and contexts.
In the context of relational databases and group theory:
Closure Property: Just as an algebraic group must close over an operation (e.g., the result of any two numbers in the group added together must also be in the group), operations in a relational database, like union or join, must produce valid relations. If you join two tables or union two datasets, you will receive another table or dataset as a result.
Identity Element: In group theory, for any element in a group, there's an identity element relative to the associated operation (e.g., 0 for addition in number groups). In databases, certain operations like the Cartesian product can be likened to this concept. An empty relation can sometimes play the role of an identity relation in certain contexts.
Inverse Element: In groups, every element has an inverse, such that the operation between an element and its inverse yields the identity element. In relational databases, certain operations have inverses. For example, the difference operation can be seen as an inverse to union under certain conditions.
Associative Property: In group theory, the grouped way in which operations are performed doesn't matter (a + (b + c) = (a + b) + c). Analogously, in relational algebra, some operations like join and union are associative.
Binary Operation: Just as groups are defined concerning a particular binary operation, relational databases often involve binary operations, like joins, that combine two relations to produce a third.
However, it's essential to recognize that these are analogies and conceptual bridges. The relational model does not explicitly derive from group theory. Instead, it's fundamentally based on set theory and logic. The associations with group theory can provide an enriched perspective and further understanding of the mathematical foundation but are not core to the functioning or design of relational databases.

3. Relational Algebra:

Relational algebra is a procedural query language involving a set of operations that take one or two relations as input and produce a new relation as output. The fundamental operations include:
Select (σ): Choose rows/tuples from a relation based on a predicate.
Project (π): Choose specified columns from a relation.
Union (⋃): Combine two relations with the same set of attributes.
Set Difference (-): Tuples present in one relation but not in another.
Cartesian Product (x): Combine tuples from two relations.
Rename (ρ): Rename the attributes of a relation.
Join (⨝): Combines tuples based on a certain condition.

4. Relational Calculus:

Relational calculus is a non-procedural query language that uses mathematical predicate calculus.
There are two types of relational calculus:
Tuple Relational Calculus (TRC): Filters tuples using a specific condition or predicate. Query has the form:
{ T | P(T) }
where T is a tuple variable and P(T) is a predicate.
Domain Relational Calculus (DRC): Expresses queries in terms of attributes (columns). The form is similar to:
{ a1, a2, ... an | P(a1, a2, ... an) }
where a1, a2, ... an are attribute names and P is a predicate.
The relational database model leverages a combination of mathematical concepts to provide a solid framework for organizing and querying data.
Dr. Codd's genius was in recognizing and integrating these concepts into a unified model that's both robust and efficient.
Understanding these mathematical foundations can deepen one's grasp of why relational databases operate as they do.
Want to print your doc?
This is not the way.
Try clicking the ⋯ next to your doc name or using a keyboard shortcut (
CtrlP
) instead.