
Database Abstraction: Aggregation and Generalization

In 1977 John Miles Smith and Diane C.P. Smith published the paper “Database Abstractions: Aggregation and Generalization” in ACM Transactions on Database Systems, Vol. 2, No. 2 (June 1977). Until then, database research had been concerned almost exclusively with aggregation, for example in Codd’s normal forms, while generalization had been largely ignored.

The paper develops a structuring discipline for generalization abstraction and integrates it with aggregation abstraction in database design, particularly within Codd’s relational model. The result is a single structuring discipline that covers both aggregation and generalization.

The benefits of such a structuring discipline are stability of data models, easier understanding of complex models, a more systematic approach to database design, and support for highly structured models without loss of intellectual manageability.

Generalization is essential when designing a database that models the real world, because it lets users apply established thought patterns when interacting with the database.

An abstraction is a model of a system in which certain details are deliberately omitted. The model can be decomposed into a hierarchy of abstractions. A relation in Codd’s relational schema supports two distinct forms of abstraction – aggregation and generalization.

Aggregation and generalization are fundamentally important in database design: both help to reduce the complexity of modeling.

Aggregation refers to an abstraction in which a relationship between objects is regarded as a higher-level object. Generalization refers to an abstraction in which a set of similar objects is regarded as a generic object.

When an appropriate structuring discipline is imposed, Codd’s relational schema can therefore support hierarchies of both aggregation abstractions and generalization abstractions at the same time.
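The two abstractions can be illustrated with a small Python sketch. It is only an analogy in an object-oriented language, not the relational formulation used in the paper. The classes Person, Student and Professor follow the hierarchy used later in this text, while Enrollment, its fields and the helper full_name are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import date


# Generalization: Student and Professor are regarded as the generic object Person.
@dataclass(frozen=True)
class Person:
    identification: str
    first_name: str
    last_name: str


@dataclass(frozen=True)
class Student(Person):
    semester: int


@dataclass(frozen=True)
class Professor(Person):
    salary: int


# Aggregation: a relationship between a person, a course and a date is
# regarded as one higher-level object. Enrollment is a hypothetical example.
@dataclass(frozen=True)
class Enrollment:
    person: Person
    course: str
    enrolled_on: date


def full_name(p: Person) -> str:
    """Works on the generic object and ignores student/professor differences."""
    return f"{p.first_name} {p.last_name}"


john = Student("01", "John", "Meyer", semester=6)
enrollment = Enrollment(john, "Databases", date(1977, 6, 1))
```

Code written against Person does not need to know which category an individual belongs to, and code that handles an Enrollment can treat it as a single object without looking at its components.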

Generalization in Codd’s relational model

One method for representing a generic hierarchy as Codd relations is to create one relation for each generic object in the hierarchy.



Figure 1. A generic hierarchy.

The Codd relations for the three generic objects in the hierarchy of Fig. 1 are:

Person:

identification  First-Name  Last-Name  Category
01              John        Meyer      Student
02              Peter       Kidman     Professor

Student:

identification  First-Name  Last-Name  Semester
01              John        Meyer      6

Professor:

identification  First-Name  Last-Name  Salary
02              Peter       Kidman     100.000
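The three relations above can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch, not the paper's notation. The table and column names are lower-cased versions of the ones shown, the salary is stored as the integer 100000 on the assumption that the period is a thousands separator, and the helper orphans is a hypothetical name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        identification TEXT PRIMARY KEY,
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        category       TEXT NOT NULL
    );
    CREATE TABLE student (
        identification TEXT PRIMARY KEY REFERENCES person(identification),
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        semester       INTEGER NOT NULL
    );
    CREATE TABLE professor (
        identification TEXT PRIMARY KEY REFERENCES person(identification),
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        salary         INTEGER NOT NULL  -- assumes "100.000" means 100000
    );
    INSERT INTO person VALUES ('01', 'John', 'Meyer', 'Student');
    INSERT INTO person VALUES ('02', 'Peter', 'Kidman', 'Professor');
    INSERT INTO student VALUES ('01', 'John', 'Meyer', 6);
    INSERT INTO professor VALUES ('02', 'Peter', 'Kidman', 100000);
""")


def orphans(child: str) -> list:
    """Return keys in a lower-level relation that are missing from person."""
    # child comes only from the fixed table names above, never from user input.
    return conn.execute(
        f"SELECT c.identification FROM {child} AS c "
        "LEFT JOIN person AS p ON p.identification = c.identification "
        "WHERE p.identification IS NULL"
    ).fetchall()


# Combine the generic view (person) with the details of one category (student).
students = conn.execute(
    "SELECT p.first_name, p.last_name, s.semester "
    "FROM person AS p JOIN student AS s USING (identification) "
    "WHERE p.category = 'Student'"
).fetchall()
```

Every individual in a lower-level relation should also appear in the person relation. SQLite does not enforce the REFERENCES clauses unless foreign keys are switched on, so the sketch checks this condition explicitly with orphans.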

Modeling with the generic structure

The paper introduces the generic structure as a structuring primitive for specifying generalizations in relational models. A generic structure specifies two abstractions simultaneously: it describes a relation both as an aggregation of a relationship between different objects and as a generalization of a class containing different objects.
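The double role of such a relation can be sketched as a small Python data structure. This is an illustration under stated assumptions and does not reproduce the syntax defined in the paper. The class GenericStructure, its fields and the method check are hypothetical names, and the Person relation from the previous section serves as the example.

```python
from dataclasses import dataclass, field


@dataclass
class GenericStructure:
    """Hypothetical description of one relation seen as two abstractions."""
    name: str
    # Aggregation view: the component objects whose relationship forms the relation.
    components: list[str]
    # Generalization view: the attribute that names the category of each row ...
    category_attribute: str
    # ... and the classes that the relation generalizes.
    subclasses: list[str] = field(default_factory=list)

    def check(self, row: dict) -> bool:
        """A row fits if it has every component and names a known subclass."""
        has_components = all(c in row for c in self.components)
        return has_components and row.get(self.category_attribute) in self.subclasses


person = GenericStructure(
    name="Person",
    components=["identification", "First-Name", "Last-Name"],
    category_attribute="Category",
    subclasses=["Student", "Professor"],
)

row = {
    "identification": "01",
    "First-Name": "John",
    "Last-Name": "Meyer",
    "Category": "Student",
}
```

Calling person.check(row) tests both views at once: the row must carry all aggregated components, and its category must be one of the generalized classes.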

The graphical notation for the generic structure:



Reference

Smith, John Miles / Smith, Diane C.P.: Database Abstractions: Aggregation and Generalization, ACM Transactions on Database Systems, Vol. 2, No. 2, June 1977