Skip to content

Advanced Data Modeling

Advanced Data Modeling

Note
  • Learning Objectives
    • Describe the main extended entity relationship (EER) model constructs and how they are represented in ERDs and EERDs
    • Use entity clusters to represent multiple entities and relationships in an entity relationship diagram (ERD)
    • Describe the characteristics of good primary keys and how to select them
    • Apply flexible solutions for special data-modeling cases

Extended Entity Relationship Model (EERM)

  • Result of adding more semantic constructs to the original entity relationship (ER) model
  • EER diagrams (EERDs) use the EER model

Entity Supertypes and Subtypes

  • Entity supertype
    • Generic entity type related to one or more entity subtypes
    • Contains common characteristics
  • Entity subtype
    • Contains unique characteristics of each entity subtype
    • Criteria to determine usage
  • There must be different, identifiable kinds of the entity in the user’s environment
    • The different kinds of instances should each have one or more attributes that are unique to that kind of instance

Specialization Hierarchy

  • Entity supertypes and subtypes are organized in a specialization hierarchy
    • Depicts arrangement of higher-level entity supertypes and lower-level entity subtypes
    • Relationships are described in terms of “is-a” relationships
    • Subtype exists within the context of a supertype
    • Every subtype has one supertype to which it is directly related
    • Supertype can have many subtypes
  • A specialization hierarchy provides the means to:
    • Support attribute inheritance
    • Define a special supertype attribute known as the subtype discriminator
    • Define disjoint or overlapping constraints and complete or partial constraints

Specialization Hierarchy

Inheritance

  • Enables an entity subtype to inherit attributes and relationships of the supertype
    • All entity subtypes inherit their primary key attribute from their supertype
    • At the implementation level, supertype and its subtype(s) maintain a 1:1 relationship
    • Entity subtypes inherit all relationships in which supertype entity participates
    • Lower-level subtypes inherit all attributes and relationships from its upper-level supertypes

Inheritance

Subtype Discriminator

  • Attribute in the supertype entity that determines to which entity subtype the supertype occurrence is related
    • Default comparison condition is the equality comparison
    • In some situations the subtype discriminator is not necessarily based on an equality comparison

Disjoint and Overlapping Constraints

  • Disjoint subtypes: contain a unique subset of the supertype entity set
    • Known as nonoverlapping subtypes
    • Implementation is based on the value of the subtype discriminator attribute in the supertype
  • Overlapping subtypes: contain nonunique subsets of the supertype entity set
    • Implementation requires the use of one discriminator attribute for each subtype

Disjoint and Overlapping Constraints

Completeness Constraint

  • Specifies whether each supertype occurrence must also be a member of at least one subtype
    • Partial completeness: not every supertype occurrence is a member of a subtype
    • Total completeness: every supertype occurrence must be a member of at least one subtypes

Specialization Hierarchy Constraint Scenarios

Specialization and Generalization (AKA Top-Down and Bottom-Up Development)

  • Specialization
    • Top-down process
    • Identifies lower-level, more specific entity subtypes from a higher-level entity supertype
    • Based on grouping unique characteristics and relationships of the subtypes
  • Generalization
    • Bottom-up process
    • Identifies a higher-level, more generic entity supertype from lower-level entity subtypes
    • Based on grouping common characteristics and relationships of the subtypes

Entity Clustering

  • Virtual entity type used to represent multiple entities and relationships in ERD
    • Formed by combining multiple interrelated entities into a single, abstract entity object
    • General rule: avoid the display of attributes to eliminate complications that result when the inheritance rules change

Entity Clusters

Selecting Primary Keys

  • Primary keys: single attribute or a combination of attributes
    • Uniquely identifies each entity instance
    • Guarantees entity integrity
    • Works with foreign keys to implement relationships
Natural Keys and Primary Keys
  • Natural key or natural identifier: real-world identifier used to uniquely identify real-world objects
    • Familiar to end users and forms part of their day-to-day business vocabulary
    • Used as the primary key of the entity being modeled
Primary Key Guidelines
  • Desirable primary key characteristics
    • Non intelligent
    • No change over time
    • Preferably single-attribute
    • Preferably numeric
    • Security-compliant
When to Use Composite Primary Keys
  • Identifiers of composite entities (i.e. things like Invoice/Invoice Line)
    • Each primary key combination is allowed once in M:N relationship
  • Identifiers of weak entities (i.e. Dependent because it doesn’t exist on its own)
    • Strong identifying relationship with the parent entity
    • Represents a real-world object that is existence-dependent on another real-world object
    • Represented in the data model as two separate entities in a strong identifying relationship

Good use of Composite PK

When to Use Surrogate Primary Keys
  • Primary key used to simplify the identification of entity instances
    • Useful when there is no natural key
    • Helpful if selected candidate key has embedded semantic contents or is too long
  • Require ensuring that the candidate key of entity in question performs properly
    • Use “unique index” and “not null” constraints

Implementing 1:1 Relationships

  • Foreign keys work with primary keys to properly implement relationships in relational model
    • Place primary key of the parent entity on the dependent entity as foreign key
  • Options for selecting and placing the foreign key
    • Place a foreign key in both entities
    • Place a foreign key in one of the entities

Foreign Key Selection in 1:1

Maintaining History of Time-Variant Data

  • Time-variant data: data whose values change over time and for which a history of the data changes must be retained
    • Requires creating a new entity in a 1:M relationship with the original entity
    • New entity contains the new value, date of the change, and any other pertinent attribute

Time Variant Salary History Example Time Variant Manager History Example Time Variant Job History Example

Fan Traps

  • Design trap: occurs when a relationship is improperly or incompletely identified
    • Represented in a way not consistent with the real world
  • Fan trap: occurs when one entity is in two 1:M relationships to other entities
    • Produces an association among other entities not expressed in the model

Redundant Relationships

  • Occur when there are multiple relationship paths between related entities
    • Must remain consistent across the model
    • Help simplify the design