E.2 Entity-Relationship Modeling

Entity-relationship (ER) modeling is a simple and clear method of expressing the design of database. ER modeling isn't new?it was first proposed by Chen in 1976?but it has emerged as the dominant modeling technique only in the past 15 years.

Figure E-2 shows a partial model of the winestore. In this diagram, you can see the relationship between a wine, a winery, and a wine-growing region. Each wine has attributes such as a wine_name, year, and a description. A wine is made by a winery, and each winery has a winery_name. Many wineries are located in a region, where a region has a region_name.

Figure E-2. A simple ER model showing the relationship between wines, wineries, and regions

ER diagrams aren't complicated, and we have already illustrated most of the features of ER modeling in Figure E-2. These features include:

Rectangles: Represent entities, the objects being modeled. Each entity is labeled.
Diamonds: Represent relationships between entities; a relationship is labeled with a descriptive title that explains how the entities interact.
Ellipses: Represent attributes that describe an entity. Underlined attributes are primary keys that uniquely identify instances of the entity.
Lines between rectangles and diamonds: Connect entities to relationships. Lines may be annotated, which means that the two ends can be marked with an M and an N, an M and a 1, an N and a 1, or a 1 and a 1. Annotations indicate the cardinality of the relationship, discussed later in this section.
Lines between rectangles and ellipses: Connect attributes to entities. These lines are never labeled.

Other ER modeling tools include double ellipses, dashed ellipses, and double lines; we use some of these advanced features later in this appendix.

E.2.1 Case Study: Modeling the Online Winestore

To illustrate how ER modeling can be used to effectively design a database, we return to our online winestore.

E.2.1.1 Identifying entities in ER modeling

The first step in developing a web database application is to consider the requirements of the system. The requirements for the online winestore are described in Chapter 16 and are typically gathered from a scope document, customer interviews, user requirements documents, and so on. Having identified the general requirements of the system, the first phase in the ER modeling process is to identify the entities from those requirements.

Entities are objects or things that can be described by their characteristics. As you identify entities, you list the attributes that describe the entity. For example, a customer is an entity that has a name, an address, a phone, and other details.

Be careful when choosing entities. A customer or a wine is an entity. Reducing the stock in the inventory and adding it to a shopping cart is a function or process, not an entity. The basic rule is that an entity is an object or thing.

Five entities and their attributes have already been identified previously in this appendix. Four are easy to determine from the winestore requirements:

The wine entity has the attributes wine_name, year, and description. Wines also have a type (such as red or white), but we'll return to this later.
The customer entity has the attributes surname, firstname, initial, address, city, state, zipcode, phone, and birth_date. Customers also have a title (such as Mr. or Mrs.) and reside in a country, but we'll return to this later.
The winery entity has the attribute winery_name.
The region entity has the attribute region_name.

We add a users entity to maintain user account details at the winestore:

The users entity has the attributes user_name and password. The user_name is the customer's email address.

The remaining entities?and, in two cases, the distinction between the entities?are harder to identify. This is where experience, and trial and error come into play.

We have earlier identified the orders entity in our introduction to ER modeling, but an order is hard to precisely define. One description might be:

An order is an object created by a customer when they agree to purchase one or more (possibly different) bottles of wine.

We can then say that orders are created on a date, and the system requirements in Chapter 16 identify that an order has an associated creditcard, card expirydate, and delivery instructions. The credit card details aren't associated with a customer because we want to know with which credit card each order was purchased.

We can also say that this model of orders consists of one or more different wines and, for each different wine, a quantity of that wine is purchased. The subparts in each order?the different wines?are the items that make up the order. But is the wine itself part of an item? The distinction is hard, but the correct answer is probably no: this is a relationship, in which the items that make up orders are related to wines.

There are now two more entities?orders and items?and two relationships, which illustrates how difficult it is to reason about entities without considering how they are related. Determining entities isn't always easy, and many different drafts of an ER model are often required before a final, correct model is achieved. The ER model for the winestore took us several attempts to get right.

Here are the items and orders entities:

The items entity?which is related to orders?has the attributes qty (quantity) and price. In turn, the items entity is related to the wine entity.
The orders entity has attributes date, instructions, creditcard, and expirydate.

The system requirements in Chapter 16 also describe how wine is delivered to the store in shipments. Each shipment is on a particular date and consists of a number of bottles at a particular price. How is this to be incorporated into the model? Perhaps the most obvious solution is to add quantity and price attributes to the wine entity.

Adding a quantity and price to the wine entity doesn't work well. Shipments arrive at different times and the price of a wine can change over time. We therefore need to maintain different prices for different shipments so that we can charge the user the correct price for the bottles in that shipment. In Australia, for example, it's illegal to sell an old shipment at a higher price than you've previously advertised when a new shipment arrives.

A good solution to the inventory problem is an inventory entity. This entity is related to the wine, and maintains different data for each shipment of each wine:

The inventory entity has an on_hand quantity, a per bottle cost, and a date_added.

The last major entity is somewhat of an oddity. If a wine is a Cabernet Merlot, you can simply store the string Cabernet Merlot in an attribute in the wine entity. Another approach is to have a grape_variety entity, where each different grape variety is described individually and you can combine them to create wine blends. So, for example, Cabernet is one instance of a grape_variety entity, and Merlot is another. The grape_variety entity is then related to the wine entity. This approach does seem overly complicated, but let's opt for it anyway because it introduces an instructive twist to our modeling, a many-to-many relationship discussed in the next section.

Here's the grape_variety entity:

The grape_variety entity has the attribute variety (which is a grape type such as Merlot).

There are other possible entities. For example, the shopping basket could be an entity: the shopping cart is an object that contains items that will later be ordered. However, in our application we've built the shopping cart by using the orders entity in a different way and adding some logic in our code to distinguish between completed orders and shopping carts. Including a shopping cart as an entity would perhaps be a valid choice, and depends on how the entities are interpreted from the requirements.

There are also other entities that are outside the scope of our requirements. For example, a county or state might contain many regions, but there is no requirement for these to be modeled in our system. Also, the winestore itself is an entity, but we are actually interested in the entities that make up the winestore, not the whole concept itself. Selecting entities is all about getting the granularity and scope of choice right.

Another common type of entity are lookup tables that store lists of commonly used values. For example, a lookup table that stores Zip Codes and city names is a common entity in many designs. In the winestore application, there's four possible lookup tables: a countries lookup for customers, a titles lookup (for values such as Dr., Mr., Miss, Mrs., or Ms.) for customers, a wine_type lookup (such as red or white) for wines, and the grape_variety entity that we've previously discussed. Let's define these as new entities:

The countries entity has country names
The wine_type entity has wine_type values
The titles entity has customer title values

It's a good idea to include lookup tables as entities. The alternative is to define them as attributes of another table; for example, country could be an attribute of customer. However, if they're defined as attributes of another entity, this doesn't force the user to pick from a list of possible values, leading to possible inconsistencies between values (such as Australia, australia, aust, AUS, and so on) unless you add complex validation to your PHP script. Inconsistency in values makes it difficult to write reports and run queries: for example, it'd be hard to figure out how many customers live in Australia if it's represented in several different ways.

We have hinted at but not explicitly identified the relationships between the entities. For example, a winery is part of a region, a wine is made by a winery, and an item is related to a wine. The first step is to identify the entities and their attributes; the second step is to identify how the entities are related.

E.2.1.2 Identifying relationships in ER modeling

Before identifying the relationships between entities, let's explore the possible types of relationship or cardinalities that can exist. Cardinality refers to the three possible relationships between two entities:^[1]

^[1] Actually, relationships can exist between as many entities as there are in the model. We have deliberately omitted the distinction with relationships that are optional, that is, where one instance of an entity?such as a customer?can exist without a related entity, such as an order. We avoid complex relationships in this appendix; more detail can be found in the books listed in Appendix G

One-to-one: A one-to-one relationship is represented by a line labeled with a 1 at each end that joins two entities. One-to-one means that for the two entities connected by the line, there is exactly one instance of the first entity for each one instance of the second entity. An example might be customers and user details: each customer has exactly one set of user details (a username and a password), and those user details are for only that customer.
One-to-many (or many-to-one): A one-to-many relationship is represented by a line annotated with a 1 and an M (or a 1 and an N). One-to-many means that for the two entities connected by the line, there are one or more instances of the second entity for each one instance of the first entity. From the perspective of the second entity, any instance of the second entity is related to only one instance of the first entity. An example is wineries and wines: each winery sells many wines, but each wine is made by exactly one winery. Many-to-one relationships are the most common relationships between entities.
Many-to-many: A many-to-many relationship is represented by a line annotated with an M and an N. Many-to-many means that for the two entities connected by the line, each instance of the first entity is related to one or more instances of the second entity and, from the other perspective, each instance of the second entity is related to one or more instances of the first entity. An example is the relationship between wineries and delivery firms: a winery may use many delivery firms to freight wine to customers, while a delivery firm may work for many different wineries.

It isn't surprising that many database modelers make mistakes with cardinalities. Determining the cardinalities of the relationships between the entities is the most difficult skill in ER modeling, but one that, when performed correctly, results in a well-designed database. To illustrate how cardinality is determined, let's consider the relationships between the entities in the winestore and present arguments for their cardinalities.

Correctly assigning cardinalities is essential. Mistakes in cardinalities of relationships lead to duplicated data, inconsistencies, and redundancy in the database. All lead to poor performance and a hard-to-maintain database.

E.2.1.3 Relationships in the winestore ER model

Before considering cardinalities, you need to consider what entities are related. You know from previous discussion that a region is related to a winery, and that a winery is related to a wine. There are other relationships that are implicitly identified: orders contains items, a customer places orders, users have customer details, and a wine has an inventory. Also, titles and countries are lookups related to the customer entity, and the wine_type lookup is related to wine.

There is also one crucial relationship that links the wine sold to the customer, that is, the relationship between an items and the inventory. Last, a wine contains one or more different grape_variety entities.

To assign cardinalities, let's start with the relationship of wine to winery. To begin, you need to decide what sort of relationship these entities have and assign a descriptive term. A good description of the relationship between wine and winery is that a winery makes wine. Now draw a diamond labeled makes between the entities wine and winery, and connect the relationship to the two entities with an unannotated line. This process is shown in Figure E-3 (A).

The next step is to determine what cardinality to assign to this relationship. The most effective approach to determining cardinality is to consider the relationship from the perspective of both entities. From the perspective of a winery, the question to ask is: Does a winery make exactly one wine or one or more wine objects? The answer is the latter, so you write M at the wine end of the relationship. From the other perspective?that of the wine?you can ask a second simple question: Is a wine made by exactly one or more than one winery? This answer is the former?that limitation is noted in the system requirements?and you can write a 1 at the winery end of the relationship. The annotated, one-to-many relationship is shown in Figure E-3 (B).

Figure E-3. A partial ER model showing the relationship between wines and wineries

Dealing with the relationship between winery and region involves similar arguments. You begin by describing the relationship. In this case, an appropriate label might be that a winery is situated in a region. After drawing the diamond and labeling it, now consider the cardinalities. A winery belongs in exactly one region, so label the region end with a 1. A region can contains more than one winery, so you label the winery end with an M.

There are six more relationships that can be completed using the same one-to-many arguments:

The consists-of relationship between orders and items
The purchase relationship between customer and orders
The stocked relationship between wine and inventory
The classed-as relationship between wine and the wine_type lookup
The lives-in relationship between customer and the countries lookup
The titled relationship between customer and the titles lookup

You can label all six with a 1 and an M (or N). These relationships are shown as part of Figure E-4.

Figure E-4. An almost-complete ER model for the winestore

You know that the users and customer have a one-to-one relationship. So, let's draw a line between the two entities and label it with a 1 at each end. Label the relationship as logs-in.

The final two relationships are a more difficult to identify and annotate.

The first is the relationship between an order's items and a wine. The one-to-many cardinality isn't a difficult proposition, but determining that this relationship actually exists is harder. When considering what makes up orders, there are two possibilities: items can be related to a specific inventory entry, or items can be related to a wine. The former is possibly more intuitive because the items that are delivered are bottles from our inventory. However, the latter works better when modeling the system's data requirements. So, let's settle on items being related to wine and label the relationship sells.

The second difficult?and final?relationship is that between wine and grape_variety. Naming the relationship is easy: let's call this relationship blend. Determining the cardinality is harder. First, consider the relationship from the wine perspective. A wine can contain more than one grape variety when it is a blend, so you label the grape variety end of the relationship with an M. Now consider the relationship from the grape variety perspective. A grape variety, such as semillon, may be in many different wines. So, let's settle on a many-to-many relationship and label the wine end with an N.

Our ER model is almost complete, and Figure E-4 shows it with all its entities and relationships. What remains is to consider the key attributes in each of the entities, which are discussed in the next section. As you consider these, you can adjust the types of relationships slightly.

Before we move on, let's summarize what we've done so far. There are a few rules that determine what relationships, entities, and attributes are, and what cardinalities should be used:

Expect to draft a model several times.
Begin modeling with entities, add attributes, and then determine relationships.
Include an entity only when it can be described with attributes that are needed in the model.
Some entities can be modeled as attributes. For example, a state can be an entity, but it might be better modeled as one of the attributes that is part of an customer.
Avoid unnecessary relationships. Create only those relationships that are needed in the system.
One-to-one relationships are uncommon. If two entities participate in a one-to-one relationship, check that they aren't actually the same entity.
Many-to-many relationships are complex. Use one-to-many relationships in preference where possible.

E.2.1.4 Identifying key attributes in ER modeling

In our introduction to ER modeling, we noted some of the implicit constraints of our model, including that there is only one customer #37 and one wine that we refer to as #168. However, in the modeling so far, we haven't considered how to uniquely identify each entity with a primary key.

Uniqueness is an important constraint. When a customer places an order, you must be able to uniquely identify that customer and associate the unique order with that unique customer. You also need to be able to uniquely identify the wine that a customer purchases. In fact, all entities must be uniquely identifiable, and this is true for all relational databases.

So, the next step in ER modeling is to identify the attributes or sets of attributes that uniquely identify an entity. Let's begin with the customer. A surname (or any combination of names) doesn't uniquely identify a customer. A surname, firstname, initial, and a complete address may work, although there are some cases where children and parents share the same name and address.

A less complicated approach for unique identification?and a common one that's guaranteed to work?is to add an identifier number (ID) attribute to the entity. Identifiers are used in all aspects of life?not just databases?and include such things as phone, passport, and street numbers, and Zip codes and IP addresses. A short unique identifier also leads to better database performance than using several string attributes, as discussed in Chapter 15.

Using this approach, we can create a cust_id attribute and assign ID #1 to the first customer, ID #2 to the second customer, and so on. In the modeling process, this new attribute is underlined to indicate that it uniquely identifies the customer, as shown later in Figure E-5.

You can take the same approach with wine and add an ID attribute, wine_id. For winery and region, the name attribute is most likely unique or, at least, it can be made so. However, for simplicity, you can still use the ID attribute approach to prevent any ambiguity or need for the winestore administrator to carefully choose unique names. The same argument can be applied to the grape_variety, titles, wine_type, and countries entities.

The orders entity can also be dealt with by a unique ID, as can items and inventory. However, there are two ways to tackle the problem. First, you can uniquely number each of the orders across the whole database, beginning with the application's first order_id #1. Alternatively, you can begin each of the customer's orders with order_id #1. The combination of cust_id and order_id is still unique: for example, cust_id #37, order_id #1 is different from cust_id #15, order_id #1.

An advantage of combining a customer and order number is that the number is more meaningful to the user. For example, a user can tell from the combined number how many orders they've placed, and the order_id provides a convenient counting tool for reporting. We use this approach in the winestore application.

Because orders can't exist without a customer, they're weak entities that are involved in a full participation relationship. Full participation means that orders aren't possible without a related customer and, because the cust_id forms part of the orders entity's unique identifier, orders are a weak entity. Participation is discussed briefly in the next section and weak entities are discussed in more detail later in this chapter.

You can follow the same argument about unique identification for items. The items can be uniquely numbered across the whole database or can be numbered from #1 within an order. Again, we follow the latter approach. The same applies for inventory, which is numbered within a wine since there are potentially many different shipments of each wine.

The only entity remaining is users. The user_name attribute must be unique, so we can choose it to uniquely identify the rows.

E.2.1.5 Other ER modeling tools

Other ER modeling tools include double ellipses, double lines, and double rectangles. These tools permit the representation of other constraints, multivalued attributes, and the specification of full participation. In addition, it's possible for a relationship to have an attribute, that is, for a diamond to have attributes that are part of the relationship, not part of the entities. Useful references for more advanced ER modeling?and enhanced ER (EER) modeling?are provided in Appendix G.

Double lines between a relationship diamond and an entity indicate full participation and represent cases where an instance of one entity can't exist without a corresponding instance of the entity that it is related to. An example is the orders entity in the winestore model. An instance of orders can't exist without a customer to make that order. Therefore, the relationship between orders and customer should be represented as a double line. The same constraints apply in the model to items and inventory.

Double rectangles represent weak entities. A weak entity isn't uniquely identifiable without including the key of the entity it's related to. For example, in the previous section, we explained how orders are uniquely identified by a combination of the customer ID and the orders ID. Without a customer, an order isn't uniquely identifiable and so it's a weak entity. The same applies to items and inventory.

Dashed ellipses represent multivalued attributes, attributes that may contain more than one instance. For example, the attribute address can be multivalued, because there could be a business address, a postal address, and a home address. Multivalued attributes aren't used in our model.

In addition, there are other extensions to the modeling techniques that have already been applied. For example, more than two entities can be related in a relationship (that is, more than two entities can be connected to a diamond). For example, the sale of a wine can be described as a three-way relationship between a wine, a customer, and orders. A second complex technique is the composite attribute; for example, an attribute of customer is address and the attribute address has its own attributes, a street, city, and zipcode. We don't explore complex relationships in this book.

E.2.1.6 Completing the ER model

Figure E-5 shows the final ER model with the unique key constraints shown. Notice that for items, orders, and inventory, the attributes from other entities aren't included. They are instead indicated as weak entities with a double rectangle and they participate fully in the related entities as indicated by double lines.

Figure E-5. The complete ER model for the winestore database

If items, orders, and inventory were numbered across the whole system, you could omit the double rectangles. The double lines can be omitted if any entity can exist without the related entity.

A summary of ER notation tools is shown in Figure E-6.

Figure E-6. Tools used in ER modeling

E.2.2 Converting an Entity-Relationship Model to SQL

There are five steps to convert an ER model to SQL CREATE TABLE statements.

E.2.2.1 Step 1: Convert regular entities to tables

The first step is the simplest. Here's what you do:

For each non-weak entity in the ER model, write out a CREATE TABLE statement with the same name as the entity.
Include all attributes of the entity and assign appropriate types to the attributes.
Include as table attributes all of the ER model attributes that uniquely identify the entity and add the NOT NULL modifier to them. Include a PRIMARY KEY clause that lists the attributes.

To perform this step, you need to make decisions about attribute types in the SQL CREATE TABLE statements. Attribute types are discussed in Chapter 5.

There are several non-weak entities in the model. Let's begin with the region entity, which has the attributes region_id and region_name. You might anticipate no more than 100 different regions, but let's be cautious in case more than 1,000 regions need to be stored. So, let's decide that region_id is an int (integer) type and that it has a width of 4 digits. Let's assume that a region name can be at most 100 characters in length and define region_name as a varchar(100).

As decided earlier in the appendix, the unique key of the region table is an ID, which is now called region_id. Accordingly, you define a PRIMARY KEY of region_id. A requirement of all primary keys is that they are specified as NOT NULL, and this is added to the attribute.

The resulting definition for the region table is as follows:

CREATE TABLE region (

  region_id int(4) NOT NULL,

  region_name varchar(100) NOT NULL,

  PRIMARY KEY (region_id),

  KEY region (region_name)

) type=MyISAM;

Notice an additional KEY on the region_name named region. By adding this key, we're anticipating that a common query is a search by region_name. Also, a region must have a name, so a NOT NULL is added to the region_name attribute. Last, we've added a table type of MyISAM to the end of the definition; table types are discussed in Chapter 15.

The CREATE TABLE statements for the other non-weak entities are listed next. Remember, however, that this is only the first step: some of these CREATE TABLE statements are altered by the processes in later steps. Here they are:

CREATE TABLE countries (

  country_id int(4) NOT NULL,

  country char(30) NOT NULL,

  PRIMARY KEY (country_id),

  KEY (country)

) type=MyISAM;



CREATE TABLE customer (

  cust_id int(5) NOT NULL,

  surname varchar(50),

  firstname varchar(50),

  initial char(1),

  address varchar(50),

  city varchar(50),

  state varchar(20),

  zipcode varchar(10),

  phone varchar(15),

  birth_date char(10),

  PRIMARY KEY (cust_id)

) type=MyISAM;



CREATE TABLE grape_variety (

  variety_id int(3) NOT NULL,

  variety varchar(50) DEFAULT '' NOT NULL,

  PRIMARY KEY (variety_id),

  KEY var (variety)

) type=MyISAM;



CREATE TABLE titles (

  title_id int(2) NOT NULL,

  title char(10),

  PRIMARY KEY (title_id)

) type=MyISAM;



CREATE TABLE users (

  user_name varchar(50) NOT NULL,

  password varchar(32) NOT NULL,

  PRIMARY KEY (user_name),

  KEY password (password)

) type=MyISAM;



CREATE TABLE wine (

  wine_id int(5) NOT NULL,

  wine_name varchar(50) NOT NULL,

  year int(4) NOT NULL,

  description blob,

  PRIMARY KEY (wine_id),

  KEY name (wine_name),

) type=MyISAM;



CREATE TABLE winery (

  winery_id int(4) NOT NULL,

  winery_name varchar(100) NOT NULL,

  PRIMARY KEY (winery_id),

  KEY name (winery_name),

)  type=MyISAM;

E.2.2.2 Step 2: Convert weak entities to tables

The second step is almost identical to the first but is used for weak entities. Here's what you do:

For each weak entity in the model?there are three: inventory, orders, and items?translate the entity directly to a CREATE TABLE statement as in Step 1.
Include all attributes as in Step 1.
Include as attributes the primary key attributes of the owning entity, the entity the weak entity is related to. Then, include these attributes as part of the primary key of the weak entity.

For example, for the inventory entity, create the following:

CREATE TABLE inventory (

  wine_id int(5) NOT NULL,

  inventory_id int(3) NOT NULL,

  on_hand int(5) NOT NULL,

  cost decimal(5,2) NOT NULL,

  date_added date,

  PRIMARY KEY (wine_id,inventory_id)

) type=MyISAM;

The wine_id is included from the wine table and forms part of the PRIMARY KEY definition. However, just because we've done this, it doesn't mean that our SQL queries can automatically discover the relationship between wine and inventory or that the numbering in the identifiers will be synchronized. You need to maintain the relationships using SQL queries that specify the join attributes in WHERE clauses. This is discussed in Chapter 5.

Note that all attributes can't be NULL in the inventory table, so NOT NULL is used liberally.

A similar approach is taken with orders, in which cust_id is included from the customer table as an attribute and as part of the PRIMARY KEY definition:

CREATE TABLE orders (

  cust_id int(5) NOT NULL,

  order_id int(5) NOT NULL,

  date timestamp(12),

  instructions varchar(128),

  creditcard char(16),

  expirydate char(5),

  PRIMARY KEY (cust_id,order_id)

)  type=MyISAM;

The items table is slightly more complex, but made easier because orders has already been defined. The items table includes the PRIMARY KEY attributes of the entity it is related to (that is, orders). Because the PRIMARY KEY of orders is already resolved, the items table is as follows:

CREATE TABLE items (

  cust_id int(5) NOT NULL,

  order_id int(5) NOT NULL,

  item_id int(3) NOT NULL,

  qty int(3),

  price decimal(5,2),

  PRIMARY KEY (cust_id,order_id,item_id)

) type=MyISAM;

E.2.2.3 Step 3: One-to-one relationships

If two entities have a one-to-one relationship, check that your modeling is correct. If the entities totally participate in each other (where an instance of either entity can't exist without a matching instance in the other entity) and neither participates in another relationship, consider removing one of the tables and merging the attributes into a single table.

If you can't remove one of the entities, follow this process for conversion:

Choose one of the two tables that's involved in the relationship (this table has already been identified and written out as part of Steps 1 or 2). If the relationship involves total participation, choose the entity that totally participates.
In the chosen table, include as an attribute (or attributes) the primary key of the other table.

There is a one-to-one relationship between customer and users in our model. The rule in the first step isn't a constraint, so we arbitrarily choose the users table. Then, we add the identifier cust_id from customer to it:

CREATE TABLE users (

  cust_id int(5) NOT NULL,

  user_name varchar(50) NOT NULL,

  password varchar(32) NOT NULL,

  PRIMARY KEY (user_name),

  KEY password (password),

  KEY cust_id (cust_id)

) type=MyISAM;

E.2.2.4 Step 4: One-to-many relationships

For a one-to-many relationship, here's the procedure:

Identify the table representing the many (M or N) side of the relationship.
Add to the many-side (M or N) table the primary key of the 1-side table.
Optionally, add NOT NULL to any attributes added.

In the model, this means adding a winery_id and a wine_type identifier to the wine table:

CREATE TABLE wine (

  wine_id int(5) NOT NULL,

  wine_name varchar(50) NOT NULL,

  wine_type int(2) NOT NULL,

  year int(4) NOT NULL,

  winery_id int(4) NOT NULL,

  description blob,

  PRIMARY KEY (wine_id),

  KEY name (wine_name),

  KEY winery (winery_id)

) type=MyISAM;

For the winery table, it means adding a region_id:

CREATE TABLE winery (

  winery_id int(4) NOT NULL,

  winery_name varchar(100) NOT NULL,

  region_id int(4) NOT NULL,

  PRIMARY KEY (winery_id),

  KEY name (winery_name),

  KEY region (region_id)

)  type=MyISAM;

For the items table, it means adding a wine_id:

CREATE TABLE items (

  cust_id int(5) NOT NULL,

  order_id int(5) NOT NULL,

  item_id int(3) NOT NULL,

  wine_id int(4) NOT NULL,

  qty int(3),

  price decimal(5,2),

  PRIMARY KEY (cust_id,order_id,item_id)

) type=MyISAM;

For the customer table, it means adding a country_id and a title_id:

CREATE TABLE customer (

  cust_id int(5) NOT NULL,

  surname varchar(50),

  firstname varchar(50),

  initial char(1),

  title_id int(3),

  address varchar(50),

  city varchar(50),

  state varchar(20),

  zipcode varchar(10),

  country_id int(4),

  phone varchar(15),

  birth_date char(10),

  PRIMARY KEY (cust_id)

) type=MyISAM;

In cases where you wish to prevent a row being inserted without a corresponding value in the related table, you can add a NOT NULL to the newly added attribute.

E.2.2.5 Step 5: Many-to-many relationships

For many-to-many relationships?there is one in our model between wine and wine_variety?the following procedure is used:

Create a new table with a composite name made of the two entities that are related.
Add the primary keys of the two related entities to this new table.
Add an ID attribute if the order of relationship is important. For example, in the winestore, a Cabernet Merlot Shiraz is different from a Shiraz Merlot Cabernet, so an ID is required.
Define the primary key of the new table to be the primary keys of the two related entities.

For the winestore, this leads to creating the following table:

CREATE TABLE wine_variety (

  wine_id int(5) NOT NULL,

  variety_id int(3) NOT NULL,

  id int(1) NOT NULL,

  PRIMARY KEY (wine_id,variety_id),

) type=MyISAM;

The table contains the primary keys of wine and grape_variety and defines these as the PRIMARY KEY. No change is required to the wine or grape_variety tables.

Our conversion of the model to SQL is now complete, and the database is ready to be created and loaded with data.