17
Sun, Nov
2 New Articles

Relational Database Design -- Part 2

Change Management
Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

This month's installment covers the logical design of a sample database. This database will be a very simple one, as it is intended to be a training tool and not a database for a finished system.

What is a database? The term database, as I am using it in this article, means a collection of one or more files that solves a particular problem. Unfortunately, most programmers who learned file design on a flat file system tend to design a file at a time. With the availability of relational database technology, it is important to start designing databases instead of just files. A database is composed of files, but it is much more than that. It's a structured repository that captures information concerning the entities of an organization. This repository is composed of files, the relationships between these files, and integrity rules for the information that can be stored.

The Scope

The first step in designing our sample database is to define the scope of our database; that is, the information our database will contain. The easiest way to do this is to write a simple statement of the purpose of our database:

The editorial department needs to keep accurate and timely information concerning the articles that are currently being considered for publication in an issue of the magazine. The information concerning the author of the article must also be kept.

This general statement gives us some boundaries to begin working in. Although this step may seem to be so elementary that you may be tempted not to do it -- do it anyway. Database design is no different than any other structured design methodology; it's a top-down design approach. Start at a very broad level and then gradually decompose the problem until a detailed solution is reached.

Entity Relationships

The next step is to establish the entities that will make up the database, and define their relationships. This will further refine the scope of our database. The easiest way to implement this step is to use entity- relationship (E-R) diagrams.

An E-R diagram shows the main groupings of data as boxes and a simplified version of the relationships between these groups as diamonds. 1 shows a sample E-R diagram for our sample database. In plain English, the chart says that an AUTHOR WRITES an ARTICLE that is then SCHEDULED to be published in an ISSUE.

An E-R diagram shows the main groupings of data as boxes and a simplified version of the relationships between these groups as diamonds. Figure 1 shows a sample E-R diagram for our sample database. In plain English, the chart says that an AUTHOR WRITES an ARTICLE that is then SCHEDULED to be published in an ISSUE.

The E-R diagram offers many advantages. The diagram can be understood by both technical and non-technical personnel. This allows user involvement to ensure that important details haven't been forgotten. The E-R diagram does not depend upon any particular system for the physical implementation of the design. As a result, this design technique may be used for any database management system, including a flat file system.

After the E-R diagram is complete, the definition of the scope of the database is complete. As the design progresses, other entities may arise that were not included in the original diagram and you should make sure that the diagram is updated since this diagram is an important tool for the maintenance of the database. The diagram provides a quick and easy overview of what is going on in the database. Keep in mind that sometime in the life of the database, you or someone else will be required to modify the structure of the database.

Defining the Objects

The next step is to define the objects that will make up the database. These objects are the entities (boxes) from the E-R diagram. The definition of the object is a complete statement of the purpose of the entity it represents.

This specification includes a description of its' purpose, properties, necessary data, and the rules that define the integrity of the object. The object specifications for the AUTHOR, ARTICLE and ISSUE objects are shown in 2. At this step, these only include the purpose and properties of the objects. Before we can define the data of an object and the integrity rules, we must first define the data dictionary.

This specification includes a description of its' purpose, properties, necessary data, and the rules that define the integrity of the object. The object specifications for the AUTHOR, ARTICLE and ISSUE objects are shown in Figure 2. At this step, these only include the purpose and properties of the objects. Before we can define the data of an object and the integrity rules, we must first define the data dictionary.

At this stage of the design process, the main objective is to further refine the scope of our database. The definition of the object at this stage should give the overall purpose of the information that the object will hold and the properties (types of information) that the object will contain. In 2 the AUTHOR object's purpose is to hold information about an individual who writes an article for publication in the magazine. The properties of the object (the type of information contained in the object) are where to contact the author, technical skill level of the author and the writing experience of the author. This information will be used as the input to the next step, the data dictionary.

At this stage of the design process, the main objective is to further refine the scope of our database. The definition of the object at this stage should give the overall purpose of the information that the object will hold and the properties (types of information) that the object will contain. In Figure 2 the AUTHOR object's purpose is to hold information about an individual who writes an article for publication in the magazine. The properties of the object (the type of information contained in the object) are where to contact the author, technical skill level of the author and the writing experience of the author. This information will be used as the input to the next step, the data dictionary.

Data Dictionary

A data dictionary is a definition of all of the individual pieces of information that a database must contain in order for it to satisfy its purpose. It's important that data elements be defined in their own right and not as part of a particular object. This ensures that each element can stand on its own.

A data dictionary should, at the least, define the following for each element.

* Name of the data element -- this name should not be limited by the restrictions of the method of implementation (i.e., six-character names in RPG).

* A description that fully describes the purpose of the data element. The type of data, such as text, numeric or date.

* What makes the contents of the data element valid. For example, amount can't exceed $200.

3 shows an example of some of the elements of the data dictionary for our sample database. At this point the data elements are not part of any particular object, they are separate entities. This is very important to remember when defining the criteria for a valid field. The validity of a field in the data dictionary can't depend on any particular object, such as employee number must exist in the employee master file. The data elements defined in the data dictionary are generic fields, they may be used in more than one object. This ensures that every time a data element is used in the database, it will be consistent with all other occurrences in the database.

Figure 3 shows an example of some of the elements of the data dictionary for our sample database. At this point the data elements are not part of any particular object, they are separate entities. This is very important to remember when defining the criteria for a valid field. The validity of a field in the data dictionary can't depend on any particular object, such as employee number must exist in the employee master file. The data elements defined in the data dictionary are generic fields, they may be used in more than one object. This ensures that every time a data element is used in the database, it will be consistent with all other occurrences in the database.

There are a few things to remember when defining a data dictionary. Each item must be atomic. This means that each element must not be able to be broken down any further. For example, instead of defining an address data element, you should define street, city, state and zip data elements. Data elements can always be combined as needed.

How atomic is atomic? In some situations, an area code will be included with the phone number, in others it may be a separate data element depending upon the application(s) for which the database is intended. As a general rule, it is better to break an element down as far as possible.

A data element must not be a multiple-occurring element. It can't be an array or table. For example, if you need to keep track of monthly balances, don't define an array of 12 items -- instead define a month data element and a balance element. The main reason for this is that it will not allow a database to be in third normal form.

Integrity Rules

An integrity rule is a definition of the relationship between a data element and its contents; a data element and another data element; a data element and an object; or between two or more objects. 4 shows some of the integrity rules for our sample database.

An integrity rule is a definition of the relationship between a data element and its contents; a data element and another data element; a data element and an object; or between two or more objects. Figure 4 shows some of the integrity rules for our sample database.

Integrity rules should cover all possible situations. Be thorough, make sure you have all the bases covered. Each rule must include a definition that is made up of a totally clear set of conditions that must be met before the rule can be satisfied. Don't make the mistake of saying "Oh, well that's just understood." These integrity rules are the basis for an accurate, flexible database.

The E-R diagram gives us our first set of integrity rules. The relationships (diamonds) are integrity rules. At a broad level, these relationships define how, and in what manner, objects interact with each other.

Samples of other integrity rules are as follows; an hourly wage can't be less than $5.00 or more than $50.00; a salesman must be employed by the company at the time he makes a sale; for an order to be placed, the customer must be a current customer; an employee's termination date can't be prior to hire date.

Putting It All Together

It is now time to take the data elements we described in the data dictionary and assign each of them to an object. This is where database normalization comes in. 5 lists the different levels of database normalization and their definitions. In this sample database, we will only concern ourselves with the third normal form. This is because when using existing database technology, it's usually not possible or practical to go beyond this level.

It is now time to take the data elements we described in the data dictionary and assign each of them to an object. This is where database normalization comes in. Figure 5 lists the different levels of database normalization and their definitions. In this sample database, we will only concern ourselves with the third normal form. This is because when using existing database technology, it's usually not possible or practical to go beyond this level.

The first step is to assign each of the data elements to an object. The purpose and properties sections of the object design should give you a clear indication of where a data element belongs. For example, the author name is assigned to the AUTHOR object, not the ARTICLE object. This is because the main purpose of the AUTHOR object, not the ARTICLE object, is to track specifics concerning an author. The author ID is what will be placed in the ARTICLE object.

You will notice that the author ID is assigned to two objects. This is done to establish a relationship between two or more objects. The data element must be a primary key in one and only one of the objects and a foreign key in all of the others. A primary key is a data element that uniquely identifies one of the members of an object; e.g., an employee number. A foreign key is a data element that is the primary key of an object and is used by other objects as a method of linking one object to the other.

Now that we have assigned the data elements to the objects we must choose one or more data elements to act as the primary. These data elements must uniquely identify a row in this object. This restriction is a requirement for a normalized relational database. There may be more than one set of data elements that will uniquely identify a row, these are known as candidate keys. One of these candidate keys must be chosen as the primary key, usually the most concise one.

The final thing to do is to assign any of the integrity rules that concern the object to that object. There will be some integrity rules that don't belong to any one particular object, they apply to the entire database. Make sure that these integrity rules are kept along with the object definitions.

6 shows the final object definitions. These will be used when it comes to the physical design and implementation of the database.

Figure 6 shows the final object definitions. These will be used when it comes to the physical design and implementation of the database.

Advantages of the Logical Design Phase

The process just described offers the following benefits:

* You can involve the users in the design of the database. Since everything produced in this design phase is easily understood by nontechnical personnel, they can proof your design and then you can make any corrections before the physical design phase begins. It's the user's knowledge of the day-to-day workings of the organization that drives good database design.

* The design is based on the what not the how. This delays the real (or imagined) limitations of a database management system until the physical implementation. While you may find that while certain methods of physical implementation have severe limitations, with some ingenuity there are always alternatives.

* Flaws in the database become apparent before too much time and money are spent on the implementation of a poorly designed database.

* Future maintenance is made much easier.

The Physical Design

Next month, we will cover the physical design and implementation of the database we just designed. To show just how flexible you can be when you design before you code, I will show the implementation of the design using three different methods: DDS, IDDU and SQL.


Relational Database Design -- Part 2

Figure 1 Entity-Relationship Diagram (unable to reproduce)


Relational Database Design -- Part 2

Figure 2 Object specifications

 Figure 2: Object Specifications Object Name: AUTHOR Purpose: An author is an individual that writes an article for possible publication in the magazine. Properties: Contact information Technical competence Writing competence Object Name: ARTICLE Purpose: An article is a story, program or any piece of information that is submitted for publication in the magazine. This can come from inside and/or outside sources. Properties: Description of article Status of article Object Name: ISSUE Purpose: An issue is a monthly edition of the magazine. Properties: Editorial information 
Relational Database Design -- Part 2

Figure 3 Data dictionary

 Figure 3: Data Dictionary Data Element: Author's name Description: Legal name of the author of an article. Data Type: Text, 25 Integrity rule: Can not be blank. Data Element: Home Street Description: Street portion of the home address of an author. Data Type: Text, 25 Integrity rule: Can not be blank. Data Element: Work phone number Description: Work phone number of an author. Data Type: Numeric, 10.0 Integrity rule: Can not be less than or equal to zero. Must include the area code. Data Element: Years of technical experience Description: The total number of years experience that an author has .had on a midrange system. Data Type: Numeric, 2.0 Integrity rule: Can not be less than or equal to zero. Data Element: Issue date Description: The date of an issue of the magazine. Data Type: Date Integrity rule: Can not be blank. Data Element: Article status Description: The current status of an article. Data Type: Text, 2 Integrity rule: Can not be blank. Must be AC (accepted), RJ (rejected) or SC (scheduled). 
Relational Database Design -- Part 2

Figure 4 Integrity rules

 Figure 4: Integrity Rules Name: ARTICLE-ISSUE Objects: ARTICLE, ISSUE Rules: An issue must exist before an article can be scheduled for it. Name: ARTICLE-AUTHOR Objects: ARTICLE, AUTHOR Rules: An author must exist before an article can be entered for him. Name: Author of an article Objects: ARTICLE Rules: All articles must have an author. 
Relational Database Design -- Part 2

Figure 5 Normal forms

 Figure 5: Normal Forms First Normal Form -- 1NF * Each column is atomic. This means that it can't be subdivided any further without its primary purpose being changed. * The same columns occur in each row. In other words, no multiple record type files. * Contains no multiple-occurrence columns -- no tables, arrays or pseudo- arrays (Month 1, Month 2, etc.). Second Normal Form -- 2NF * Every column that is not part of the primary key, must be functionally dependent on the entire primary key. Third Normal Form -- 3NF * There can't be any functional dependencies from a non-key column to any other column. All functional dependencies must be between non-key data items and a candidate key. Boyce -Codd Normal Form * No part of a key may depend on any non-key column. Fourth Normal Form -- 4NF * There can be only one multi-valued dependency in a relation unless all columns involved int the dependency can be combined to functionally determine some other column. Fifth Normal Form - 5NF * The relation cannot be recreated by joining two or more of its projections, each having a different primary key. 
Relational Database Design -- Part 2

Figure 6 Completed object specifications

 Figure 6: Completed Object Specifications Object Name: AUTHOR Purpose: An author is an individual that writes an article for possible publication in the magazine. Properties: Contact information Technical competence Writing competence Data Elements: Author ID Name Home street Home city Home state Home ZIP Home phone Work street Work city Work state Work ZIP Work phone Work extension Fax number Years of technical experience Main system Writing skill level Integrity Rules: Object Name: ARTICLE Purpose: An article is a story, program or any piece of information that is submitted for publication in the magazine. This can come from inside and/or outside sources. Properties: Description of article Status of article Data Elements: Article ID Article name Author ID Article description Article status Date in-house Issue date Integrity Rules: Author ID must already exist in AUTHOR. If article status is scheduled the issue date must exist in ISSUE. Object Name: ISSUE Purpose: An issue is a monthly edition of the magazine. Properties: Editorial information Data Elements: Issue date Theme Buyer's Guide Editorial due date Printer due date Integrity Rules: 
BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: