How Relational Databases Work
As a news site, you might want to keep a database of past stories for your readers. You could keep that information in a table that looks like this:
Table name: Articles
Headline | Byline | Date | Body |
Dog Bites | Dave Barry | Dec 12, 2005 | Body of the story |
Record Cold | Travis Smith | Dec 12, 2005 | Body of the story |
Mets Win | Travis Smith | Dec 12, 2005 | Body of the story |
This Articles table contains all the basic information you need. The “Headline,” “Byline” and “Body” fields are text fields, and the “Date” is, of course, a date field. Because of the way database programs read information, names for database fields should be a single word with no spaces. If you need to give a field a name with more than one word, like “Final Score,” either run the words together (“FinalScore”) or replace the space with an underscore (“Final_Score”).
Even though you want to keep a table small, you would want to add a field to the Articles table named “ArticleID” that assigns each article a unique identification number. There are two reasons for this:
- You might want to have two articles on the same date with the same title and the same byline. For example, “Correction,” “Web staff” and “June 1.”
- It’s simpler to write a program that just looks for entry number 6 from the Articles table than it is to request an entry from Articles with the headline “Dog Bites,” the byline “Travis Smith” and the date “Dec 12, 2005.”
So with that change, you’d have a table like this:
ArticleID | Headline | Byline | Date | Body |
101 | Dog Bites | Dave Barry | Dec 12, 2005 | Body of the story |
102 | Record Cold | Travis Smith | Dec 12, 2005 | Body of the story |
103 | Mets Win | Travis Smith | Dec 12, 2005 | Body of the story |
As we’ve discussed, multiple small, linked tables are more efficient and easier to update than a single large table. So instead of storing the name “Travis Smith” hundreds of times, a proper relational database would have a second table with the possible list of authors. You’d then do something like this:
Table name: Bylines
AuthorID | Name |
1 | Travis Smith |
2 | Edward R. Murrow |
3 | Dave Barry |
Now, the Articles table could be changed to look like this:
ArticleID | Headline | Byline | Date | Body |
101 | Dog Bites | 3 | Dec 12, 2005 | Body of the story |
102 | Record Cold | 1 | Dec 12, 2005 | Body of the story |
103 | Mets Win | 1 | Dec 12, 2005 | Body of the story |
There are a number of advantages to this:
- It avoids misspelling someone’s byline.
- A single number linked to the Bylines table replaces a chunk of text that would otherwise be repeated over and over. This is faster for both the computer and the person entering the data.
- Changing a byline for every article in the database only requires changing one entry in the Bylines table.
Technically, this type of data linking is called a one-to-many relationship. Each byline (one) links to several articles (many). But (in this example) each article only links to one byline.
What if you want to add e-mail addresses for each writer? Simply add that information to the Bylines table:
AuthorID | Name | |
1 | Travis Smith | nep@hopstudios.com |
2 | Edward R. Murrow | ed@cbs.com |
3 | Dave Barry | dave@davebarry.com |
You don’t need to change the Articles table at all, and yet now you can add the writer’s e-mail address to every story on your site. That’s the power of relational databases.
What if there’s more than one byline on a story? Well, that gets complex, because the “Byline” field in the Articles table, as currently set up, can only handle one number. It’s not a text field – we changed it to handle a single number only because that takes up less space and lets us easily sort the bylines in interesting ways.
So, to change it so that each article can have multiple bylines, don’t store the byline in the article table. You need to create another table that will link the byilines to the articles. It would look something like this:
Table name: Articles_Bylines
ArticleID | AuthorID |
101 | 3 |
102 | 1 |
102 | 1 |
103 | 3 |
(This table needs no primary key because it’s only used for linking, and each row would be unique because each author only writes each article once anyway.)
Now, this table can have several entries, linking the Bylines table to the Articles table. When your Web page displays an article from the database, it can now also grab all the bylines linked to that article.
Relational databases can get tricky fast. For example, you might have multiple tables that link the relationship between an article and:
- Author(s).
- Categories or keywords.
- Location(s).
- Related stories.
- Photo(s).
- Editing status (for workflow purposes).
- Editing history.
You might also want to have several dates associated with each article: When it was first entered in your system, when it was originally published in another medium, when it was last revised, when it should appear on a website, or when it should come off the front page.
If you are involved in planning or reporting on a lot of events, you might want to have a table that lists the events themselves, a table for the venues in which they’re held, a table for participants to rate or comment on the events, a table for photos taken at specific events or a table for recommended ages.
For a restaurant database, you’d need a table of the restaurants with additional tables for cuisine, addresses, price categories, ratings, or online menus.
Things get complex quickly, as you can see. To read in more detail about planning tables, take a look at these articles:
If you’re really interested, there’s a book called “Philip and Alex’s Guide to Web Publishing” that offers a platform-agnostic overview of the process. It’s not only well written, but also filled with beautiful photography. (ISBN 1558605347)
Now that you understand how databases work, let’s look at some software options available to you.