Creating a payroll project:

 

Notes on Relational Databases:

A database can consist of multiple tables/files. These tables/files are related to each other in some way so that the programmer or developer can access information from multiple table/files at the same time. A database management system is the database and its functionality. In today's PC environment, the model of relating the table/files within a database is the relational database model. Things that must be considered in a relational database:

Looking at these rules, we are going to analyze a simple payroll database and then put it into third normal form to make a relational database.

 

We are going to start out with this data:

 

 

 

When we look at it, we can see that some of the data is just about the employee, so we know that we need to set up an employee table with that information.

 

Our employee table will have the following fields on it:

 

 

Now we need to figure out the primary key.  The primary key is something that uniquely defines each record in the file or table.  In our case, employee # serves that purpose so we will make it the primary key:

 

Employee Table

 

Now we have the following data left to deal with:

 

 

We have to analyze the relationship of the data to each other.  The possible relationships are one to one, one to many and many to many.

 

One to one relationship:  If we look at projects.  Each project has a number and a name.  There is a one to one relationship between a project number and a project name.  This means that a project number has one and only one project name, and a project name has one and only one project number.

One to many relationship:  If we look at departments, each employee belongs to a department.  There can be many employees in one department.  The relationship is one employee belongs to one department, and one department has many employees.

Many to many relationship: One employee can work on many projects and one project can have many employees.

 

Now lets take this and build some more tables.

 

It would break the rules of third normal form if I put the department name in the table, it would also be an inefficient way to do things.  Lets say there are 50 people in the average department, that would mean I would put the department name in 50 records.  What if I decided to change the department name.  Well I would have to change it in 50 records.  It makes more sense to set up a separate table for departments.  In this example I only need to carry the department name, but in a real example I might also have things like department manager and department budget.

 

This means I need a department table.  In the department table, I need to put the department number and the department name.  I am going to use department number as the primary key to uniquely define each record. 

 

Department Table

 

Now looking at what we have:

 

Employee Table

  • Employee Number - primary key
  • Employee Name
  • Salary
  • Tax percent withheld
  • Medical withheld
  • Other deductions withheld
  • Department Number

 

Department Table

  • Department Number - primary key
  • Department Name

 

 

Now we have two tables that each have the Department Number.  We set it up this way so that we would have a relationship between the two tables.  We can use the Department Number on the Employee Table to link into the Department Number on the Department Table to get the Department name and whatever other information about the department we might have decided to include. 

 

In database vocabulary, the Department Number on the Employee Table is called a foreign key because it links to the primary key in another table.

Now we have taken care of the Department Name, but we still have this information left that is not in a table.

 

 

Looking at this I see two things we have to deal with.  The project it self and the employees projects and hours they spent working on the project.  Just because it is easier, I am going to deal with the project itself first.  I need to build a table for the project. 

 

Project Table

 

This looks a lot like the department table.  I have a project number that uniquely identifies the project, and then I have a project name that has a one to one relationship with the project number and a project manager that has a one to one relationship with the project number.

 

I still have to deal with the time the employee spends on the project.  I have the following information left to deal with:

 

 

I cannot put the project information on the employee table because each employee can be assigned to more than one project.  Making a relational database in third normal form means that I cannot have repeating groups.  There are a lot of problems with repeating groups, but one of the major ones is how many.  If you decide that an employee can be working on a maximum of four projects, then you could put four repeating groups of projects on the employee file even though this would be breaking the rules.  But, then what do you do when an employee gets assigned to a fifth group.  The whole thing falls apart because you did not set up the table to accomodate five repeating groups. 

 

Looking back, remember that I said there was a many to many relationship here.  One employee can work on many projects and one project can have many employees.  This means I cannot put the employee number on the project table because I do not know how many employees will be assigned to the project and besides it breaks the rules of repeating groups.  I also cannot put the project number on the employee file because again I do not know how many and it would be breaking the repeating group rule.

The solution to this is to build a bridge table.  The bridge table will contain the employee number and the project number and the number of hours the employee worked on the project.  I also need to add the date so that I know which week the employee worked on the project.  This was not in the original data list, but when I go to set it up it is clear that it is needed since I am designing this for payroll purposes. I will call this the EmployeeProject Table

 

Employee Table

  • Employee Number - primary key
  • Employee Name
  • Salary
  • Tax percent withheld
  • Medical withheld
  • Other deductions withheld
  • Department Number

EmployeeProject Table

  • Employee Number
  • Project Number
  • Date
  • Hours employee worked on the project

Project Table

  • Project Number - primary key
  • Project Name
  • Project Manager

 

 

This works.  From the Employee Table I can link in based on Employee Number and from the Project Table I can link in based on Project Number. This means the tables are related and I can get the information I need.

 

Now I need to decide on the key for the Employee Project Table.  I cannot use Employee Number alone because one Employee can have many entries here so it would not be unique.  I cannot use Project Number alone because on Project can have many entries so it would not be unique.  Next, I thought about combining the Employee Number and the Project Number together as the key, but that will not be unique because the employee can work on the project multiple times.  Finally I decided that if I combine Employee Number and Project Number and the date I will have a unique key.  A employee works only once on a project on a given date. 

 

EmployeeProject Table

 

The primary key is composed of the three fields taken together. Notice that the only piece of data on this table that is not part of the EmployeeProject Table is the hours employee worked on the project.  The rules of third normal form says that data has to all relate to the primary key (or in this case all of the parts of the primary key).  This follows the rule.  The hours worked are for a particular employee, working on a particular project on a particular date.

 

I now have my four tables:

Employee Table

 

Department Table

 

Project Table

 

EmployeeProject Table