Editing Data

All data should be checked for errors when it is first entered into the system so that bad data is not written to permanent disk files. Remember the rule: "Garbage in, garbage out!" This process of error checking is called EDITING.

Examples of the types of errors that editing looks for are:

Edit Program:

Possible Input to EDIT Program
  • Transactions coming in on a disk
  • Transactions being keyed in

Possible Output from EDIT Program
  • Written report of all errors
  • Written report of valid transactions
  • Disk file of valid transactions
  • Disk file of records flagged as invalid, to be processed and corrected, frequently using on-line processing

When the data on input records is processed in an EDIT program, the errors are reported so that the data can be corrected and reentered into the system. The input data usually comes in on either a screen or a disk. The valid records are usually written to a disk and listed on a report. The bad records are always written on a report, and they may also be written to an error disk where they can be corrected. If you are processing interactively, the bad records may be displayed on the screen so the user can fix them immediately. After corrections have been made, the record is rechecked. If the corrections were satisfactory, the record gets written to the permanent disk file; if not, it is either redisplayed or written to a print report.

There are two major categories of data entry. The first is data entry by a clerk who does not know the data and is entering it as fast and as accurately as possible. Since the clerk does not know the data, this kind of data entry frequently takes what is keyed and creates a disk file to be checked in an EDIT program. The second is data entry by a clerk who knows the data and can make many of the corrections immediately if they are pointed out by an EDIT program. This kind of data entry is frequently done interactively, with the clerk sitting at the computer and entering the data into fields on the screen. The EDIT program analyzes the entry and reports errors that can be fixed interactively by the clerk. Hopeless errors, or errors where there is insufficient data, are reported, but the rest are fixed on-line.

As can be seen, reporting is an important part of editing. Both valid and invalid records are reported - usually on separate reports, but occasionally you will see reports that mix valid and invalid record reporting. The report can be done using a variety of styles depending on the needs of the users. The important thing is that on a report of valid transactions the entire record is printed and on an error report, for each field in error, the user can identify:

A few possible error report styles:


Before we start to look at the logic of an edit program, there are some COBOL and programming techniques that you should be aware of.

Writing records to a disk

If you plan to write a record to a disk you will be creating a disk file. Therefore the file must be defined in both the SELECT statement and in the FILE SECTION with an FD. The record that you will write must pass through the 01 level of the FD. The programmer can either define the fields on the record under the 01 level of the FD in the FILE SECTION or set up the record in WORKING-STORAGE and define the fields there. In the PROCEDURE DIVISION, the file must be OPENed as an OUTPUT file since you are writing to the disk. When the record is ready to be written, the WRITE statement is used. Since you are writing to a disk instead of the printer, there is no AFTER ADVANCING clause. When the program is complete, the file must also be CLOSEd.



       05  IDNO-DSK                      PIC X(4).
       05  NAM-DSK                       PIC X(20).
       OPEN INPUT...
       code to set up the record by moving data to IDNO-DSK, NAM-DSK etc.

       CLOSE ...
If the record was set up in WORKING-STORAGE, for example using 01  OUTPUT-REC, then the write statement would read like this:
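
As a rough sketch of the whole sequence (SELECT, FD, OPEN OUTPUT, WRITE, CLOSE), a small program might look like the one below. The names VALID-FILE, VALID-REC and "VALID.DAT" and the sample data are assumptions made up for illustration, and ORGANIZATION IS LINE SEQUENTIAL is a common compiler extension (GnuCOBOL, Micro Focus) rather than something these notes require. The record is built in WORKING-STORAGE as OUTPUT-REC and written through the 01 level of the FD with WRITE ... FROM.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. WRTDISK.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * VALID-FILE and "VALID.DAT" are hypothetical names.
           SELECT VALID-FILE ASSIGN TO "VALID.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  VALID-FILE.
       01  VALID-REC                PIC X(24).
       WORKING-STORAGE SECTION.
       01  OUTPUT-REC.
           05  IDNO-DSK             PIC X(4).
           05  NAM-DSK              PIC X(20).
       PROCEDURE DIVISION.
       A-100-MAIN.
      * The file is opened as OUTPUT because we are writing to it.
           OPEN OUTPUT VALID-FILE.
      * Set up the record (sample data, assumed for the sketch).
           MOVE "1234" TO IDNO-DSK.
           MOVE "JOHN SMITH" TO NAM-DSK.
      * The record passes through the 01 level of the FD.
      * There is no AFTER ADVANCING because this is a disk file.
           WRITE VALID-REC FROM OUTPUT-REC.
           CLOSE VALID-FILE.
           STOP RUN.

If the fields were defined directly under the 01 level of the FD instead, the FROM phrase would be dropped and the statement would simply name the FD record.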



The REDEFINES clause is used when you have a field that you want to look at two ways. For example a field can be given a numeric picture and then redefined and given an alphanumeric picture:
        05  FLDX                        PIC 9(5).
        05  RDF-FLDX REDEFINES FLDX     PIC X(5).
If you use the name FLDX, you are referring to a numeric field that you may use in a calculation or move to an edited field. If you use the name RDF-FLDX, you are referring to an alphanumeric field. This is useful if you are checking an incoming field to see if it is numeric. If it is numeric, you want to print it on the report as an edited numeric field; if it isn't, you want to print it as a non-numeric field. In other words, you would move FLDX if it passed your editing tests, and you would move RDF-FLDX if it did not pass the tests and was therefore not numeric.
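
A minimal sketch of this idea follows. FLDX and RDF-FLDX come from the text; the program name, INPUT-REC, FLDX-EDITED and the sample value "12A45" are assumptions for illustration only.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. RDFDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  INPUT-REC.
           05  FLDX                 PIC 9(5).
           05  RDF-FLDX REDEFINES FLDX
                                    PIC X(5).
      * FLDX-EDITED is a hypothetical edited output field.
       01  FLDX-EDITED              PIC ZZ,ZZ9.
       PROCEDURE DIVISION.
       A-100-MAIN.
      * Hypothetical bad input: a letter inside a numeric field.
           MOVE "12A45" TO RDF-FLDX.
           IF FLDX IS NUMERIC
      * Passed the test: use the numeric name.
               MOVE FLDX TO FLDX-EDITED
               DISPLAY "NUMERIC:     " FLDX-EDITED
           ELSE
      * Failed the test: use the alphanumeric name.
               DISPLAY "NOT NUMERIC: " RDF-FLDX
           END-IF.
           STOP RUN.

Both names refer to the same five bytes of storage, so nothing is copied when the program switches from one view to the other.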

Another way you might use the REDEFINES is on the print line. For example, suppose you want to print a numeric field in a column if the field passes your input tests and print a message in the field if it does not pass the tests. This can be done using the REDEFINES or it can be done by breaking down the field into a numeric sub field. Both ways are illustrated below:

Using the REDEFINES:

	05   AMT-PR                      PIC $ZZZ,ZZZ.99.
	05   MSG-PR REDEFINES AMT-PR     PIC X(11).
In the PROCEDURE DIVISION, when the programmer wants to move a number to the area they would code:
If instead, they were moving a message they would code either of the following:
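
The MOVE statements for this layout might look like the following sketch. AMT-PR is taken from the text; AMT-IN, ERR-MSG, PRINT-LINE, the 11-character MSG-PR redefinition and the message wording are assumptions for illustration. The message can be moved either as a literal or from a field that holds it.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRTRDF.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * AMT-IN and ERR-MSG are hypothetical fields.
       01  AMT-IN                   PIC 9(6)V99 VALUE 1234.50.
       01  ERR-MSG                  PIC X(11) VALUE "BAD AMOUNT".
       01  PRINT-LINE.
           05  AMT-PR               PIC $ZZZ,ZZZ.99.
           05  MSG-PR REDEFINES AMT-PR
                                    PIC X(11).
       PROCEDURE DIVISION.
       A-100-MAIN.
      * Valid amount: move the number to the edited field.
           MOVE AMT-IN TO AMT-PR.
           DISPLAY PRINT-LINE.
      * Invalid amount, first way: move a literal message.
           MOVE "NOT NUMERIC" TO MSG-PR.
           DISPLAY PRINT-LINE.
      * Invalid amount, second way: move a message field.
           MOVE ERR-MSG TO MSG-PR.
           DISPLAY PRINT-LINE.
           STOP RUN.

Because the redefinition must fit in the same space as the edited field, the message is limited to the 11 characters that $ZZZ,ZZZ.99 occupies.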

Breaking up the field:

	 05  MSG-PR.
	     10  AMT-PR		PIC $ZZZ,ZZZ.99.
	     10  FILLER		PIC X(9).
Two things need to be remembered. First, when a field is broken up into sub fields, the top field (the one that is being broken up) is not given a PIC. The PIC is the sum of the sub fields beneath it. Second, the field that is divided is considered to be alphanumeric even though the parts may all be numeric. In this case, the parts are a mixture, but MSG-PR is considered to be alphanumeric. In this example, if there is valid numeric data, the programmer will move the data to AMT-PR. However, the programmer decided that the error message needed more room. This caused the addition of the second 10 level which gives MSG-PR an additional nine characters beyond AMT-PR. The MOVE statements that could be used in this program to move either a number or a message to the column on the print line are illustrated below. First, to move a number the following MOVE statement could be used:
If instead the programmer wanted to move a message to the field on the print line they would code either of the following:
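
As a sketch of those MOVE statements for the broken-up field: MSG-PR and AMT-PR are from the layout above, while AMT-IN, PRINT-LINE and the message text are hypothetical.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRTGRP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * AMT-IN is a hypothetical input field.
       01  AMT-IN                   PIC 9(6)V99 VALUE 1234.50.
       01  PRINT-LINE.
           05  MSG-PR.
               10  AMT-PR           PIC $ZZZ,ZZZ.99.
               10  FILLER           PIC X(9).
       PROCEDURE DIVISION.
       A-100-MAIN.
      * Valid amount: clear the group, then move the number
      * to the numeric edited sub field.
           MOVE SPACES TO MSG-PR.
           MOVE AMT-IN TO AMT-PR.
           DISPLAY PRINT-LINE.
      * Invalid amount: the group MSG-PR is alphanumeric and
      * 20 characters long, so a longer message fits.
           MOVE "AMOUNT NOT NUMERIC" TO MSG-PR.
           DISPLAY PRINT-LINE.
           STOP RUN.

Clearing the group before moving the number keeps leftover message characters out of the nine FILLER positions.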
Before moving on to the next topic, we will examine another use of the REDEFINES that doesn't relate to editing. In dealing with percents, you want to use the decimal number for calculations and the whole number to print or display. The REDEFINES lets you set this up easily:
		05  PERC                        PIC V99.
		05  RDF-PERC REDEFINES PERC     PIC 99.
When the programmer is using the percent in a calculation they will use the name PERC, which has the PICTURE of V99. However, when the program wants to move the percent to the print line, RDF-PERC is the field that will be moved.
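
A small sketch of the percent technique follows. PERC and RDF-PERC are from the text; RATE-AREA, AMT, RESULT, the two print fields and the sample values are assumptions for illustration.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. PERCDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  RATE-AREA.
           05  PERC                 PIC V99 VALUE .15.
           05  RDF-PERC REDEFINES PERC
                                    PIC 99.
      * The remaining fields are hypothetical.
       01  AMT                      PIC 9(5)V99 VALUE 200.00.
       01  RESULT                   PIC 9(5)V99.
       01  RESULT-PR                PIC ZZ,ZZ9.99.
       01  PERC-PR                  PIC Z9.
       PROCEDURE DIVISION.
       A-100-MAIN.
      * The calculation uses the decimal view of the percent.
           MULTIPLY AMT BY PERC GIVING RESULT.
           MOVE RESULT TO RESULT-PR.
      * The print field uses the whole-number view.
           MOVE RDF-PERC TO PERC-PR.
           DISPLAY RESULT-PR " AT " PERC-PR "%".
           STOP RUN.

The same two digits are stored once; only the assumed position of the decimal point differs between the two names.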

Is Numeric or Is Alphabetic Test

COBOL has a numeric or alphabetic test that can be used to test data and make sure that it contains the expected category of characters. A field can be tested to see if it contains just numeric digits or just alphabetic characters (spaces are acceptable). The test is a clause that can be used with the IF statement:
             IF {fieldname} IS NUMERIC
                 processing to handle the numeric data
             ELSE
                 error processing to handle the non-numeric data.

NOTE:  With many compilers you cannot move a non-numeric field to a numeric output field, so this test becomes very important.
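
The two class tests can be sketched as below. QTY-IN, NAME-IN and their sample values are hypothetical; the quantity deliberately contains a letter so that the ELSE branch is taken for it.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CLSTEST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical input fields with sample values.
       01  QTY-IN                   PIC X(3)  VALUE "12X".
       01  NAME-IN                  PIC X(10) VALUE "MARY JONES".
       PROCEDURE DIVISION.
       A-100-MAIN.
      * NUMERIC is true only if every character is a digit.
           IF QTY-IN IS NUMERIC
               DISPLAY "QTY OK"
           ELSE
               DISPLAY "QTY NOT NUMERIC: " QTY-IN
           END-IF.
      * ALPHABETIC is true for letters and spaces only.
           IF NAME-IN IS ALPHABETIC
               DISPLAY "NAME OK"
           ELSE
               DISPLAY "NAME NOT ALPHABETIC: " NAME-IN
           END-IF.
           STOP RUN.

The tests can also be written in the negative form, IS NOT NUMERIC or IS NOT ALPHABETIC, when only the error branch is needed.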


You have already seen an indicator used to tell when the end of the file has been reached. Indicators can also be used in other ways to make the program work well. For example, in an edit program, records with no errors may get written to a disk while records with errors are printed on a report. Since there are many fields on the record, and you want to check every field for accuracy, checking a record may involve a lot of code and several different routines. To make sure you know whether any errors have been found, an indicator can be used. For example, whenever an error is found the indicator can be set by moving "NO " to it. It doesn't matter whether you move "NO " to the indicator once because you found one error or ten times because you found ten errors; the indicator saying NO will indicate that the record is invalid and therefore should not be written to the disk. The indicator should be set up in WORKING-STORAGE with the other indicators, and it can be given level 88 names if you want to:
            05  VALID-REC-IND           PIC XXX         VALUE "YES".
                88  VALID-REC                           VALUE "YES".
                88  INVALID-REC                         VALUE "NO ".
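
A minimal sketch of how the indicator and its level 88 names might be used is shown below. VALID-REC-IND, VALID-REC and INVALID-REC come from the definition above; INDICATORS, HOURS-IN, RATE-IN, the sample values and the two checks are assumptions for illustration.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. INDDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  INDICATORS.
           05  VALID-REC-IND        PIC XXX VALUE "YES".
               88  VALID-REC              VALUE "YES".
               88  INVALID-REC            VALUE "NO ".
      * Hypothetical input fields; HOURS-IN holds bad data.
       01  HOURS-IN                 PIC X(3) VALUE "4A0".
       01  RATE-IN                  PIC X(4) VALUE "1250".
       PROCEDURE DIVISION.
       A-100-MAIN.
      * Assume the record is valid until an error is found.
           MOVE "YES" TO VALID-REC-IND.
           IF HOURS-IN IS NOT NUMERIC
               MOVE "NO " TO VALID-REC-IND
           END-IF.
           IF RATE-IN IS NOT NUMERIC
               MOVE "NO " TO VALID-REC-IND
           END-IF.
      * One test at the end, however many errors were found.
           IF VALID-REC
               DISPLAY "WRITE RECORD TO DISK"
           ELSE
               DISPLAY "RECORD REJECTED"
           END-IF.
           STOP RUN.

The level 88 name lets the final test read as IF VALID-REC instead of IF VALID-REC-IND = "YES"; both forms test the same three characters.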

Sample editing program

The sample edit program (PAYEDIT.CBL) is a simple version of an edit. The input comes in on a disk, good output is written to a disk, and errors are written to a report (one error per line). Note that in the real world, good records might get written to a separate report in addition to being written on the disk. This would involve creating two printer reports, and our sample does not do this.

The sample edit program has the following input and output files:

The sample program is editing payroll transactions. Each transaction record is checked for the following:

If an error is found a line is printed on the error report. This means if a transaction has 5 errors, there will be 5 lines on the error report. Only records with no errors are written to the output disk.

The edit program reads the initial record. Then it performs the B-200-LOOP. In the B-200-LOOP the program sets VALID-REC-IND to "YES". The program then PERFORMs the routines that check each field for accuracy. If a field is in error, the field and a message are written to the printer and VALID-REC-IND gets set to "NO". Each time an error is found, another line explaining the error gets written on the report. When all the fields have been tested, control returns to the B-200-LOOP and VALID-REC-IND is checked. If the indicator still contains YES, the record is accurate and gets written to the disk, and 1 gets added to the valid record count. If the indicator has been changed to NO, no record gets written on the disk and 1 gets added to the invalid record count. Note that nothing is written to the printer at this point because each error is printed as soon as it is discovered.
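
The flow just described can be sketched as a skeleton program. This is not PAYEDIT.CBL itself: only B-200-LOOP and VALID-REC-IND are taken from the text. The file names, record layouts, the other paragraph names and the two numeric checks are assumptions made up for illustration. ORGANIZATION IS LINE SEQUENTIAL is a common compiler extension, and the input file TRANS.DAT is assumed to exist.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITSKEL.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * All file names below are hypothetical.
           SELECT TRANS-FILE ASSIGN TO "TRANS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT VALID-FILE ASSIGN TO "VALID.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT ERROR-RPT ASSIGN TO "ERRORS.RPT"
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  TRANS-FILE.
       01  TRANS-REC.
           05  IDNO-IN              PIC X(4).
           05  HOURS-IN             PIC X(3).
       FD  VALID-FILE.
       01  VALID-REC-OUT            PIC X(7).
       FD  ERROR-RPT.
       01  ERROR-LINE               PIC X(40).
       WORKING-STORAGE SECTION.
       01  INDICATORS.
           05  EOF-IND              PIC XXX VALUE "NO ".
           05  VALID-REC-IND        PIC XXX VALUE "YES".
       01  COUNTERS.
           05  VALID-CT             PIC 9(5) VALUE 0.
           05  INVALID-CT           PIC 9(5) VALUE 0.
       01  ERROR-DETAIL.
           05  IDNO-ER              PIC X(4).
           05  FILLER               PIC X(2) VALUE SPACES.
           05  FIELD-ER             PIC X(4).
           05  FILLER               PIC X(2) VALUE SPACES.
           05  MSG-ER               PIC X(28).
       PROCEDURE DIVISION.
       A-100-MAINLINE.
           OPEN INPUT TRANS-FILE
                OUTPUT VALID-FILE ERROR-RPT.
      * Initial read, then one pass of the loop per record.
           PERFORM U-000-READ.
           PERFORM B-200-LOOP UNTIL EOF-IND = "YES".
           CLOSE TRANS-FILE VALID-FILE ERROR-RPT.
           DISPLAY "VALID: " VALID-CT " INVALID: " INVALID-CT.
           STOP RUN.
       B-200-LOOP.
           MOVE "YES" TO VALID-REC-IND.
           PERFORM B-300-CHECK-FIELDS.
      * Only records with no errors go to the output disk.
           IF VALID-REC-IND = "YES"
               WRITE VALID-REC-OUT FROM TRANS-REC
               ADD 1 TO VALID-CT
           ELSE
               ADD 1 TO INVALID-CT
           END-IF.
           PERFORM U-000-READ.
       B-300-CHECK-FIELDS.
      * Hypothetical checks: one error line per bad field.
           IF IDNO-IN IS NOT NUMERIC
               MOVE IDNO-IN TO FIELD-ER
               MOVE "ID NOT NUMERIC" TO MSG-ER
               PERFORM B-400-ERROR-LINE
           END-IF.
           IF HOURS-IN IS NOT NUMERIC
               MOVE HOURS-IN TO FIELD-ER
               MOVE "HOURS NOT NUMERIC" TO MSG-ER
               PERFORM B-400-ERROR-LINE
           END-IF.
       B-400-ERROR-LINE.
      * The error is printed as soon as it is discovered.
           MOVE IDNO-IN TO IDNO-ER.
           MOVE "NO " TO VALID-REC-IND.
           WRITE ERROR-LINE FROM ERROR-DETAIL.
       U-000-READ.
           READ TRANS-FILE
               AT END MOVE "YES" TO EOF-IND
           END-READ.

Setting the indicator inside the error-line paragraph means every check that reports an error also marks the record invalid, so no individual check can forget to do it.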