Chapter 4. The COBOL Language

COBOL is the standard language for mainframe application development. It has the types of features that are important for business use cases, such as handling large-scale batch and transaction processing jobs.

The COBOL language has more than 350 commands, many of which you will rarely need. This is why we’ll cover a limited number in this book. This does not put you at a disadvantage, because we will focus on the core commands you need to know for real-world applications.

COBOL’s Background, in Brief

COBOL is one of the oldest computer languages. Yet it has held up well over the years and remains pivotal for business computing.

The roots of the language go back to the late 1950s, when a variety of computer languages emerged. Many of these languages were complex. This meant development was time-consuming and expensive.

A standard language was needed for data processing. To make this happen, the US Department of Defense joined with a group of computer companies—including IBM, Burroughs Corporation, Honeywell, and RCA—as well as academics and customers to form the Conference on Data Systems Languages (CODASYL) committee. Such committees have been essential for the evolution of the language.

Note

CODASYL looked at the FLOW-MATIC language as a model for COBOL. Legendary computer pioneer Grace Hopper created FLOW-MATIC, the first language to use English-like commands for data processing and to be used for early mainframe systems, like UNIVAC I.

One of the key considerations for CODASYL was to enable COBOL to operate on different computers. It was also focused on the needs of businesses, such as accounting and customer reporting. This focus remains the same today. In fact, you can’t use COBOL to create websites or mobile apps. It’s purely about business applications.

Note

The CODASYL committee came up with several ideas for the name of the language. Some included Information System Language (INFOSYL), Business System (BUSY), and Common Computer Systems Language (COCOSYL). But ultimately, the committee decided on COBOL, although it is not clear why.

COBOL Versions

The first version of COBOL, referred to as COBOL 60, was specified at the end of 1959 and published in 1960. It certainly had its flaws, and some people predicted that the language would not last long. But the computer industry took steps to solve the problems and improve the language, especially with the development of compilers. Then new features, like tables, were added.

However, as the language grew in popularity, more incompatibilities emerged. This is why the United States of America Standards Institute (USASI), now called the American National Standards Institute (ANSI), took on the role of creating a standard for COBOL. This was done in 1968 and was called COBOL X3.23.

This is not to imply that the CODASYL committee was no longer a factor. The organization would continue to innovate the language.

But by the 1980s, COBOL would again have problems with compatibility. To deal with this, a new version was released, called COBOL 85.

By the 1990s, more changes were needed, and work began on a new version that would adopt more modern approaches, such as object-oriented programming. The new version of the language was called COBOL 2002.

OK then, so what is the latest version? As of this writing, the latest release of IBM’s compiler is Enterprise COBOL for z/OS V6.3, which shipped in September 2019.

Regardless, many companies still use older versions, like COBOL 85. This is why it is important to understand the history of the language. For the most part, adoption of new approaches tends to be slower with mainframe systems. A key reason is that companies usually do not want major changes made to their mission-critical systems.

Why Use COBOL?

COBOL is not a general-purpose language. Its main focus is on data processing. But COBOL’s specialization and long history have meant that the language is pervasive. Here are some stats to consider:

Every day 200 times more COBOL transactions are performed versus Google searches.
More than 220 billion lines of code are running today, or about 80% of the world’s total.
About 1.5 billion new lines of COBOL are written each year.

According to Dr. Cameron Seay, the cochair of the Open Mainframe Project COBOL Working Group and an adjunct instructor at East Carolina University, “COBOL remains an essential language for the global economy. The list of organizations that use COBOL also includes most large federal and state agencies. As of today, COBOL is irreplaceable, and there is no indication that that is going to change.”

What are some of the benefits of COBOL? Why has the language had so much lasting power? Here are some of the main reasons:

Scale: COBOL is built to process large amounts of data. The language has a broad range of functions for creating data structures and accessing them.
Stability: The COBOL language is backward compatible. As a result, companies do not have to periodically recode their systems.
Simplicity: Again, the original vision for COBOL was to be easy to use. You did not have to be a mathematician to learn it. It’s true that the language can be wordy. But this carries an advantage. In a sense, the language is self-documenting (although it is still a good idea to provide your own documentation).
Auditability: Even if you do not understand COBOL, you can still read its commands and get a general idea of its workflows. A key benefit is that a nontechnical auditor can review the code.
Structure: COBOL has a set of predefined ways to create programs, such as with divisions, sections, and paragraphs (you’ll learn more about these in this chapter). This makes it easy for someone who did not write the code to understand it.
Speed: COBOL is a compiled language, which means that a program will be reduced to the 1s and 0s that a computer can understand. This generally speeds up performance, compared to an interpreted language (which involves an intermediate translator that converts the code during runtime).
Flexibility: The standard COBOL language is full-featured and has been well tested in intensive enterprise environments. But a myriad of extensions are available, such as for databases and transaction systems. This has made COBOL much more versatile.
Math: COBOL has a variety of features that make currency manipulation and formatting easier. Other languages usually require extra coding for this.

COBOL Program Structure: Columns

A COBOL program has a clear-cut organization, with code arranged in 80 columns. This number harkens back to the days of punch cards. Figure 4-1 shows a visual of the layout.

Here’s a look at the columns:

1–6

This is for the line numbers. When punch cards were used, this was helpful since sometimes they would fall on the floor and get scattered. But in modern times, columns 1–6 are no longer used.

7

This can be used for several purposes. If you put an asterisk (*) in the column, you can write out a comment to document the code. Figure 4-1 shows a comment on sequence line 000012. You can also use the hyphen (-) to mark a continuation line for a long string of characters, known as a literal, that does not fit before column 72. The first line leaves the literal without a closing quotation mark, and the continuation line reopens it with a quotation mark in Area B. This is an example:

'123ad53535d3506968223dcs9494029dd3393
- '8301sd0309139c3030eq303'

The literals can be either strings or numerics.

8–11

This is known as the A margin, or Area A. This is where we put the main headers for the code, which include the division, section, paragraph, and level numbers (01 and 77), which you will learn more about later. In Figure 4-1, the IDENTIFICATION DIVISION and PROCEDURE DIVISION are in Area A.

12–72

This is called the B margin, or Area B. This is where much of the code will be included.

73–80

This is no longer used in COBOL development.
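To make the column layout concrete, here is a minimal sketch of a fixed-format program. The program name COLDEMO and the ruler comment are illustrative assumptions, not part of the original example. Columns 1–6 are left blank, the asterisk in column 7 marks comment lines, the headers begin in column 8, and the statements begin in column 12.

```cobol
      * Ruler: A marks column 8 (Area A), B marks column 12 (Area B).
      *A---B------------------------------------------------------------
       IDENTIFICATION DIVISION.
       PROGRAM-ID. COLDEMO.
       PROCEDURE DIVISION.
       100-MAIN.
      * Statements start in Area B (column 12 or later).
           DISPLAY 'Headers start in Area A, statements in Area B'
           GOBACK.
```

If a header such as PROCEDURE DIVISION were pushed into Area B, or a statement pulled into Area A, many compilers would flag the line, which is why the column positions matter.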

COBOL Program Structure: Divisions

COBOL programs are further organized into four divisions, which need to be in the following order (each ends with a period):

IDENTIFICATION DIVISION.
ENVIRONMENT DIVISION.
DATA DIVISION.
PROCEDURE DIVISION.

Each of these can contain other levels of code: sections, paragraphs, and sentences. In the first three divisions, each of these ends in a period. In the PROCEDURE DIVISION, the style used in this book puts a period only after the paragraph header and at the end of the paragraph. Granted, all this may seem convoluted and inflexible. But in a business environment, it is important to have a solid structure. Besides, COBOL’s approach is fairly intuitive once you get used to it. In the next few sections, we’ll go into more detail on the structure.

IDENTIFICATION DIVISION

The IDENTIFICATION DIVISION is the easiest division to work with. You need only two lines:

IDENTIFICATION DIVISION.
PROGRAM-ID. CUSTRP.

The PROGRAM-ID is required because the name is used for the compilation of the program. The name can be up to 30 characters and must be unique. But a mainframe shop usually has its own requirements (a typical length is up to eight characters).

Some coders may expand on the IDENTIFICATION DIVISION, such as with the following:

IDENTIFICATION DIVISION.
PROGRAM-ID. CUSTRP.
AUTHOR.  JANE SMITH.
DATE-WRITTEN.  01/01/2021.
**************************************************************
*  This program will generate a customer report   *
**************************************************************

Details such as AUTHOR and DATE-WRITTEN are not common. But a comment box is often used.

Note

COBOL is not case sensitive. You can write a command like DIVISION as Division or division or even divIsion. It does not matter. However, for the most part, the COBOL convention is to capitalize the commands.

ENVIRONMENT DIVISION

The ENVIRONMENT DIVISION is used for accessing files, such as for batch processing, which is common in COBOL. But it is usually not used if a program is for online transactions (this would be for commands like ACCEPT to get user input).

The ENVIRONMENT DIVISION is composed of two sections. One is the CONFIGURATION SECTION, which provides information about the computer and certain settings (such as for currency):

ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
   SOURCE-COMPUTER. IBM-ZOS.
   OBJECT-COMPUTER. VAX-6400.
   SPECIAL-NAMES.
   CURRENCY SIGN IS "$".

Next is the INPUT-OUTPUT SECTION. This is where a program makes connections to files:

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
 SELECT CUSTOMER-FILE ASSIGN TO CUSTMAST
     ORGANIZATION IS SEQUENTIAL.

CUSTOMER-FILE is the internal name, which is how we refer to the file within the COBOL code. This name is then associated with CUSTMAST, the external name that links the program to the file on the mainframe. To make this connection, you create a Data Definition (DD) statement with that name in Job Control Language (JCL), and the DD statement points to the actual file. Because DD names are limited to eight characters, the external name is kept short. What’s more, the internal name and the external name can be the same.

Why do all this? The main reason is that if the name of the file changes on disk, only the DD statement in the JCL needs changing. This can avoid a lot of reworking of the source code.

Finally, the SELECT command can have various parameters. In our example, ORGANIZATION describes how the records in the file are arranged. SEQUENTIAL means they are stored, and read, one record after another.

DATA DIVISION

When developing in COBOL, you will usually spend considerable time creating the data structures. This is done in the DATA DIVISION, which has three main sections: WORKING-STORAGE SECTION, FILE SECTION, and LINKAGE SECTION.

We will cover the first two next. The LINKAGE SECTION, which allows for the passing of data to outside programs that are called, is covered later in this book.

WORKING-STORAGE SECTION

In the WORKING-STORAGE SECTION, you create the variables, which hold the data. But in COBOL, these variables are usually referred to as fields.

Fields are also only global, which means they are accessible from anywhere in the program. This is in contrast with modern languages, which have both local and global variables. A local variable is available for only a particular function or block of code.

It’s true that having only global variables is not ideal and can cause problems. So when developing a COBOL program, it’s important to map how variables may change.

Now let’s take a look at an example of a data structure:

WORKING-STORAGE SECTION.
	01 INVOICE-NUMBER	PIC 99	 	PACKED-DECIMAL 		VALUE 0.

This is called an elementary item because it has only one field. The definition also has five parts: the level number, field name, PIC clause, USAGE clause, and VALUE clause. These are described next, followed by a discussion of data groups and special level numbers.

Level number

This is from 01 to 49 and refers to the hierarchy of the data. Each field has its own level number.

Note

The typical approach is to use increments of 5 for level numbers. This is to make it easier to add new level numbers if there is a change to the data structure.

Field name

This can be up to 30 characters long. This is to allow for descriptive names.

What’s more, you can have a field without a name, such as this:

01 		FILLER		PIC X(100).

This is a string of 100 characters. Why have something like this? It can be useful in creating reports.

PIC clause

Short for picture, this specifies the number of digits or characters a field can have. In the preceding example, PIC 99 can have two digits. But you can express it as PIC 9(2) as well. This PIC is known as a numeric and can hold only numbers. However, if you use S in front—say, as PIC S99—this will allow for + and –.

What about decimals? You use V for this. It’s known as an implied decimal. An example is PIC 99V99, which provides for two digits before the decimal point and two decimal places after it.

A variation on numeric fields is numeric edited fields. This variation is used to format a number, such as for currencies, dates, and so on. Here’s a look at some:

99/99/99

$999.99
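As a rough illustration of how these two edited pictures behave, the following sketch moves plain numeric fields into edited ones. The program name EDITDEMO and the field names are hypothetical, and the layout assumes fixed-format source.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EDITDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Plain numeric fields: V marks an implied decimal point.
       01  RAW-DATE        PIC 9(6)      VALUE 123121.
       01  RAW-AMOUNT      PIC 999V99    VALUE 45.5.
      * Numeric edited fields: the / $ and . are actually stored.
       01  EDITED-DATE     PIC 99/99/99.
       01  EDITED-AMOUNT   PIC $999.99.
       PROCEDURE DIVISION.
       100-FORMAT-FIELDS.
           MOVE RAW-DATE   TO EDITED-DATE
           MOVE RAW-AMOUNT TO EDITED-AMOUNT
      * Expected: 12/31/21
           DISPLAY EDITED-DATE
      * Expected: $045.50
           DISPLAY EDITED-AMOUNT
           GOBACK.
```

The formatting happens during the MOVE: the digits are lined up on the implied decimal point, and the insertion characters from the picture are filled in around them.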

The PIC clause can contain two other data types. One is alphanumeric, a string that can hold any character. It’s expressed as PIC X. Yes, this will have one character. Or if you have PIC XXX or PIC X(3), it will hold three characters.

You can then use an edited field for an alphanumeric. Here’s an example for a phone number:

XXXBXXXBXXXX

The B represents a blank.

The second is the alphabetic data type. It allows only uppercase and lowercase characters of A through Z. An example is PIC A(10). However, for the most part, COBOL programmers do not use alphabetic data types; instead, they code with the alphanumeric. But you still may see some alphabetics when updating older code.

No doubt, we need to learn quite a lot to understand PIC clauses completely. So it is probably a good idea to provide more examples to get a sense of the differences and how to use them; see Figure 4-2.

In the second row, the name Tom has all letters. This is why we use an alphanumeric. We also set the length at 3. But when it comes to names, it is often a good idea to allow for more space for longer ones.

The third row has an address. Even though it includes numbers, it also has characters. This is why we use an alphanumeric.

As for the next row, the hyphens are characters. Thus, we again use an alphanumeric, and the size is 12 to accommodate the size of a phone number.

The last two rows have numbers, but different types. The first is an integer, which is why we use PIC 9(4). The second one, though, has a decimal, so we use V for the implied decimal.

USAGE clause

This specifies the kind of data to be stored. If you omit this, the default is DISPLAY (this is also known as ZONED-DECIMAL DATA). As the name implies, this is for when you want to print the data.

The PACKED-DECIMAL, on the other hand, is for when you want the data used for math purposes. Granted, DISPLAY can do this as well, but the computer will need to do a translation, which will take more time and resources.

Older IBM systems have a different naming convention. A COMP-3 is the same as PACKED-DECIMAL. There is also COMP-4, which is for BINARY. This is used for indexing data and is usually not good for math, because there could be rounding differences. Let’s face it, when it comes to business transactions, every penny matters. This is why—when it comes to math for COBOL—it’s usually best to stick with PACKED-DECIMAL.
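One way to see the difference between the usages is to compare how much storage each takes. The sketch below is only an illustration: the program and field names are made up, and the sizes in the comments are what one would expect on IBM Enterprise COBOL (other compilers may differ, and the exact formatting of the numbers printed by FUNCTION LENGTH varies).

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. USAGEDMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Same picture, three usages. DISPLAY is the default.
       01  AMT-DISPLAY   PIC S9(7)                  VALUE 1234567.
       01  AMT-PACKED    PIC S9(7)  PACKED-DECIMAL  VALUE 1234567.
       01  AMT-BINARY    PIC S9(7)  BINARY          VALUE 1234567.
       PROCEDURE DIVISION.
       100-SHOW-SIZES.
      * Zoned decimal: one byte per digit (7 bytes on IBM).
           DISPLAY 'DISPLAY bytes: ' FUNCTION LENGTH(AMT-DISPLAY)
      * Packed decimal: two digits per byte plus a sign (4 bytes).
           DISPLAY 'PACKED bytes:  ' FUNCTION LENGTH(AMT-PACKED)
      * Binary: a fullword for 5 to 9 digits (4 bytes).
           DISPLAY 'BINARY bytes:  ' FUNCTION LENGTH(AMT-BINARY)
           GOBACK.
```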

VALUE clause

This is optional. But if you decide to use it, the VALUE clause will set the initial value. In the preceding example, we did this by setting INVOICE-NUMBER to 0.

You can use the VALUE clause for an alphanumeric as well. Here’s an example:

01 FIRST-NAME	PIC X(20) 	VALUE 'Jane'.

Notice we do not use PACKED-DECIMAL. This is because this is only for numerics.

The data group

Data in business is often put into groups. For example, you may have a customer record, which will have the name, address, phone number, credit card, and so on. COBOL has the ability to group data by using level numbers:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 CUSTOMER-RECORD.
  05  CUSTOMER-NUMBER   PIC 9(5).
  05  CUSTOMER-NAME     PIC X(20).
  05  CUSTOMER-ADDRESS  PIC X(50).

Notice that CUSTOMER-RECORD does not have a PIC. That’s because it is a group description, not a variable. But you can still use it to do interesting things. You can set everything to blank characters:

MOVE SPACES TO CUSTOMER-RECORD.

Or you can fill it with a character of your own choosing by using ALL:

MOVE ALL "*" TO CUSTOMER-RECORD

Then how do we change the value of the fields in the group? We can do the following:

MOVE 111 TO CUSTOMER-NUMBER

Or this:

MOVE "125 MAPLE AVENUE, LOS ANGELES, CA" TO CUSTOMER-ADDRESS

You can also use MOVE to create a customer record by using one line:

MOVE "12345Jane Smith          100 Main Street" TO CUSTOMER-RECORD.

You can get more granular when using groups. Here we provide more detail for CUSTOMER-NAME:

01 CUSTOMER-RECORD.
  05  CUSTOMER-NUMBER PIC 9(5).
  05  CUSTOMER-NAME.
		  10 FIRST-NAME PIC X(10).
		  10 LAST-NAME PIC X(10).
  05  CUSTOMER-ADDRESS PIC X(50).

Special level numbers

COBOL has various special level numbers. Two of them, 66 and 77, are rarely used. The one that still has relevance is 88. Level 88 is actually fairly unique for computer languages, as it allows you to streamline the use of conditions in your code.

To understand this, let’s consider an example. Suppose we have a customer base that has different levels of subscriptions: Free, Premium, and Enterprise. We can set this up using an 88 level number as follows:

01  CUSTOMER-CODE         PIC X.
  88  FREE-VERSION          VALUE 'F'.
  88  PREMIUM-VERSION       VALUE 'P'.
  88  ENTERPRISE-VERSION    VALUE 'E'.

In the PROCEDURE DIVISION, we can then designate the CUSTOMER-CODE by using the TRUE condition and evaluate it, as shown here:

SET PREMIUM-VERSION TO TRUE
IF (CUSTOMER-CODE = 'P')
    DISPLAY 'The customer code is Premium'
END-IF

By setting PREMIUM-VERSION to TRUE, we have selected P for CUSTOMER-CODE.

Note that you can use TRUE only when it comes to designating which 88 element you want. Setting it to FALSE would be ambiguous and result in an error.
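The IF statement can also test the condition name directly, which is usually the point of defining an 88 level. Here is a minimal sketch along those lines. The program name LVL88DMO is an assumption, and the fields repeat the ones from the example above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LVL88DMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CUSTOMER-CODE           PIC X.
           88  FREE-VERSION        VALUE 'F'.
           88  PREMIUM-VERSION     VALUE 'P'.
           88  ENTERPRISE-VERSION  VALUE 'E'.
       PROCEDURE DIVISION.
       100-CHECK-PLAN.
      * SET moves 'P' into CUSTOMER-CODE behind the scenes.
           SET PREMIUM-VERSION TO TRUE
      * Test the condition name instead of comparing to 'P'.
           IF PREMIUM-VERSION
               DISPLAY 'The customer code is Premium'
           END-IF
      * Expected: CUSTOMER-CODE now holds: P
           DISPLAY 'CUSTOMER-CODE now holds: ' CUSTOMER-CODE
           GOBACK.
```

With this style, the literal 'P' appears only once, in the DATA DIVISION, so a change to the code value does not require touching the logic.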

You can take other approaches with the level 88 condition. Let’s suppose you have data that has multiple values, such as for the grouping of regions for customers:

01 		CUSTOMER-REGION			PIC X(2).
  88	NORTH-AMERICA     VALUES 'US' 'CA'.
  88	EUROPE            VALUES 'UK' 'DE' 'FR'.
  88	ASIA              VALUES 'CN' 'JP'.

MOVE 'UK' TO CUSTOMER-REGION

IF EUROPE
	DISPLAY 'The customer is located in Europe'
END-IF

In this case, CUSTOMER-REGION has been set to UK, and the DISPLAY inside the IF is executed since UK is one of the values listed for EUROPE.

Next, you can use ranges for the 88 conditions. Here’s a look at how it is done:

01  COMMISSIONS PIC 9(2) VALUE ZERO.
  88 UNDER-QUOTA VALUE 0 THRU 10.
  88 QUOTA VALUE 11 THRU 30.
  88 OVER-QUOTA VALUE 31 THRU 99.

MOVE 5 TO COMMISSIONS

IF UNDER-QUOTA
	DISPLAY 'The sales are under the quota.'
END-IF

This is a range for a salesperson’s quota. As only five units were sold, this person was under quota.

All in all, an 88 condition can make the logic of a program easier to follow. It also usually requires less coding when changes are made.

FILE SECTION

The FILE SECTION may sound repetitive. As we’ve seen earlier in this chapter, the ENVIRONMENT DIVISION has robust capabilities for files.

So what is the FILE SECTION all about? You can set a filename to be used for running the program via JCL and also make the necessary associations with the data structures. The storage for this will be outside the COBOL program and will not be created until you use the OPEN command in the PROCEDURE DIVISION (you will learn more about this in Chapter 5).

Here’s what a FILE SECTION looks like:

FILE SECTION.
FD   CUSTMAST.
01   CUSTOMER-MASTER.
  05  CUST-NUM     PIC 9(2).
  05  CUST-FNAME   PIC X(20).
  05  CUST-LNAME   PIC X(20).
FD   SALES-REPORT.
01   PRINT-REPORT PIC X(132).

FD is an abbreviation for file description. The name that follows it is the internal name from the SELECT statement in the ENVIRONMENT DIVISION. This is to make sure the correct file is being accessed.
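To show how the two divisions fit together, here is a minimal sketch that pairs a SELECT with its FD. It reuses the internal name CUSTOMER-FILE from the earlier SELECT example. The program name FILEDEMO and the external name CUSTMAST are assumptions for illustration, and the program deliberately does no file I/O.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILEDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * CUSTOMER-FILE is the internal name; CUSTMAST is the external
      * name that a JCL DD statement would resolve to a real file.
           SELECT CUSTOMER-FILE ASSIGN TO CUSTMAST
               ORGANIZATION IS SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
      * The FD name repeats the internal name from the SELECT.
       FD  CUSTOMER-FILE.
       01  CUSTOMER-MASTER.
           05  CUST-NUM     PIC 9(2).
           05  CUST-FNAME   PIC X(20).
           05  CUST-LNAME   PIC X(20).
       PROCEDURE DIVISION.
       100-MAIN.
      * No OPEN or READ here; this sketch only shows the wiring.
           DISPLAY 'CUSTOMER-FILE is described but not opened'
           GOBACK.
```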

Constants

Constants are a standard feature in most modern languages. They allow for having fixed values (say, for the tax rate or pi).

But COBOL does not have constants. You instead have to use a field, which you can change at any time. And yes, this is certainly a drawback to the language.

However, COBOL does have figurative constants. These fixed values are built into the language: ZERO, SPACE, NULL, ALL, HIGH-VALUES, LOW-VALUES, and so on.
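A short sketch of a few figurative constants in use follows. The program name FIGCONST and the field names are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FIGCONST.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ITEM-COUNT      PIC 9(3)   VALUE 123.
       01  CUSTOMER-NAME   PIC X(10)  VALUE 'JANE'.
       01  DIVIDER-LINE    PIC X(10).
       PROCEDURE DIVISION.
       100-RESET-FIELDS.
      * Each figurative constant fills the whole receiving field.
           MOVE ZERO    TO ITEM-COUNT
           MOVE SPACES  TO CUSTOMER-NAME
           MOVE ALL '-' TO DIVIDER-LINE
      * Expected: 000
           DISPLAY ITEM-COUNT
      * Expected: [          ] (ten spaces between the brackets)
           DISPLAY '[' CUSTOMER-NAME ']'
      * Expected: ----------
           DISPLAY DIVIDER-LINE
           GOBACK.
```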

REDEFINES command

In some cases, you might want to define a field in different ways. This is where the REDEFINES command comes in:

01  PHONE-NUMBER      PIC 9(10).
01  PHONE-NUMBER-X    REDEFINES PHONE-NUMBER.
  05  AREA-CODE          PIC 9(3).
  05  TELEPHONE-PREFIX   PIC 9(3).
  05  LINE-NUMBER        PIC 9(4).

In this example, we have two fields for the phone number—one that is an elementary item and the other a data group, which provides more granularity.

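We can use REDEFINES to get an alphanumeric view of a field as well:

01  PRODUCT-PRICE		PIC	$ZZ9.99.
01  PRODUCT-PRICE-X		REDEFINES PRODUCT-PRICE PIC X(7).

We first set PRODUCT-PRICE as an edited numeric, which occupies seven characters. Then we redefine the same seven bytes as an alphanumeric so the formatted value can be handled as plain text, such as when placing it in a report line. Keep in mind that the bytes still contain the currency symbol and the decimal point, so calculations should be done on an unedited numeric field instead.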

When you use REDEFINES, both fields refer to the same bytes in memory. A change in one will be reflected in the other. Also, both must have the same level numbers, and you can use the VALUE clause for only the first one.
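The shared-storage behavior can be seen with the phone number fields from the first example. This is a minimal sketch. The program name REDEFDMO and the sample number are made up.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REDEFDMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PHONE-NUMBER          PIC 9(10).
       01  PHONE-NUMBER-X        REDEFINES PHONE-NUMBER.
           05  AREA-CODE         PIC 9(3).
           05  TELEPHONE-PREFIX  PIC 9(3).
           05  LINE-NUMBER       PIC 9(4).
       PROCEDURE DIVISION.
       100-SPLIT-PHONE.
      * Writing to PHONE-NUMBER also fills the three subfields,
      * because both definitions describe the same ten bytes.
           MOVE 2125550123 TO PHONE-NUMBER
      * Expected: (212) 555-0123
           DISPLAY '(' AREA-CODE ') ' TELEPHONE-PREFIX '-' LINE-NUMBER
           GOBACK.
```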

PROCEDURE DIVISION

In the PROCEDURE DIVISION, you perform the logic of the program. True, you could just write a long list of commands, but this will hurt readability. This is why it is recommended to write COBOL in a structured manner. You break up the code into chunks, which in other languages are known as subroutines, functions, or procedures. It’s a good idea for each of these to perform a certain task.

Let’s look at an example:

IDENTIFICATION DIVISION.
PROGRAM-ID.  PRINTNAME.
ENVIRONMENT DIVISION.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 USER-NAME  PIC X(15).
PROCEDURE DIVISION.
100-GET-USER-INPUT.
    DISPLAY "Enter the user name"
    ACCEPT USER-NAME.
200-PRINT-USER-NAME.
    DISPLAY "The user name is " USER-NAME.
300-PRINT-PROGRAM-END.
    GOBACK.

This is an easy program, but it shows how to create a modular structure. A COBOL convention is to divide a program into paragraphs. Each has a header, such as 100-GET-USER-INPUT, that should describe the task. In this example, the paragraphs simply run one after another from top to bottom. In larger programs, where a main paragraph invokes the others, their physical order does not affect the logic. Even so, the typical approach in COBOL is to keep them in the same sequence as the workflow and to give each paragraph a number. This also helps if you are using a modern IDE, which will show an outline view of the code that is based on the order of the paragraphs.

As you can see in our code sample, the commands in a paragraph do not have a period except at the end. This is known as period-less coding, and it helps avoid problems like unintended terminations of a paragraph.

Note

The DISPLAY command is often used for debugging. It’s an easy way to print out fields to see if everything is being processed correctly. An example is DISPLAY "X field is = " X.

It is a COBOL convention to have a final paragraph (here, 300-PRINT-PROGRAM-END) that contains no processing logic and simply marks the end of the program. Its only statement, GOBACK, terminates the program. For the rest of the chapter, we will look at the main types of commands and workflows for the PROCEDURE DIVISION.

MOVE command

A modern language has variable assignments. For example, in Python you can do something like this:

price = 100

But COBOL has no variable assignments. Then what’s the alternative? How can you set the values for a field?

You can use the MOVE command. Whereas an assignment in Python copies the value from right to left, MOVE works from left to right, as seen in this statement, which would be in the PROCEDURE DIVISION:

MOVE "SMITH" TO LAST-NAME.

Or you can move the value of one field to another field:

MOVE LAST-NAME TO LAST-NAME-2.

The value of LAST-NAME will be copied to LAST-NAME-2. That is, LAST-NAME will still have SMITH.

When using the MOVE statement, you can work with multiple fields. For example, this will copy 0 to COUNTER, X, and Y:

MOVE   0   TO  COUNTER X Y

But you need to be careful when using MOVE with fields that have different types. Let’s first take a look when working with an alphanumeric. We will create a field as follows in the DATA DIVISION:

01	LAST-NAME	     PIC X(7)     VALUE	"COOK".

Then we will do the following MOVE in the PROCEDURE DIVISION:

MOVE  "Davis"   TO    LAST-NAME.

LAST-NAME will now have the value of Davis. Yet there is something else to keep in mind. Since Davis has fewer characters than the length of PIC X, the field will be padded on the right with spaces (the * represents a space). You can see the output in Figure 4-3.

This can cause formatting issues, such as with the spacing on a report. But there are ways to correct for this, which you will learn about later in this book.

What if we have a value that has more characters than PIC X allows? This can cause an even bigger problem. Suppose we have this:

MOVE 	"Dumbledore"	TO		LAST-NAME.

This results in Figure 4-4.

As you can see, the name is cut off—which is known as truncation. This is why it is critical to have well-thought-out data structures.
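Both effects, padding and truncation, can be reproduced with a short sketch. The program name MOVEALPH and the LONG-NAME field are assumptions added for illustration. The brackets in the DISPLAY statements just make the trailing spaces visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVEALPH.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  LAST-NAME   PIC X(7)   VALUE 'COOK'.
       01  LONG-NAME   PIC X(10)  VALUE 'Dumbledore'.
       PROCEDURE DIVISION.
       100-SHOW-MOVES.
           MOVE 'Davis' TO LAST-NAME
      * Expected: [Davis  ] (padded on the right with two spaces)
           DISPLAY '[' LAST-NAME ']'
           MOVE LONG-NAME TO LAST-NAME
      * Expected: [Dumbled] (the last three characters are dropped)
           DISPLAY '[' LAST-NAME ']'
           GOBACK.
```

Alphanumeric moves are left-justified, so it is always the rightmost characters that are lost.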

The same goes for numerics. Suppose we have this in the DATA DIVISION:

01 	PRICE		PIC	9(3)V99.

This means we have a field with five digits, which includes two decimal places. Now let’s look at some MOVE statements:

MOVE	57.2 		TO	PRICE

Figure 4-5 shows what the compiler allocates.

If a number has fewer digits than the PIC 9 provides, 0s will be added to fill the unused positions on either side of the decimal point.

We can also have truncation, such as with this:

MOVE	8803.257 		TO	PRICE

Figure 4-6 shows the result.

The number is first aligned along the decimal point. Since 8803 is too big for the three digit positions provided, the leading 8 is not included. The 7 at the end of the decimal portion is excluded as well, and no rounding occurs. The field ends up holding 803.25.
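The two numeric moves can be checked with the following sketch. The program name MOVENUM and the fields SMALL-AMOUNT, BIG-AMOUNT, and PRICE-OUT are hypothetical. The values are moved from fields rather than literals, and an edited field is used for output, because compilers differ in how they display an implied decimal point.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MOVENUM.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SMALL-AMOUNT  PIC 9(2)V9    VALUE 57.2.
       01  BIG-AMOUNT    PIC 9(4)V999  VALUE 8803.257.
       01  PRICE         PIC 9(3)V99.
      * Edited field used only to make the decimal point visible.
       01  PRICE-OUT     PIC 999.99.
       PROCEDURE DIVISION.
       100-SHOW-MOVES.
           MOVE SMALL-AMOUNT TO PRICE
           MOVE PRICE TO PRICE-OUT
      * Expected: 057.20 (zero-filled on both sides)
           DISPLAY PRICE-OUT
           MOVE BIG-AMOUNT TO PRICE
           MOVE PRICE TO PRICE-OUT
      * Expected: 803.25 (leading 8 and trailing 7 are dropped)
           DISPLAY PRICE-OUT
           GOBACK.
```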

Truncation is common when handling math. Thus, it is important to think of the potential outliers with the calculations when putting together the data structures. The ON SIZE ERROR command can help avoid the problems, and we’ll look at this later in this chapter.

You can use MOVE where there are different PICs so long as the sending field is an unsigned integer. These are the options:

Alphanumeric to numeric
Alphanumeric to numeric edited
Numeric to alphanumeric

Here’s an example of the first one. Enter this for the DATA DIVISION:

01		ALPHA-NUM		PIC X(2)		VALUE '50'.
01		NUM-VALUE		PIC 9(2)		VALUE 0.

Then use this in the PROCEDURE DIVISION:

MOVE ALPHA-NUM TO NUM-VALUE

The result will be that NUM-VALUE will have the value of 50.

Math commands

COBOL has two main approaches with math. It has a set of commands like ADD, SUBTRACT, MULTIPLY, and DIVIDE. Then COMPUTE allows for more sophisticated calculations.

ADD, SUBTRACT, MULTIPLY, and DIVIDE

To see how the ADD, SUBTRACT, MULTIPLY, and DIVIDE commands work, let’s first have the following declarations for the DATA DIVISION:

01	WITHDRAWAL     PIC 9(3)	VALUE 0.
01	DEPOSIT        PIC 9(3)	VALUE 0.
01	BALANCE        PIC 9(3)	VALUE 0.

Then with ADD, we can do this in the PROCEDURE DIVISION:

MOVE 50 TO DEPOSIT
ADD DEPOSIT TO BALANCE

BALANCE will now be 50. Or we can do this with our DEPOSIT:

ADD 25 TO DEPOSIT GIVING BALANCE

With this, 25 and the value of DEPOSIT are added together, and the result, 75, replaces the value of BALANCE. DEPOSIT itself is unchanged.

Now let’s take a look at SUBTRACT:

MOVE 60 TO WITHDRAWAL
SUBTRACT WITHDRAWAL FROM BALANCE

Since BALANCE had been set to 75, the new result would be 15.

Suppose we have three checks for the amounts of 100, 125, and 359. We can use the ADD command this way:

ADD 100 125 359 GIVING BALANCE

The numbers add up to 584, and BALANCE will be replaced with this number. (If we had written TO DEPOSIT before GIVING, the current value of DEPOSIT would have been added to the total as well.) You can also use GIVING with SUBTRACT. Say we have three withdrawals for 50, 125, and 200 as well as a deposit of 500:

MOVE 500 TO DEPOSIT
SUBTRACT 50 125 200 FROM DEPOSIT GIVING BALANCE

The total withdrawals of 375 will be subtracted from 500, giving the result of 125. BALANCE will then be equal to this amount.
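The ADD and SUBTRACT steps in this section can be run in sequence with a sketch like the one below. The program name ADDSUB is an assumption. Note that the three-check example here is written without TO DEPOSIT, so that only the three amounts are summed.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ADDSUB.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WITHDRAWAL   PIC 9(3)  VALUE 0.
       01  DEPOSIT      PIC 9(3)  VALUE 0.
       01  BALANCE      PIC 9(3)  VALUE 0.
       PROCEDURE DIVISION.
       100-RUN-EXAMPLES.
           MOVE 50 TO DEPOSIT
           ADD DEPOSIT TO BALANCE
      * Expected: 050
           DISPLAY BALANCE
           ADD 25 TO DEPOSIT GIVING BALANCE
      * Expected: 075 (DEPOSIT itself is still 50)
           DISPLAY BALANCE
           MOVE 60 TO WITHDRAWAL
           SUBTRACT WITHDRAWAL FROM BALANCE
      * Expected: 015
           DISPLAY BALANCE
           ADD 100 125 359 GIVING BALANCE
      * Expected: 584
           DISPLAY BALANCE
           MOVE 500 TO DEPOSIT
           SUBTRACT 50 125 200 FROM DEPOSIT GIVING BALANCE
      * Expected: 125
           DISPLAY BALANCE
           GOBACK.
```

The pattern to notice is that without GIVING the last operand is both an input and the place where the result is stored, whereas with GIVING the operands are left alone and only the GIVING field changes.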

Next, let’s take a look at the MULTIPLY command. We’ll first create two fields in the DATA DIVISION:

01  INCOME       PIC 9(5)V99		VALUE 500.
01  NET-INCOME   PIC 9(5)V99		VALUE 0.

Assume that the tax rate is 10%. Then in the PROCEDURE DIVISION, we can have this:

MULTIPLY .10 BY INCOME

The result, which is 50, will be put into INCOME. Or we can use the GIVING command:

MULTIPLY .10 BY INCOME GIVING NET-INCOME

NET-INCOME will be replaced by 50.

Division has some differences, though. You can use two main approaches: DIVIDE INTO or DIVIDE BY.

Here’s a look at the first, which has this for the DATA DIVISION:

01  SALES		PIC 9(5)		VALUE 10000.
01  UNITS		PIC 9(4)		VALUE 500.
01  SALES-PER-UNIT	PIC 9(5)		VALUE 0.

Then this is for the PROCEDURE DIVISION:

DIVIDE UNITS INTO SALES

With this, SALES will now be equal to 20. Or we can use the GIVING command, which will give us the same result but put it in SALES-PER-UNIT:

DIVIDE UNITS INTO SALES GIVING SALES-PER-UNIT

Suppose we change the values to the following in the PROCEDURE DIVISION:

MOVE 2000 TO SALES
MOVE 192 TO UNITS

We can then do the calculation this way:

DIVIDE SALES BY UNITS GIVING SALES-PER-UNIT ROUNDED

The result of this formula is 10.41666. But since the PIC 9 for SALES-PER-UNIT does not have a decimal, we have instead rounded the number—which gets us 10.

We can also get the remainder of a division. In the DATA DIVISION, let’s have the following:

01  QUOTIENT    PIC 999		VALUE 0.
01  REM         PIC 999		VALUE 0.

Then we have this for the PROCEDURE DIVISION:

DIVIDE 100 BY 9 GIVING QUOTIENT REMAINDER REM.

In this, QUOTIENT is 11 and REM is 1.
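Here is a small sketch that collects the DIVIDE variations from this section into one program. The program name DIVDEMO is hypothetical, and the fields are the ones declared above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DIVDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SALES            PIC 9(5)  VALUE 10000.
       01  UNITS            PIC 9(4)  VALUE 500.
       01  SALES-PER-UNIT   PIC 9(5)  VALUE 0.
       01  QUOTIENT         PIC 999   VALUE 0.
       01  REM              PIC 999   VALUE 0.
       PROCEDURE DIVISION.
       100-RUN-EXAMPLES.
      * INTO: the first operand is the divisor.
           DIVIDE UNITS INTO SALES GIVING SALES-PER-UNIT
      * Expected: 00020
           DISPLAY SALES-PER-UNIT
           MOVE 2000 TO SALES
           MOVE 192  TO UNITS
      * BY: the first operand is the dividend.
           DIVIDE SALES BY UNITS GIVING SALES-PER-UNIT ROUNDED
      * Expected: 00010 (10.41666... rounds down)
           DISPLAY SALES-PER-UNIT
           DIVIDE 100 BY 9 GIVING QUOTIENT REMAINDER REM
      * Expected: 011 001
           DISPLAY QUOTIENT ' ' REM
           GOBACK.
```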

COMPUTE

The use of math commands like ADD and SUBTRACT is unusual compared with modern languages. But the COMPUTE command looks more like what you would see in something like Python or C#.

To see how the COMPUTE command works, let’s first set up some fields in the DATA DIVISION:

01  DISCOUNTED-PRICE     PIC 9(5)      VALUE 0.
01  RETAIL-PRICE         PIC 9(5)      VALUE 0.
01  DISCOUNT             PIC 9(2)V99   VALUE 0.

Now let’s do the calculation in the PROCEDURE DIVISION:

MOVE 0.25 TO DISCOUNT
MOVE 100 TO RETAIL-PRICE
COMPUTE DISCOUNTED-PRICE = RETAIL-PRICE * (1 - DISCOUNT)

We set DISCOUNT for this product to 25% and RETAIL-PRICE to 100. Then with the COMPUTE formula, we subtract DISCOUNT from 1 and multiply the result by RETAIL-PRICE. This gives us the DISCOUNTED-PRICE, which is 75.

The mathematical operators for COBOL are similar to what you would see in other modern languages. You can find them in Table 4-1.

Table 4-1. The mathematical operators for COBOL
Mathematical operator		Function
`+`		Addition
`-`		Subtraction
`/`		Division
`*`		Multiplication
`**`		Exponent

Note

The use of an exponent is expressed like this: COMPUTE A = 2 ** 2. This is 2 to the second power, or 2 squared. Note that COBOL expects a space on each side of an arithmetic operator.

If you use parentheses, the calculations within them will be executed first. After this, exponents are evaluated, then multiplication and division (from left to right), and finally addition and subtraction (from left to right). For the most part, programmers rely on parentheses to make the intended order explicit.
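The effect of the evaluation order can be seen in a few one-line formulas. This is a minimal sketch, with PRECDEMO and CALC-RESULT as made-up names.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRECDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  CALC-RESULT   PIC 9(3)  VALUE 0.
       PROCEDURE DIVISION.
       100-RUN-EXAMPLES.
      * Multiplication is done before addition: 2 + 12.
           COMPUTE CALC-RESULT = 2 + 3 * 4
      * Expected: 014
           DISPLAY CALC-RESULT
      * Parentheses force the addition to happen first: 5 * 4.
           COMPUTE CALC-RESULT = (2 + 3) * 4
      * Expected: 020
           DISPLAY CALC-RESULT
      * The exponent is evaluated before the multiplication: 2 * 9.
           COMPUTE CALC-RESULT = 2 * 3 ** 2
      * Expected: 018
           DISPLAY CALC-RESULT
           GOBACK.
```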

A common issue with COMPUTE arises when the fields are not large enough to hold the numbers. As we’ve seen, this will cause truncation, and a way to deal with this is to use the ON SIZE ERROR clause. Let’s look at an example, with the following in the DATA DIVISION:

01		SALES		PIC 9(4)		VALUE 0.
01		PRICE		PIC 9(1)		VALUE 5.
01		UNITS		PIC 9(4)		VALUE 5000.

Then here’s the PROCEDURE DIVISION:

COMPUTE SALES = PRICE * UNITS
 ON SIZE ERROR
  DISPLAY "The amount is too large for the SALES field."
END-COMPUTE

The result of this formula is 25000. However, the SALES field can hold up to only four digits. Because of this, the ON SIZE ERROR clause is triggered and the message is displayed. This can be an effective way to catch an overflow instead of letting a truncated value flow through the rest of the program.
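
For a fuller picture, here is a sketch that also uses the NOT ON SIZE ERROR phrase, which runs when the result does fit. The program name and the second calculation with 100 units are assumptions added for illustration:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SIZEERR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SALES  PIC 9(4)  VALUE 0.
       01  PRICE  PIC 9(1)  VALUE 5.
       01  UNITS  PIC 9(4)  VALUE 5000.
       PROCEDURE DIVISION.
      * 5 * 5000 = 25000 needs five digits, so SALES overflows.
           COMPUTE SALES = PRICE * UNITS
               ON SIZE ERROR
                   DISPLAY "The amount is too large for SALES."
               NOT ON SIZE ERROR
                   DISPLAY "Sales: " SALES
           END-COMPUTE
      * A smaller order fits, so the NOT ON SIZE ERROR branch runs.
           MOVE 100 TO UNITS
           COMPUTE SALES = PRICE * UNITS
               ON SIZE ERROR
                   DISPLAY "The amount is too large for SALES."
               NOT ON SIZE ERROR
                   DISPLAY "Sales: " SALES
           END-COMPUTE
           GOBACK.
```

The END-COMPUTE terminator marks where the conditional part ends, so statements that follow are not accidentally treated as part of the error handling.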

Math Functions

COBOL comes with a rich set of 42 mathematical functions, such as for finance, statistics, and trigonometry. You use the FUNCTION command to execute them, and there may be zero, one, two, or more arguments. Table 4-2 shows a list of common functions.

Table 4-2. Common mathematical functions in COBOL
Function		What it does
`SUM`		Sum of the arguments
`SQRT`		Square root of an argument
`MEAN`		Average of the arguments
`SIN`		Sine of an argument
`VARIANCE`		Variance of the arguments
`STANDARD-DEVIATION`		Standard deviation of the arguments
`RANGE`		Maximum argument minus the minimum argument
`MAX`		Value of the largest argument
`MIN`		Value of the smallest argument
`LOG`		Natural logarithm of an argument

Let’s take a look at some examples, which are placed in the PROCEDURE DIVISION:

DISPLAY FUNCTION SUM (50 59 109 32 99)
DISPLAY FUNCTION SQRT (100)

The results are 349 and 10, respectively. You can also use the functions with the COMPUTE command.
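
As a sketch of using functions inside COMPUTE, the following program averages three monthly sales figures and finds the largest one. The program name, the field names, and the sales values are all hypothetical:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FUNCDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical fields for monthly sales figures.
       01  MONTH-1-SALES  PIC 9(4)     VALUE 1200.
       01  MONTH-2-SALES  PIC 9(4)     VALUE 1500.
       01  MONTH-3-SALES  PIC 9(4)     VALUE 900.
       01  AVERAGE-SALES  PIC 9(4)V99  VALUE 0.
       01  BEST-MONTH     PIC 9(4)     VALUE 0.
       PROCEDURE DIVISION.
      * FUNCTION calls can appear inside a COMPUTE expression.
           COMPUTE AVERAGE-SALES ROUNDED =
               FUNCTION MEAN (MONTH-1-SALES MONTH-2-SALES
                              MONTH-3-SALES)
           COMPUTE BEST-MONTH =
               FUNCTION MAX (MONTH-1-SALES MONTH-2-SALES
                             MONTH-3-SALES)
           DISPLAY "Average sales: " AVERAGE-SALES
           DISPLAY "Best month   : " BEST-MONTH
           GOBACK.
```

With these values, the average is 1200 and the best month is 1500. The ROUNDED phrase is included because some functions return results with many decimal places.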

Conditionals

The IF/THEN/ELSE construct is at the heart of most languages. It’s a key part of controlling the flow of a computer program and is based on Boolean logic, which looks at whether something is either true or false.

But COBOL has its own approach—and it can be tricky. So a good way to explain conditionals is to consider some examples:

DATA DIVISION.
WORKING-STORAGE SECTION.
01  TEMPERATURE PIC 9(3) VALUE 0.
PROCEDURE DIVISION.
DISPLAY "Enter the temperature : "
ACCEPT TEMPERATURE
IF TEMPERATURE <= 32 THEN
  DISPLAY "It is freezing"
ELSE
  DISPLAY "It is not freezing"
END-IF

GOBACK.

In COBOL, this is called a general relation condition. Despite its long name, this condition is straightforward. The ACCEPT command takes in user input, which in this case is the current temperature. If the temperature is 32 degrees or less, it is freezing.

COBOL has English-like versions of conditionals. For example, instead of using >, you can use GREATER THAN. Table 4-3 provides a list of conditionals.

Table 4-3. Conditionals in COBOL
Shorthand		English-like version
`>`		`GREATER THAN`
`<`		`LESS THAN`
`=`		`EQUAL TO`
`>=`		`GREATER THAN OR EQUAL TO`
`<=`		`LESS THAN OR EQUAL TO`
`NOT >`		`NOT GREATER THAN`
`NOT <`		`NOT LESS THAN`
`NOT =`		`NOT EQUAL TO`
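
To show the English-like versions in use, here is a sketch of the temperature program rewritten with words instead of symbols. The program name and the second IF statement are additions for illustration:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FREEZE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  TEMPERATURE  PIC 9(3)  VALUE 0.
       PROCEDURE DIVISION.
           DISPLAY "Enter the temperature : "
           ACCEPT TEMPERATURE
      * Same test as TEMPERATURE <= 32, spelled out in words.
           IF TEMPERATURE IS LESS THAN OR EQUAL TO 32 THEN
               DISPLAY "It is freezing"
           ELSE
               DISPLAY "It is not freezing"
           END-IF
      * NOT GREATER THAN is another way to write the same check.
           IF TEMPERATURE IS NOT GREATER THAN 32 THEN
               DISPLAY "At or below the freezing point"
           END-IF
           GOBACK.
```

The word IS is optional in both conditions. It is included here only because it makes the statements read more like a sentence.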

You can write more-complex conditionals by using AND and OR, which are called compounded conditional expressions. To see how this works, here’s an example program for the approval of invoices:

DATA DIVISION.
WORKING-STORAGE SECTION.
01  INVOICE-AMOUNT PIC 9(4) VALUE 0.
PROCEDURE DIVISION.
DISPLAY "Enter the invoice amount : "
ACCEPT INVOICE-AMOUNT
IF INVOICE-AMOUNT > 0 AND INVOICE-AMOUNT < 5000 THEN
  DISPLAY "No approval is needed"
ELSE
  DISPLAY "There must be approval"
END-IF
GOBACK.

In this code, if the invoice is more than $0 and less than $5,000, no approval is needed.

Again, this is simple and similar to what you would see in other languages. So then what about the different types of conditionals? One is the class condition, which tests whether the contents of a field are NUMERIC or ALPHABETIC. The word class does not refer to object-oriented programming.

Another is the condition-name condition. In fact, earlier in this chapter, we saw how it was used. It involved the 88 level numbers to set forth a range of values or text, and a condition would be triggered if a value falls within it.
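
As a reminder of how the 88 level numbers work in a condition, here is a minimal sketch. The program name, the TEST-SCORE field, and the PASSING and FAILING ranges are hypothetical:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRADES.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical field with 88 level condition names.
       01  TEST-SCORE  PIC 9(3)  VALUE 0.
           88  FAILING  VALUE 0 THRU 59.
           88  PASSING  VALUE 60 THRU 100.
       PROCEDURE DIVISION.
           DISPLAY "Enter the test score : "
           ACCEPT TEST-SCORE
      * Each name is true when TEST-SCORE falls in its range.
           IF PASSING
               DISPLAY "This is a passing score"
           ELSE
               IF FAILING
                   DISPLAY "This is a failing score"
               ELSE
                   DISPLAY "The score is out of range"
               END-IF
           END-IF
           GOBACK.
```

The IF statements never mention TEST-SCORE directly. The 88 level names carry the ranges, which keeps the PROCEDURE DIVISION easy to read.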

Something else we can use for a condition is the EVALUATE command. It is similar to a switch/case construct that is found in other languages. When there are a multitude of possibilities, EVALUATE can be much easier to use than a simple IF/THEN/ELSE structure.

Let’s suppose we are creating an app to track customer information and want a way to designate the type of business entity:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 BUSINESS-NUMBER 		PIC 99 VALUE ZERO.
01 BUSINESS-TYPE 	 	PIC X(20).
PROCEDURE DIVISION.
DISPLAY "Enter the business number"
ACCEPT BUSINESS-NUMBER
EVALUATE BUSINESS-NUMBER
WHEN 1 MOVE "Sole Proprietor" TO BUSINESS-TYPE
WHEN 2 MOVE "Single-Member LLC" TO BUSINESS-TYPE
WHEN 3 MOVE "S Corporation" TO BUSINESS-TYPE
WHEN 4 MOVE "C-Corporation" TO BUSINESS-TYPE
WHEN 5 MOVE "Partnership" TO BUSINESS-TYPE
WHEN 6 MOVE "Trust/Estate" TO BUSINESS-TYPE
WHEN OTHER MOVE 0 TO BUSINESS-TYPE
END-EVALUATE
DISPLAY "The business type is " BUSINESS-TYPE
GOBACK.

In this program, the user will input from 1 to 6, and each will correspond to a type of business. The EVALUATE statement will then go to the WHEN clause for the number entered and change the value of BUSINESS-TYPE. The last condition is WHEN OTHER, which is the default branch if the user enters something that is not within the range of values.

After the matching WHEN clause runs, the program will go to the END-EVALUATE statement, and the DISPLAY statement will be executed.

The EVALUATE structure can involve complicated logic. Because of this, it can be a good idea to first create a decision table. For example, let’s say we are creating a commission structure like the one shown in Figure 4-7.

Figure 4-7. A decision table can be helpful when creating conditionals

This will make it easier to put together the conditional logic:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 COMMISSIONS PIC 99 VALUE ZERO.
 88 UNDER-QUOTA VALUE 0 THRU 10.
 88 QUOTA VALUE 11 THRU 30.
 88 OVER-QUOTA VALUE 31 THRU 99.
PROCEDURE DIVISION.
DISPLAY "Enter the number of units sold"
ACCEPT COMMISSIONS
EVALUATE TRUE
 WHEN UNDER-QUOTA
  DISPLAY "Commission is 10% and this is under the quota."
 WHEN QUOTA
  DISPLAY "Commission is 15% and this meets the quota."
 WHEN OVER-QUOTA
  DISPLAY "Commission is 20% and this is over the quota."
 WHEN OTHER
  DISPLAY "This is the default"
END-EVALUATE.
GOBACK.

In the DATA DIVISION, we use a field that has 88 level numbers. They define three ranges of units sold, and each range earns the salesperson a different commission. In the PROCEDURE DIVISION, we set up a structure that has EVALUATE TRUE and then specifies the three conditions for UNDER-QUOTA, QUOTA, and OVER-QUOTA. If the user inputs 15, the message for QUOTA will be executed, and so on.

With the EVALUATE statement, it’s possible to include compounded conditions as well:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 BUSINESS-NUMBER   PIC 99 VALUE ZERO.
01 VIP-CUSTOMER      PIC X.
01 UNITS             PIC 9(3).
01 DISCOUNT          PIC 9(2)V9(2).
PROCEDURE DIVISION.
DISPLAY "Enter the number of units sold"
ACCEPT UNITS
DISPLAY "A VIP customer (Y/N)?"
ACCEPT VIP-CUSTOMER
EVALUATE UNITS ALSO VIP-CUSTOMER
 WHEN 1 THRU 20 ALSO "Y"
  MOVE .20 TO DISCOUNT
 WHEN 21 THRU 50 ALSO "Y"
  MOVE .25 TO DISCOUNT
 WHEN GREATER THAN 50 ALSO "Y"
  MOVE .30 TO DISCOUNT
 WHEN OTHER
  MOVE 0 TO DISCOUNT
END-EVALUATE
DISPLAY "The discount is " DISCOUNT
GOBACK.

By using the keyword ALSO, we can string together different conditions. In our example, this includes one for the units sold and whether a customer is part of a VIP program.

COBOL allows you to have nested IF/THEN/ELSE blocks. While there are no limits on how many you can have, it’s usually best to not go beyond three. Otherwise, the code could be extremely hard to track.

Here is an example of a nested IF/THEN/ELSE block:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 USERNAME		PIC X(20).
01 USER-PASSWORD	PIC X(20).
PROCEDURE DIVISION.
DISPLAY "Enter your user name"
ACCEPT USERNAME
DISPLAY "Enter your password"
ACCEPT USER-PASSWORD
IF USERNAME = "Tom68"
 IF USER-PASSWORD = "12345"
  DISPLAY "Login successful!"
 ELSE
  DISPLAY "Incorrect password."
 END-IF
ELSE
 DISPLAY "Incorrect user name."
END-IF.
GOBACK.

The first condition checks USERNAME. If it is correct, the nested condition checks the password. If USERNAME is not correct, the ELSE at the bottom will be executed.

When putting together nested IF/THEN/ELSE conditions, it is important that you line up the code properly and terminate each block with an END-IF. If not, the code will likely give wrong results. For example, if the nested IF had no ELSE of its own and we left out its END-IF, the ELSE meant for USERNAME would pair with the nested IF instead, and it would be executed when the password is not correct.
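
The following sketch shows how a missing END-IF can change which IF an ELSE belongs to. The values are hard-coded rather than entered by the user, the program name is hypothetical, and the field is named USER-PASSWORD on the assumption that PASSWORD is a reserved word for the compiler being used (as it is in IBM Enterprise COBOL):

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DANGLE.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Values are hard-coded: right user name, wrong password.
       01  USERNAME       PIC X(20)  VALUE "Tom68".
       01  USER-PASSWORD  PIC X(20)  VALUE "wrong".
       PROCEDURE DIVISION.
      * The inner IF has no END-IF, so the ELSE pairs with the
      * inner IF even though it is lined up under the outer one.
           IF USERNAME = "Tom68"
               IF USER-PASSWORD = "12345"
                   DISPLAY "Login successful!"
           ELSE
               DISPLAY "Incorrect user name."
           END-IF.
           GOBACK.
```

Although the user name is correct, this program displays "Incorrect user name." because the ELSE is tied to the password check. The indentation suggests one structure, but the compiler sees another.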

Loops

The loop is a part of every language. As its name indicates, this structure allows you to iterate through something, such as a dataset. In COBOL, the command for a loop is PERFORM, which has several variations.

The first one we will look at is PERFORM TIMES, which is similar to a for loop in other languages. This means it is executed a fixed number of times, which can be expressed as a field or a literal:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 COUNTER PIC 9(1) VALUE 0.
PROCEDURE DIVISION.
PERFORM 5 TIMES
 ADD 1 TO COUNTER
 DISPLAY "Loop number " COUNTER
END-PERFORM
GOBACK.

This will loop five times, and the COUNTER field will be incremented by one for each pass. The value will be printed.
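
The number of passes can also come from a field. In this sketch, REPEAT-COUNT is a hypothetical field that holds the loop count, and the program name is likewise made up:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REPEATS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * REPEAT-COUNT is a hypothetical field holding the loop count.
       01  REPEAT-COUNT  PIC 9(1)  VALUE 3.
       01  COUNTER       PIC 9(1)  VALUE 0.
       PROCEDURE DIVISION.
      * The number of passes comes from a field, not a literal.
           PERFORM REPEAT-COUNT TIMES
               ADD 1 TO COUNTER
               DISPLAY "Loop number " COUNTER
           END-PERFORM
           GOBACK.
```

Since REPEAT-COUNT is 3, the body runs three times. In a real program, the field could be set from user input or from a record in a file.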

Another way to use PERFORM is to loop a paragraph or subroutine:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 COUNTER PIC 9(1) VALUE 0.
PROCEDURE DIVISION.
100-PARAGRAPH-LOOP.
 PERFORM 200-PRINT-COUNTER 5 TIMES.
 GOBACK.
200-PRINT-COUNTER.
 ADD 1 TO COUNTER
 DISPLAY "Loop number " COUNTER.

This program does the same thing as the first one, but it has a modular structure. We have two paragraphs. The first one will use PERFORM to loop through the second paragraph.

Note

COBOL does have a GO TO command that can jump to a paragraph. But using it is a bad idea, primarily because the command does not return you to where you called it. Because of this, the code can get chaotic and turn into “spaghetti code.”

Next, we can set a condition for a loop. For example, we can use PERFORM UNTIL just as we use the while loop in other languages:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 COUNTER PIC 9(1) VALUE 0.
PROCEDURE DIVISION.
PERFORM UNTIL COUNTER >= 5
 ADD 1 TO COUNTER
 DISPLAY "Loop number " COUNTER
END-PERFORM
GOBACK.

This program will count from 1 to 5 and then stop. But you have to be careful with PERFORM UNTIL. The condition is tested before each pass, so if COUNTER is already 5 or more, looping will not happen.

But there is an alternative: PERFORM WITH TEST AFTER. This guarantees that there will be at least one loop. This structure is similar to the do-while loop in other languages. So with our program, we can do the following:

PERFORM WITH TEST AFTER UNTIL COUNTER >= 5
 ADD 1 TO COUNTER
 DISPLAY "Loop number " COUNTER
END-PERFORM

With this, even if COUNTER is already 5 or more, the code will be executed once and COUNTER will be incremented by 1.
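
To compare the two forms side by side, here is a sketch in which COUNTER deliberately starts at 7. The starting value, the program name, and the message text are assumptions for illustration:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TESTAFT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * COUNTER starts above the limit of 5 on purpose.
       01  COUNTER  PIC 9(1)  VALUE 7.
       PROCEDURE DIVISION.
      * The test runs first, so the body is skipped entirely.
           PERFORM UNTIL COUNTER >= 5
               ADD 1 TO COUNTER
               DISPLAY "Test before, loop number " COUNTER
           END-PERFORM
      * The test runs last, so the body executes exactly once.
           PERFORM WITH TEST AFTER UNTIL COUNTER >= 5
               ADD 1 TO COUNTER
               DISPLAY "Test after, loop number " COUNTER
           END-PERFORM
           GOBACK.
```

Only the second loop produces output. It runs a single time and raises COUNTER from 7 to 8 before the condition is checked.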

Next, PERFORM VARYING is essentially a variation of the traditional for loop. But there are some important differences. COBOL allows the use of three counting fields for the loop, the testing of the condition can be before the loop is performed or after, and the condition for the termination of the loop does not have to involve the counting field.

That’s a lot. So to get an understanding of this structure, let’s look at an example:

DATA DIVISION.
WORKING-STORAGE SECTION.
01 YEAR PIC 9(2) VALUE 0.
01 BALANCE PIC 9(4) VALUE 1000.
PROCEDURE DIVISION.
PERFORM VARYING YEAR FROM 1 BY 1
 UNTIL YEAR > 10
 COMPUTE BALANCE = BALANCE * 1.05
 DISPLAY "Balance is $" BALANCE
END-PERFORM.
GOBACK.

This program shows how the value of a $1,000 investment will grow over a 10-year period. We set BALANCE to 1000, and the loop starts YEAR at 1 and increments it by 1 until it passes 10. For each iteration, a COMPUTE statement increases BALANCE by an interest rate of 5%. Since BALANCE is PIC 9(4), any cents are truncated each year.

What if we set a PIC 9 instead for YEAR? A single digit can hold only 0 through 9, so YEAR could never be greater than 10 and the UNTIL condition would never be met. In fact, this would create an infinite loop. Again, this is why it is extremely important to think through the use of the data in your programs.

Next, with our program, we can do the looping in reverse, with the following changes to the PROCEDURE DIVISION:

PERFORM VARYING YEAR FROM 10 BY -1
 UNTIL YEAR <= 0
 COMPUTE BALANCE = BALANCE * 0.95
 DISPLAY "Balance is $" BALANCE
END-PERFORM

The value of BALANCE will be reduced for 10 years until the value gets to $595.

And you can use the PERFORM VARYING structure to call a subroutine. Let’s take a variation of the prior code sample:

PROCEDURE DIVISION.
100-BALANCE-LOOP.
 PERFORM 200-DEPOSIT-CALC VARYING YEAR FROM 1 BY 1
  UNTIL YEAR > 10
 GOBACK.
200-DEPOSIT-CALC.
 COMPUTE BALANCE = BALANCE * 1.05
 DISPLAY "Balance is $" BALANCE.
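
The same out-of-line form can also handle more than one counting field by adding the AFTER phrase. Here is a sketch that steps through a 2-by-3 grid. The program name, the paragraph names, and the ROW-NUM and COL-NUM fields are hypothetical:

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TWOCOUNT.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical counting fields for a 2-by-3 grid.
       01  ROW-NUM  PIC 9(1)  VALUE 0.
       01  COL-NUM  PIC 9(1)  VALUE 0.
       PROCEDURE DIVISION.
       100-GRID-LOOP.
      * COL-NUM cycles through 1 to 3 for each value of ROW-NUM.
           PERFORM 200-SHOW-CELL
               VARYING ROW-NUM FROM 1 BY 1 UNTIL ROW-NUM > 2
               AFTER COL-NUM FROM 1 BY 1 UNTIL COL-NUM > 3
           GOBACK.
       200-SHOW-CELL.
           DISPLAY "Row " ROW-NUM " Column " COL-NUM.
```

The paragraph 200-SHOW-CELL runs six times, with COL-NUM acting like the inner loop of a nested for loop. The paragraph form is used here because older COBOL standards do not allow AFTER in an inline PERFORM.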

Finally, you can use the THRU command with PERFORM to invoke a set of paragraphs:

PROCEDURE DIVISION.
100-FIRST-PARAGRAPH.
 PERFORM 200-SECOND-PARAGRAPH THRU 400-FOURTH-PARAGRAPH
 GOBACK.
200-SECOND-PARAGRAPH.
 DISPLAY 'Paragraph 2'.
300-THIRD-PARAGRAPH.
 DISPLAY 'Paragraph 3'.
400-FOURTH-PARAGRAPH.
 DISPLAY 'Paragraph 4'.

As you can see, PERFORM THRU will execute 200-SECOND-PARAGRAPH, 300-THIRD-PARAGRAPH, and 400-FOURTH-PARAGRAPH, in this order.

Note

IBM has great resources on COBOL. Check out their Programming Guide as well as the Language Reference.

Conclusion

We have certainly covered a lot in this chapter. You’ve learned the main types of structures you need to know to build or edit a COBOL program. To be a successful programmer, you do not need to know the whole command set. Some commands are duplicative, and others are rarely used.

While COBOL has many similarities to modern languages, there are still some major differences. And yes, the language can be wordy, but this is by design.

In this chapter, we were able to cover the types of math you will need to know, whether through the use of commands like ADD and SUBTRACT or the use of COMPUTE, which provides more versatility. Then we covered how to employ conditionals in COBOL programs with the IF/THEN/ELSE structure. We also looked at more sophisticated decision statements like EVALUATE.

Finally, we saw how to use loops, such as with the PERFORM command. We also showed how to use this to enforce structured programming approaches.

In the next chapter, we’ll look at file handling.
