Add Power and Flexibility to Your Programs
By Roger Pence
Can you imagine changing a flat tire on your car without a jack? It could be done-but it would take a lot of brute strength or an extremely clever use of leverage.
That's the way tables and arrays are in RPG. You could probably use brute force or cleverness to code around a problem which has an obvious table/array solution. But, the table/array solution would certainly be easier, require less code, and be easier to maintain.
In the next two issues, we'll take a look at tables and arrays. In this issue, we'll cover the basics of tables and arrays. Our basic coverage will include the differences between tables and arrays, referencing the data in tables and arrays, and loading data into tables and arrays. Next month we'll take a more advanced look at tables and arrays, looking at array-specific operation codes, ways to search tables and arrays, ways to use arrays for string manipulation, and several other tips and techniques for array/table processing.
What Are Tables and Arrays?
A table or an array is an organized arrangement of entities with similar qualities. An egg crate, holding a dozen eggs, is an array of eggs. To the cook, the egg crate offers easy access to one egg out of a dozen, or to the entire dozen. To the RPG programmer, an array is an egg crate full of data-offering easy access to one piece of data out of many, or to the entire group of data.
Each piece of data in an array or table is called an element. An RPG table or array can have up to 9,999 elements. It sometimes helps to understand tables and arrays by thinking of their elements as rows and columns. In RPG, tables and arrays are limited to single-dimensional arrays-one array can represent one row and many columns or one column and many rows. Next month's article will show a work-around for simulating multiple-dimensioned arrays-many rows and many columns. Given this RPG array limitation, the egg crate analogy breaks down a little because an egg crate really represents a multiple-dimensioned array (two rows, each with six elements). The array below is an example of a 5-element RPG array representing one row with five columns (note the fifth element is being used to store the sum of the first four):
Element   #1   #2   #3   #4   #5
Value      6   10   12    8   36
When your program is using many similar pieces of data, using tables and arrays will almost always result in much less code to manipulate that data. And, requiring less code means the program will be easier to write, debug, and maintain. We'll discuss op codes specific to arrays later, but to give you an example of how powerful arrays are, look at the code below to see how easy it would be to put the sum of the first four elements of the array above into the fifth element (this code assumes the array is named NUM and that the fifth element has a value of zero):
C XFOOTNUM NUM,5
Note the ',5' following NUM in the result field above. This is called array indexing and is used to reference an individual element of an array. The line of RPG code above says, "Sum the values (crossfoot in accountantese-that's where the clever op code name comes from) in array NUM and put the result in the 5th element of NUM." We'll discuss array indexing and array specific op codes in more detail later.
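For readers who think more easily in a general-purpose language, the effect of that XFOOT line can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not RPG. The function name xfoot_into is made up for illustration, and Python lists are indexed from 0 where RPG arrays are indexed from 1.

```python
def xfoot_into(array, index):
    """Model of XFOOT: sum every element of `array` into element `index` (1-based).

    The target element is itself part of the array being summed, which is why
    the article assumes it starts out as zero.
    """
    total = sum(array)          # all elements, including the target
    array[index - 1] = total    # RPG indexes from 1, Python from 0
    return array


num = [6, 10, 12, 8, 0]         # the five-element array from the example
xfoot_into(num, 5)
print(num)                      # [6, 10, 12, 8, 36]
```

If the fifth element had held a leftover value instead of zero, that value would have been added into the total as well.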
Each element of an RPG table or array:
* can either be all numeric or all alphameric strings, but not a mix of both.
* must be the same length.
* must (if the array is a numeric array) have the same number of decimal places.
Also note that RPG does not directly permit the contents of an array element to be a data structure or another array. In next month's article, we'll take a look at a couple of ways around this constraint.
RPG also provides support for related (or sometimes called alternating) tables and arrays. Many times, two tables or two arrays work very closely with each other. That is, one element of one will provide additional information about the corresponding element in the other. For example, one table might contain state abbreviations, while a related table contains sales tax percentages for the corresponding state. RPG allows one array to be related to another array, or one table to be related to another. But, tables cannot be related to arrays, and conversely, arrays cannot be related to tables.
As we'll see shortly, tables and arrays are defined to the RPG program using the E spec. The E spec defines the size of the table or array, the attributes of its elements, how the array is loaded, and if the table or array has a related table or array. We'll review specific parts of the E spec and how it relates to the task at hand as we go along. For a complete review of the E spec, see Chapter 22 of IBM's Programming with RPGII manual.
What's the Difference Between Tables and Arrays?
In RPG, tables and arrays have very similar qualities. So similar in fact, that in some cases, using one is just as good as using the other. But the following differences do apply:
Array processing allows each array element to be referenced and worked with separately, or you can work with all elements at once. Tables, on the other hand, require that you work with each table element separately-you cannot reference or process all elements of a table at once.
Array and table names can be up to six characters long and follow the rules for regular field names, with the exception that table names must begin with the characters TAB, and array names must not start with TAB. You'll also see shortly that it is generally best to limit array names to four characters.
Unlike arrays, tables cannot be loaded with data while the program is running-table data must be loaded as the program is being compiled or as the program is being loaded (more on loading tables and arrays with data in a moment).
You'll probably use arrays more than you use tables. Arrays are handy for storing and manipulating often-changing information, while tables are generally more suited for use as look-up devices for generally static information.
Referencing an Individual Array Element
Each individual element of an array may be directly referenced and processed, or the entire array may be processed (thereby doing something with all the elements at once). The mechanism provided to reference individual elements of an array is called array indexing.
To code an array index, add a comma and the array index immediately after the array name. Indexing can be done directly, using an explicit reference to the appropriate element, with a whole number, as shown in the example below:
C XFOOTNUM NUM,5
or it can be done indirectly, using another field to vary the reference to the appropriate element:
C Z-ADD5 X
C XFOOTNUM NUM,X
Array indexing is a big part of what gives array processing its power. Using a simple loop and indirect indexing, a lot of numbers can be very quickly crunched and with little code. For example, assume you needed to check every element of a 500 element, numeric array-performing some code for only those elements that have a value greater than zero. Here's the code (remember, in this construction that RPG will automatically increment the value of X upon each iteration of the loop):
C DO 500 X
C NUM,X IFGT 0
C..PERFORM A SUBROUTINE
C END
C END
When you use indirect array indexing, you are responsible to see that the index value is always within bounds. The index must always be greater than 0 but never greater than the number of elements in the array being referenced. Also be sure the indirect index field is declared to be a numeric field with no decimal places. Out-of-bounds array indexing errors can be very devilish to find and often lay dormant for a long time before the right set of circumstances pushes the index out of bounds. Code your indirect array indexes defensively!
Do you know where the best example of an infinite loop can be found? On the back of a shampoo bottle: Rinse, Lather, and Repeat. Now, most of us know to break out of that loop when our hair is clean. If only RPG were that smart. When you code loop processing as shown above, you must ensure that the size of the index is large enough to hold the loop's maximum value. If X had been defined above as a 2 digit integer, the loop would never finish. (In RPG, if X is defined as two digits and its value is 99, adding 1 to it makes it 0!)
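The wrap-around is easy to demonstrate outside RPG. The Python sketch below is a model only: it assumes, as described above, that the overflow digit of a too-small field is simply dropped, and the names increment and loop_reaches_limit are invented for this illustration. The max_steps cap exists only so the sketch stops instead of looping forever the way the real program would.

```python
def increment(value, digits):
    """Add 1 to a numeric field that holds `digits` digits, dropping any overflow."""
    return (value + 1) % (10 ** digits)


def loop_reaches_limit(limit, digits, max_steps=10_000):
    """Return True if an index of the given size can count from 1 up past `limit`.

    max_steps is a safety cap for this sketch only.
    """
    x = 1
    for _ in range(max_steps):
        if x > limit:
            return True          # the loop would end normally
        x = increment(x, digits)
    return False                 # the index wrapped and never passed the limit


print(increment(99, 2))              # 0
print(loop_reaches_limit(500, 3))    # True
print(loop_reaches_limit(500, 2))    # False
```

A three-digit index gets past 500 and the loop ends. A two-digit index keeps cycling through 0 to 99 and never does.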
As if RPG's six character limit on field names were not enough of a constraint, an array name should generally never be more than four characters long. Anything longer than four characters would not allow room for array indexing in those RPG specs that provide only six characters for a field name (primarily the result field in the calc specs and output field in the output specs). Remember too, that an array name must not start with TAB. Given the six spaces available for indexed array names, the index name must also be arbitrarily short (generally one or two characters maximum).
Referencing an Entire Array
As previously mentioned, not only can an individual element of an array be processed, but an entire array can also be processed at once. For example, you might want to set every element of a numeric array to zero. Now, certainly, you could use loop processing to do that, but there is a better way:
C Z-ADD0 NUM
What could be easier than that! When we discuss op codes and how they work with arrays and tables, we'll see other examples of manipulating all elements of an array.
Referencing an Element of a Table
Individual elements of a table are not referenced with indexing; they are referenced with help from the LOKUP op code. An "active" element is maintained for each table. Until a search is performed with LOKUP, the first element in the table is in this active element. After a successful search, the value found by the search is moved into this active element. If a search is not successful, the value in the active element is what it was prior to the search.
Using the LOKUP operation with tables:
* factor 1 defines the "search word", the value being looked up.
* factor 2 defines the table name in which the value is being looked up.
* and the result field defines an optional, related table. Factor 1, the search word, must have the same attributes as the elements of the table named in factor 2.
How the look-up is performed is based on the use of the resulting indicators that must be used with LOKUP.
* To find a value in the table that is equal to the search word, use an indicator in columns 58-59.
* To find a value in the table that is less than the search word, use an indicator in columns 56-57.
* To find a value in the table that is greater than the search word, use an indicator in columns 54-55.
The specified indicator is turned on if the search was successful, otherwise it will be turned off. At least one of the above conditions, and no more than two, must be used to condition the LOKUP op code.
Using LOKUP also works with arrays, and we'll see more interesting ways of using it later when we discuss the various other ways to search arrays and tables.
Loading Tables and Arrays with Data
Data can be loaded into a table or array in one of three ways:
* it can be loaded implicitly from statements in the source code (compile time loading).
* it can be loaded implicitly from a disk file prior to program run-time (preexecution time loading).
* it can be loaded explicitly with your RPG code from the calc or input specs (execution time loading)
The first two methods work for tables and arrays, while the third only works for arrays. Let's take a closer look at each.
Compile Time Loading of Tables and Arrays
Compile-time tables or arrays are often used for error messages, table-lookups, and other, generally static purposes. They are loaded with data that is included in the source code of the program. The actual contents of the array or table are pulled from the source code and included in the load member as the program is compiled.
Compile-time table and array data always follows the last output spec in your program and is always preceded by a single line with two asterisks beginning in position 1. More than one compile-time table or array can be used in one program; just be sure to start each table's or array's data with the two-asterisk statement, and be sure to place the data in the same order in which the tables and/or arrays are defined in the E specs.
Program ARR1, in Figure 1, shows a program using two compile time arrays. ARR1's E spec says:
* that array AMS (columns 27-32) has one entry per record (columns 33-35; note that a record here refers to a line of source code) and has four elements in it (columns 36-39), each of which is 20 characters long (columns 40-42). This is an alphameric array because no decimal places are defined in column 44.
* that array AMG (columns 27-32) also has one entry per record (columns 33-35), and has four elements in it (columns 36-39), each of which is 35 characters long (columns 40-42).
The data that fills these arrays is coded at the end of the source code, after the last output spec. Note how each E spec's array data is preceded by two asterisks. Also note that this line with two asterisks can be used as a comment line. In programs with several compile time tables or arrays, comments here really come in handy.
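The pairing of data blocks with E specs can be pictured with a short Python model. This is a sketch of the rule just described, not of how the RPG compiler works internally. The function load_compile_time is a made-up name, and the data is the compile-time data from program ARR1.

```python
SOURCE_TAIL = """\
** AMS-THE SINGER
DICKY DOO
SAM THE SHAM
LITTLE STEVEN
RONNIE
** AMG-THE GROUP
AND THE DON'TS
AND THE PHAROAHS
AND THE DISCIPLES OF SOUL
AND THE DAYTONAS
"""


def load_compile_time(source_tail, array_names):
    """Split '**'-delimited data blocks and hand them out in E spec order."""
    blocks = []
    for line in source_tail.splitlines():
        if line.startswith("**"):
            blocks.append([])               # a '**' line starts the next array's data
        elif blocks:
            blocks[-1].append(line.rstrip())
    # The first block goes to the first array named, the second to the second.
    return dict(zip(array_names, blocks))


arrays = load_compile_time(SOURCE_TAIL, ["AMS", "AMG"])
x = 1                                        # index value keyed by the user
print(arrays["AMS"][x - 1], arrays["AMG"][x - 1])   # DICKY DOO AND THE DON'TS
```

Swapping the two data blocks without swapping the E specs would load the group names into AMS and the singers into AMG, which is why the order matters.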
To see the program in Figure 1 in action, load it with the OCL:
// LOAD ARR1
// RUN
You will be prompted for an index number. Enter a number between one and four, press field exit, and watch ARR1 display the lead singer with his group (do you remember Dicky Doo and the Don'ts?). Use 999 to end the program.
To see what happens when an array index is out of bounds, enter a number greater than four when prompted for the index value. A message will be displayed that says the array index is out of bounds. This is a run-time error you must avoid. If a zero is taken to this message, the index is reset to one and all kinds of unpredictable things can happen. Test! Test! Test! and write your code defensively.
Using compile-time arrays and tables makes referencing text and numeric data easy because the data actually becomes a part of the program. But this compile-time data increases the size of your program, and changing it requires recompiling your program. Consider these trade-offs as you use compile-time tables and arrays.
Preexecution-Time Table and Array Loading
Preexecution tables and arrays are loaded by the RPG program, from a disk input file, just before the program starts running. These tables and arrays are most often used to pull in generally static data from small disk files for repeated look-ups. This method does not require any RPG code to load the data into the table or array. Rather, the data is loaded, from the file into the table or array, automatically.
The disk file containing the preexecution data must be defined as an Input Table file on its F spec (IT in columns 15-16). Although I have used indexed files for this purpose, the RPG manual says these files must be sequential files. So, it's probably more reliable to stick with the manual's advice and use only sequential files for this purpose.
Consider a demographic application that needs to know the approximate square miles of a given state. Rather than performing disk I/O each time the square miles is required, consider loading the data into a preexecution time table. Then, with the state abbreviation, use the LOKUP opcode to look up the square miles as needed. Because this type of data will not change very often and has a finite size, it is an example of an application very well suited to preexecution table processing. By eliminating disk I/O, this technique is often especially helpful for those applications that use repeated, disk-intensive look-ups.
Program ARR2, in Figure 2, shows a program using a preexecution-time table. Its E spec defines two alternating tables, loaded with data from the sequential file, ARRTEST1 (see the sidebar for help in making the test file ARRTEST1). This E spec says:
* that disk file ARRTEST1 will be used to fill two tables, TABST and TABSM (columns 27-32 and 46-51).
* that each record of ARRTEST1 contains one element of each table (columns 33-35).
* to reserve space for a maximum of 50 elements in each table (columns 36-39).
* that each element of table TABST is two characters long (columns 40-42).
* and that each element of TABSM is five digits long with no decimal places (columns 52-54 and column 56).
If columns 36-39 don't allocate enough entries to hold all the data in the disk file, an error message is displayed. Code the number of entries for preexecution-time tables very defensively-being sure to allow enough entries for the number of table elements that will be read.
To run ARR2 (after preparing the test file, ARRTEST1), use the OCL:
// LOAD ARR2
// FILE NAME-ARRTEST1
// RUN
After prompting you for a state abbreviation, ARR2 uses the LOKUP op code to find that abbreviation in the TABST table. If the abbreviation is found (if 31 is on) the corresponding square miles from the TABSM table is displayed, otherwise a "not found" message is displayed. End the program by entering two X's instead of a state abbreviation.
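What ARR2 does can be modeled in Python as a rough sketch. Nothing here is RPG: load_related_tables and lokup_equal are hypothetical names, the records are the sidebar's test data for ARRTEST1, and raising an error when there are too many records merely stands in for the error message RPG displays.

```python
RECORDS = ["IN36291", "MI58216", "OH41222", "IL56400"]   # sidebar test data
MAX_ENTRIES = 50                                          # columns 36-39 of the E spec


def load_related_tables(records, max_entries):
    """Load two related tables: a 2-character state, then 5-digit square miles."""
    if len(records) > max_entries:
        raise ValueError("more records than table entries")
    tabst = [rec[0:2] for rec in records]        # data starts in column 1
    tabsm = [int(rec[2:7]) for rec in records]   # the related entry follows directly
    return tabst, tabsm


def lokup_equal(search, table, related, active):
    """Model an equal LOKUP. Returns (indicator, active).

    `active` is the pair of current table values. It is replaced only
    when the search succeeds.
    """
    for entry, partner in zip(table, related):
        if entry == search:
            return True, (entry, partner)
    return False, active


tabst, tabsm = load_related_tables(RECORDS, MAX_ENTRIES)
active = (tabst[0], tabsm[0])      # before any search, the first element is active
found, active = lokup_equal("OH", tabst, tabsm, active)
print(found, active)               # True ('OH', 41222)
found, active = lokup_equal("ZZ", tabst, tabsm, active)
print(found, active)               # False ('OH', 41222)
```

The second search fails, so the indicator is off and the active values are still those from the last successful search. That is why ARR2 conditions its output on indicator 31.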
One of the big drawbacks of using preexecution-time tables and arrays is that you don't get to define where in the record the data is located-the data must always start in column 1 and proceed sequentially from there. Therefore, it almost always takes a specially prepared disk file to load preexecution-time tables and arrays. Many times though, the overhead savings is worth it-just be aware that you can't just take any old file and use it to load preexecution-time tables and arrays.
Loaded From Input or Calculation Specs (Execution Time)
Arrays can also be loaded with data from within your RPG program. But, unlike the previous two methods, execution time arrays are not implicitly loaded with data for you. You must take explicit steps, in the input or calc specs, to load data into the array. This can be done by defining an array or an array element in the I spec, or by moving data into the array (or array element) in the calc spec. Remember that this method will not work for tables.
Program ARR3, shown in Figure 3, shows one example of loading arrays using execution time processing. Its E spec says:
* that array ANM has four elements (columns 36-39), each of which is five digits long with no decimal places (columns 40-42 and column 44).
* and that array ATL also has four elements, each of which is also five digits with no decimal places.
Note the number of entries per record, in columns 33-35 is not used with execution time arrays.
Program ARR3 loads all four elements of ANM by virtue of reading records in the test file, ARRTEST2 (see the sidebar for help in making ARRTEST2). In ARRTEST2 records, there are four 5-digit, no decimal numbers that start in position 6. Much like the way an input spec defines a field, line 9 of the input spec defines the array ANM, beginning in position 6 and ending in position 25. When a record is read, the elements in ANM are made available to the program. When a different record is read, its array values take the place of the previous record's values (the same way the value of field NAME would replace its previous value when a new record is read).
When arrays are loaded like this, it's important the array defined on the input specs not be larger than the array declared in the E specs. The length of the array defined in the I spec must be equal to, or less than the length of the array as it is defined in the E spec. In the example, the E spec says that ANM has four 5-digit elements. Therefore, the input spec must define no more than 20 bytes (4*5) as array ANM. Packed and binary disk file data can also be loaded into an array in this fashion, but once again, make sure the input specs match the array definition.
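The record layout can be pictured with a small Python model. It is a sketch of the input spec's effect, not RPG. The function name read_record and its parameters are invented for illustration, and the record is the first line of the sidebar's test data for ARRTEST2.

```python
def read_record(record, elements=4, length=5, start=6):
    """Split one record into NAME (positions 1-5) and the ANM elements.

    `start` is the 1-based position of the first element, as on the I spec.
    """
    name = record[0:5]
    end = start - 1 + elements * length        # 0-based end of the array area
    area = record[start - 1:end]               # positions 6-25 for the defaults
    anm = [int(area[i:i + length]) for i in range(0, len(area), length)]
    return name, anm


name, anm = read_record("Sales00352007560064500879")
print(name, anm)        # Sales [352, 756, 645, 879]
```

Four elements of five digits each occupy exactly the 20 positions from 6 through 25, which matches the 4*5 arithmetic above.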
To run ARR3 (after preparing the test file, ARRTEST2), use the OCL:
// LOAD ARR3
// FILE NAME-ARRTEST2
// RUN
ARR3 will display array elements from the records in ARRTEST2, and then display the totals of those records. Press ENTER to end the program.
After a record is read from ARRTEST2, the array element values are accumulated in the array ATL. This is done with the one instruction in line 12. Many calc spec op codes allow you to work with an entire array or with individual elements of the array. In this example, line 12 causes every element of ANM to be added to the corresponding elements in ATL. After all records from ARRTEST2 have been accumulated and displayed, the totals accumulated in ATL are displayed.
When the ANM elements are displayed (in lines 17-20), the output specs explicitly define the ending position of each element. But, when the ATL array is displayed (in line 23), ARR3 demonstrates an optional way of outputting array elements. When an array name-with no index-is specified for output, and when an edit code is used, RPG automatically separates each element with two spaces. This little feature really makes coding the output specs on those 12-month spreadsheets easy. If no edit code is specified, all array elements are output with no intervening spaces.
Program ARR4, in Figure 4, shows a time- and code-saving twist that array processing provides for you. In ARR3, the elements of ATL were calculated and displayed vertically (the columns were totaled), but the horizontal totals (the totals of the rows) were not calculated.
To see column and row totals, run program ARR4 with the OCL:
// LOAD ARR4
// FILE NAME-ARRTEST2
// RUN
By adding a "row total" element in ANM and ATL (notice how both have been increased from four elements to five in the E specs in lines 5 and 6), a place has been added to accumulate the missing "row" totals. As previously mentioned, RPG doesn't bark because it's OK for the E spec array to be larger than the array defined on the I spec.
When a record from ARRTEST2 is read, the elements of ANM are filled as they were in program ARR3, but no fifth element is read. Line 12 then uses the previously mentioned XFOOT op code to sum the elements of ANM into the fifth element of ANM. Although ANM,5 should never have a value when read from ARRTEST2, line 11 sets it to zero, just in case. The program works without this line-but whenever an array is XFOOT'ed into an element of itself, this defensive code is probably worthwhile. To keep ARR4 short, both arrays are output with the previously mentioned array output shortcut.
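To check the arithmetic ARR4 performs, here is a Python model of its calculations run against the sidebar's test data. It is only a sketch: the variable names mirror the RPG arrays, the comments point at the line numbers in Figure 4, and the dictionary ROWS simply stands in for the two records of ARRTEST2.

```python
ROWS = {
    "Sales": [352, 756, 645, 879],
    "Wages": [175, 443, 381, 615],
}

atl = [0, 0, 0, 0, 0]                        # four column totals plus a grand total
detail = {}
for name, values in ROWS.items():
    anm = values + [0]                       # the fifth element is never read from the file
    anm[4] = 0                               # line 11: defensive Z-ADD0 into ANM,5
    anm[4] = sum(anm)                        # line 12: XFOOT ANM into ANM,5
    atl = [t + a for t, a in zip(atl, anm)]  # line 13: ADD ANM ATL, element by element
    detail[name] = anm

print(detail["Sales"])   # [352, 756, 645, 879, 2632]
print(detail["Wages"])   # [175, 443, 381, 615, 1614]
print(atl)               # [527, 1199, 1026, 1494, 4246]
```

The fifth element of ATL ends up as both the sum of the row totals and the sum of the column totals, which is a handy cross-check on the whole report.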
Summary
And there you have it, the basics. This is just the beginning. Check back next month when we cover several more advanced table and array processing techniques.
Roger Pence is a System/36 programmer for Custer Lumber in Marion, Indiana. He also works as an independent consultant for the System/36 and PC networks. Roger has been programming on the System/34/36 for eight years and can be reached at (317) 674-3384.
Creating the Sample Program Files
To run a couple of the sample programs in this article, you'll need to create two test files. The procedure, MAKEARRF, shown below, makes two sequential, 80-byte record length disk files from two source members you will create.
With MAKEARRF's help, you can very easily enter test data into a source member and then very quickly convert that source member into a disk file.
MAKEARRF will create two disk files named ARRTEST1 and ARRTEST2. Be careful that you do not already have files with these names on disk-they WILL BE DELETED before the procedure runs.
First key in MAKEARRF. Then, to make the test files that the sample programs need, enter the following 4 lines into a source member named WK1 in one of your libraries (without serializing line numbers):
        ....+....1
line 1  IN36291
line 2  MI58216
line 3  OH41222
line 4  IL56400
Create a second source member with the name WK2 that contains the following two lines (again, don't serialize):
        ....+....0....+....0....+
line 1  Sales00352007560064500879
line 2  Wages00175004430038100615
After creating both members, run:
MAKEARRF (your library name)
The test files, ARRTEST1 and ARRTEST2, are now on disk and ready for the sample programs to use.
Procedure Member MAKEARRF
// IF DATAF1-ARRTEST1 DELETE ARRTEST1,F1
// IF DATAF1-ARRTEST2 DELETE ARRTEST2,F1
// LOAD $MAINT
// FILE NAME-WK1,UNIT-F1,RECORDS-250,RETAIN-J
// FILE NAME-WK2,UNIT-F1,RECORDS-250,RETAIN-J
// RUN
// END
COPYDATA WK1,,ARRTEST1,,,,,,OMIT,1,EQ,'/'
COPYDATA WK2,,ARRTEST2,,,,,,OMIT,1,EQ,'/'
Working With Arrays and Tables, Part I
Figure 1: Program ARR1
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
0001 H 064 Col 75> ARR1
0002 FCONIN IP 79 79 KEYBORD
0003 FCONOUT O 79 79 CRT
0004 E AMS 1 4 20
0005 E AMG 1 4 35
0006 C 'INDEX ' KEY X 30
0007 C X IFEQ 999
0008 C SETON LR
0009 C END
0010 C NLR EXCPT@@DISP
0011 OCONOUT E 2 @@DISP
0012 O AMS,X 20
0013 O AMG,X 55
** AMS-THE SINGER
DICKY DOO
SAM THE SHAM
LITTLE STEVEN
RONNIE
** AMG-THE GROUP
AND THE DON'TS
AND THE PHAROAHS
AND THE DISCIPLES OF SOUL
AND THE DAYTONAS
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
Figure 2: Program ARR2
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
0001 H 064 Col 75> ARR2
0002 FCONIN IP 79 79 KEYBORD
0003 FCONOUT O 79 79 CRT
0004 FARRTEST1IT F 80 80 EDISK
0005 E ARRTEST1 TABST 1 50 2 TABSM 5 0
0006 C 'STATE ' KEY STATE 2
0007 C STATE IFEQ 'XX'
0008 C SETON LR
0009 C END
0010 C NLR STATE LOKUPTABST TABSM 31 D-EQ
0011 C NLR EXCPT
0012 OCONOUT E 2
0013 O 31 TABST 5
0014 O 31 TABSM J 35
0015 O N31 35 'NO RECORD FOUND'
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
Figure 3: Program ARR3
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
0001 H 064 Col 75> ARR3
0002 FCONIN IP 79 79 KEYBORD
0003 FCONOUT O 79 79 CRT
0004 FARRTEST2IF 80 80 DISK
0005 E ANM 4 5 0
0006 E ATL 4 5 0
0007 IARRTEST2NS
0008 I 1 5 NAME
0009 I 6 25 ANM
0010 C READ ARRTEST2 LR
0011 C NLR EXCPT@@DTL
0012 C NLR ADD ANM ATL
0013 C LR EXCPT@@TOT
0014 C LR 'WAIT...' KEY KP 1 WAIT FOR KEYPRESS
0015 OCONOUT E 1 @@DTL
0016 O NAME 5
0017 O ANM,1 L 16
0018 O ANM,2 L 24
0019 O ANM,3 L 32
0020 O ANM,4 L 40
0021 OCONOUT E 1 @@TOT
0022 O 5 'TOTAL'
0023 O ATL L 40
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
Figure 4: Program ARR4
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70
0001 H 064 Col 75> ARR4
0002 FCONIN IP 79 79 KEYBORD
0003 FCONOUT O 79 79 CRT
0004 FARRTEST2IF 80 80 DISK
0005 E ANM 5 5 0
0006 E ATL 5 5 0
0007 IARRTEST2NS
0008 I 1 5 NAME
0009 I 6 25 ANM
0010 C READ ARRTEST2 LR
0011 C NLR Z-ADD0 ANM,5
0012 C NLR XFOOTANM ANM,5
0013 C NLR ADD ANM ATL
0014 C NLR EXCPT@@DTL
0015 C LR EXCPT@@TOT
0016 C LR 'WAIT...' KEY KP 1 WAIT FOR KEYPRESS
0017 OCONOUT E 1 @@DTL
0018 O NAME 5
0019 O ANM L 48
0020 OCONOUT E 1 @@TOT
0021 O 5 'TOTAL'
0022 O ATL L 48
1...+...10....+...20....+....30....+....40....+....50....+....60....+....70