Do you need to retrieve data from variable-length fields of deleted records?
The data of variable-length fields of records deleted from a file can be retrieved provided that the inserts and updates performed in the file after the records were deleted didn't affect the variable part of the file.
In the first part of this article, we saw that, when a physical file is saved to a save file, the fixed and the variable parts of every record—that is, its records A and B—are stored separately in areas A and B of the save file. We also described the format of records A and B and saw that the data stored in area A of the save file could be retrieved for the deleted records.
In the second part of the article, we will see how the system manages the data of the variable part of the records and determine under which conditions this data can be retrieved for the deleted records.
Effect of File Operations on Records B
When a record is deleted from a file, the only field changed in its corresponding record B is RRN1, which is set to 0 (see section "Format of Records B" in the first part of the article). In addition, the space taken up by this record B is marked as available so that it can be reused to store data for another record of the file. This space is marked as available by adding an item containing its length, segment number, and segment offset to an array located in an area of the save file that we will call area 14. As the only change that needs to be done in area B when a record is deleted is to set field RRN1 to 0, the system can delay this change until another access to area B is requested.
When a record of a file is updated, the new values that are to be stored in area B are kept in the current record B if it's long enough to hold them. If the new values don't fit into this record B, the system looks for an available record B (that is, a record B whose length and location are stored in area 14) long enough for the new values. If such a record B is found, the system reuses it to store the new values. If it's not found, a new record B is added at the end of area B. Finally, if the original record B could not be used to store the new values, its field RRN1 is set to 0 and the space it takes up is marked as available.
For example, let's insert the two following records into physical file FILE2:
A R RFILE2
A FLD1 1A
A FLD2 30A VARLEN
A FLD3 30A VARLEN
A FLD4 1A
insert into QTEMP/FILE2 values
('1', 'AAA', '12345', '0'),
('2', 'BBB', '12345', '0')
The two corresponding records B are the following:
Area B (save file record 57):
Bk___ DataBk nnnn RRN wk AccumLenk
E EndOfRecord
RRN From To
*...+....1....+....2....+....3....+....4....+....5 57 1 50
© u 4ze¦ AAA12345
01000B0A2FA86000000000000000000000000000CCCFFFFF00
030834040495A0000100000001000000000103081111234500
B1_B2___
nnnnw1w2 nn
....+....6....+....7....+....8....+....9....+....0 51 100
Ë BBB12345 Ë
00700000000CCCFFFFF0000700000000000000000000000000
01300020308222123450002300000000000000000000000000
B1_B2___
nnEnnnnw1w2 nnnnE
In the following update, the new values for fields FLD2 and FLD3 fit into the current record B:
update QTEMP/FILE2
set FLD2 = 'NEWAAA', FLD3 = '7'
where FLD1 = '1'
Area B:
*...+....1....+....2....+....3....+....4....+....5
© u 4ze¦ NEWAAA75
01000B0A2FA86000000000000000000000000000DCECCCFF00
030834040495A0000100000001000000000106075561117500
....+....6....+....7....+....8....+....9....+....0
Ë BBB12345 Ë
00700000000CCCFFFFF0000700000000000000000000000000
01300020308222123450002300000000000000000000000000
When a record is deleted, the field RRN1 of its corresponding record B is set to 0 and the space taken up by this record B is marked as available:
delete from QTEMP/FILE2
where FLD1 = '1'
Area B:
*...+....1....+....2....+....3....+....4....+....5
© u 4ze¦ NEWAAA75
01000B0A2FA86000000000000000000000000000DCECCCFF00
030834040495A0000100000001000000000006075561117500
....+....6....+....7....+....8....+....9....+....0
Ë BBB12345 Ë
00700000000CCCFFFFF0000700000000000000000000000000
01300020308222123450002300000000000000000000000000
Area 14 (save file record 65):
ll length ss segment number oooo segment offset
*...+....1....+....2....+....3....+....4....+....5
m u 4ze¦
0100090A2FA860000000000000000000010000020000000000
040834040495A0000100000000000000050100000000000000
llssoooo
In the next case, the new values for the record don't fit into its record B and the only record B available is not long enough for the new values, so a new record B is added at the end of area B:
update QTEMP/FILE2
set FLD2 = 'NEWBBB'
where FLD1 = '2'
Area B:
*...+....1....+....2....+....3....+....4....+....5
© u 4ze¦ NEWAAA75
01000B0A2FA86000000000000000000000000000DCECCCFF00
030834040495A0000100000001000000000006075561117500
....+....6....+....7....+....8....+....9....+....0
Ë BBB12345 Ë NEWBBB12345 Ë
00700000000CCCFFFFF0000700000000DCECCCFFFFF0000700
0130000030822212345000230002060B556222123450002300
Area 14:
*...+....1....+....2....+....3....+....4....+....5
m u 4ze¦
0100090A2FA860000000000000000000010000020100000300
040834040495A0000100000000000000050100000501000500
Let's delete the record we have just updated:
delete from QTEMP/FILE2
where FLD1 = '2'
Area B:
*...+....1....+....2....+....3....+....4....+....5
© u 4ze¦ NEWAAA75
01000B0A2FA86000000000000000000000000000DCECCCFF00
030834040495A0000100000001000000000006075561117500
....+....6....+....7....+....8....+....9....+....0
Ë BBB12345 Ë NEWBBB12345 Ë
00700000000CCCFFFFF0000700000000DCECCCFFFFF0000700
0130000030822212345000230000060B556222123450002300
Area 14:
*...+....1....+....2....+....3....+....4....+....5
m u 4ze¦
0100090A2FA860000000000000000000010000020100000301
040834040495A0000100000000000000050100000501000508
....+....6....+....7....+....8....+....9....+....0
[
00000400000000000000000000000000000000000000000000
01000A00000000000000000000000000000000000000000000
When we insert the following record, the system reuses an available record B to store the fields FLD2 and FLD3. Note that the item in area 14 corresponding to this record B is initialized to 0.
insert into QTEMP/FILE2 values
('3', 'CCC', '123456', '0')
Area B:
*...+....1....+....2....+....3....+....4....+....5
© u 4ze¦ NEWAAA75
01000B0A2FA86000000000000000000000000000DCECCCFF00
030834040495A0000100000001000000000006075561117500
....+....6....+....7....+....8....+....9....+....0
Ë BBB12345 Ë CCC12345645 Ë
00700000000CCCFFFFF0000700000000CCCFFFFFFFF0000700
01300000308222123450002300030309333123456450003300
Area 14:
*...+....1....+....2....+....3....+....4....+....5
m u 4ze¦
0100090A2FA860000000000000000000010000020100000300
040834040495A0000100000000000000050100000501000500
....+....6....+....7....+....8....+....9....+....0
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
Area 14
Area 14, as it happens for areas A and B, consists of the positions 1 to 512 of a group of consecutive records of the save file. From its position 33, it stores an array of items 8 bytes long containing the length and location of records B whose space has become available and can be reused by the system. Figure 1 shows the format of these items:
Figure 1: This is the format of an item of area 14.
When an available record B is reused by the system to store data for a record inserted or updated in the file, its corresponding item in area 14 is initialized to 0.
To identify the record of the save file where area 14 begins, we'll use the fact that positions 7 to 18 of every segment of every area of a given member are the same. Beginning at the record of the save file that follows the first record of area B, we will look for a record of the save file whose position 2 is equal to x14 and whose positions 7 to 18 are equal to positions 7 to 18 of the first segment of area B.
The end of area 14 is determined by the beginning of the following area. This area begins at the following record of the save file whose position 2 is equal to x15 and whose positions 7 to 18 are equal to positions 7 to 18 of the first segment of area B.
Area 14 seems to have a maximum size, at least in many cases. Currently, this maximum size is 16MB—that is, one segment—so area 14 can store up to (16MB – 32 bytes) / 8 = 2,097,148 items. If this number of items is reached, the system will ignore the requests to store new items in area 14 until there is some free space—that is, until some of the available records B are reused by the system and their corresponding items in area 14 are initialized to 0.
Retrieving Data from Records B
The record B that an item of area 14 points to can be found in the record of the save file and at the position that is shown in Figure 2:
Figure 2: These are the record of the save file and the position where record B begins.
Field RRN2 of record B allows us to identify the corresponding record of the physical file and the corresponding record A in area A. Field Status of this record A will tell us whether the record of the physical file was deleted or not.
As we saw in section "Effect of File Operations on Records B", there can be several records B containing data for the same record of the physical file, one of them corresponding possibly to the insert of the record in the file and the others to some of the updates of the record. Taking into account how the system manages records B, we can get to the conclusion that the record B containing the latest data is the longest one.
In order to retrieve the data of the fields stored in area B for a record R that has been deleted from a file, we need to be certain that the latest record B—that is, the record B containing the latest data of record R—is present in area B. If some other record has been inserted or updated in the file after record R was deleted, and this insert or update required a record B, the system may have reused the latest record B of record R.
If the system did reuse the latest record B and there are another records B corresponding to record R, it would be incorrect to consider one of these records B as the latest one. For example, when record no. 3 was inserted into FILE2 in section "Effect of File Operations on Records B", the system reused the record B containing the latest data of record no. 2 ("NEWBBB", "12345") to store the data for record no. 3 ("CCC", "123456"). The data from the other record B corresponding to record no. 2 ("BBB", "12345") is not the data record no. 2 had when it was deleted.
The only case where we can know that the latest record B has not been reused by the system is when its field RRN1 is different from 0. As we saw in section "Effect of File Operations on Records B", this happens when a record has been deleted from the file and the system has delayed setting to 0 the field RRN1 of the corresponding record B.
As a consequence, for each insert or update requiring a record B that has been performed in a file after the records were deleted, there may be a deleted record whose data in area B cannot be retrieved or can be retrieved but is not the latest one.
Another limitation to the retrieval of data from area B for a deleted record comes from the fact that area 14 has a maximum size. If an item cannot be stored in area 14 because this area has reached its maximum size, the location of the record B contained in that item is lost.
Finally, we must take into account that the fields whose data can be retrieved from the latest record B are only those whose corresponding length in record A is equal to x8000. A length of a field different from x8000 implies that the latest data for that field was not stored in the corresponding record B (see VarFieldLen in section "Format of Records A" in the first part of the article).
GETDLTRCD Command
The Get Deleted Records (GETDLTRCD) command is an implementation of the concepts seen in this article. It can be downloaded from the MC Press web site in English and Spanish. I would be grateful for your feedback on the command, especially if some error is found.
LATEST COMMENTS
MC Press Online