University of Maryland Downdater - Level 3.0
SPACE 18
CENTER University of Maryland Downdater
SPACE 1
CENTER Level 3.0
1Introduction:
space 1
The @DOWN processor as supplied by the University of Maryland is primarily a symbolic comparator. It is used to compare two 'SDF' (System Data Format: Ref PRM UP4144) 'files' to produce a 'SIR' (SIR correction images: Ref PRM UP4144) type correction deck capable of literally transforming one file into the other. The term 'file' is used here very loosely; in fact, @DOWN will accept program files, data files or program file elements as legitimate input. It can also make certain superficial comparisons regarding the total contents of program files, and is not necessarily restricted to SDF format elements.
The processor is invaluable in any large programming application where many programmers may be working on the same base symbolics. It is also of great use in any application where the user wishes or is required to maintain and update current symbolics with the standard system SIR mechanisms. The processor is also very valuable in program testing and debugging, where it can compare corresponding output streams.
In general, there are two main modes of downdater operation, though under option control one may change the characteristics of the output or expected input considerably. The first mode may be called element or data file mode; the two differing input types produce effectively the same result. It means comparing one single stream of symbolic input with a second stream to determine any differences. Though the output is in SIR type correction images, the meaning is graphically obvious and a user needs no special training in order to decipher it. The second mode of downdate is called program file mode. In this case one specifies two distinct program files and the downdater, under option control, will compare corresponding elements of both files.
In the simplest and default case, this means each symbolic element of one program file will be compared with its corresponding element, of the same name, in the opposing program file.
Numerous options are available to allow the user greater flexibility in using @DOWN. Under option control one may: compare the contents of one program file's table of contents ('TOC') against another; produce a PCF (permanent correction file) or TCF (temporary correction file) program file, data file, or program file element (SSG PCF and TCF files: Ref PRM UP4144); specify only a certain part of the images of a specific downdate to be compared; ignore multiple blanks; produce expanded PRINT$ listings; downdate non-standard SDF type files; and control output optimization parameters. In addition, the input character mode of the symbolics (whether FIELDATA or ASCII) is effectively transparent. If mixed mode records occur, FIELDATA is the chosen medium for comparison.
In summary, the @DOWN processor is an indispensable tool to the sophisticated programmer. He may use it to interrogate the differences between files, to maintain programs, perhaps for distribution to other sites, or to debug by comparing test run outputs against some correct base. The processor should not be overlooked by the more novice programmer for its superficial simplicity. In fact, he too may find @DOWN a very useful and necessary tool.
space 1
1Processor call card:
space 1
center @DOWN, ,...,
space 1
2Options:
column 5
space 1
'A' - used in program file mode. It forces a comparison of program file TOCs for the occurrences of ABSOLUTE elements. Differences between the information collected from each TOC with respect to absolute elements will be summarized in the PRINT$ output. (also see 'D','O','R','S' options with regard to 'TOC' summaries)
space 1
'B' - used to compress out all occurrences of multiple blanks before records are compared.
(also see 'Q' option)
space 1
'D' - used with the 'A','O','R','S' options to acquire TOC summaries of program files. It forces the consideration of the time-and-date of creation in the comparison.
space 1
'E' - forces output for every pair of input sequences downdated. (The minimum output is at least a *ELEMENT card or corresponding null entry in the output program file, depending on mode of downdate.)
space 1
'L' - provides an expanded PRINT$ listing of the downdate. It gives deleted images and line numbers of all corrections printed corresponding to their respective files.
space 1
'N' - used to inhibit PRINT$ of downdate. If the 'N' option is specified and there is no (or output file), no actual symbolic downdate is done but TOC summaries are allowed. (see 'D', 'A','O','R' and 'S' options)
space 1
'O' - used in program file mode. It forces a comparison of program file TOCs for the occurrences of OMNIBUS elements. (see 'A' option)
space 1
'P' - used to create a program file PCF or TCF (see 'T' option). If the 'P' option is specified a program file must be specified in . It may be used in either element mode or program file mode downdates, provided that there is an element name logically available for any output element; see NOTES ON 'P' OPTION USAGE for further explanation. In the most obvious case this means that if one attempts to downdate two data files under the 'P' option an error will be noted, because there is no logical element name available to give to the output element.
space 1
'Q' - used in conjunction with the 'B' option to force consideration of multiple blanks between delimited character strings (that is, those strings delimited by the or by the default, the apostrophe).
space 1
'R' - used in program file mode. It forces a comparison of program file TOCs for the occurrences of RELOCATABLE elements. (see 'A' option)
space 1
'S' - used in program file mode. It forces a comparison of program file TOCs for the occurrences of SYMBOLIC elements.
(see 'A' option)
space 1
'T' - used to downdate PCF elements, data files or program files to produce the appropriate TCF. (see 'P' option)
space 1
'V' - used in conjunction with the 'L' option to create batch listings (132 columns) from demand.
space 1
'X' - used to force the processor to ignore any element cycle information. This is provided as a last resort in an attempt to downdate non-standard SDF format files. (Note: @DOWN will automatically recognize PRINT$ and FORTRAN files and process control information accordingly.)
space 1
'Z' - forces downdate of every element or PCF sequence encountered in either input stream. If the element or sequence has no counterpart in the opposing stream it is downdated against a null element or sequence.
space 1
2Spec Fields:
space 1
- 0 can be a data file, program file or program file element. If no file name is given, TPF$ is assumed.
space 1
- 0 also can be a data file, program file or program file element. If no file name is given, TPF$ is assumed. If an element name was given in and none is given in , the name from is assumed. If both and designate program files, then a program file mode downdate is initiated. Unlike previous versions of @DOWN, there is now no restriction on downdating data files with program file elements.
space 1
- 0 this is the output file field. If it is empty, it is assumed no output stream is desired. If it exists, it may also be a data file, program file element or program file. If, however, it is a program file, the 'P' option must also be present. This is necessary to indicate default element names; otherwise an error will be noted.
space 1
- 0 has the format of: center //./ - all fields are optional. The obvious redundancy is done as an aid to the user; the is forced to the write key field to allow input of special characters. If both filename-read-key and element-version-name combinations are given, the element-version-name parameters are used.
column 8
space 1
/ - 0 this is the 'WINDOW' option specification. It allows the user to downdate only over specified columns of the input images. For example, to ignore possible card sequence numbers, one might provide '1/72'. The only restriction is that, if given, must be greater than or equal to .
- 0 is provided for the 'BQ' option downdate. The default is the apostrophe ('). This field is provided in the event the 'BQ' option may be desired for some language which uses some other character to delimit literal character strings.
column 5
space 1
- 0 has the format of: center //./ - like the redundancy is allowed to aid the user.
column 8
space 1
/ - 0 these are program optimization parameters and by their nature are somewhat arbitrary. They default to 5/400, which have been found to be reasonable values in the vast majority of cases. The factor is defined as the number of sequentially corresponding records necessary to confirm a match. The factor is the maximum number of images sequentially searched in order to attempt to satisfy the factor. (see NOTES ON OPTIMIZING PARAMETERS AND THE DOWNDATING ALGORITHM)
- 0 this is the correction card prefix character which 'SIR' allows to be redefined. Its default is the usual dash or minus sign (-). Like the it uses the write key of the spec field to allow the introduction of special characters.
column 2
1Notes on implementation and usage:
space 1
2Program file mode:
space 1
As was stated earlier, when and are both program files, program file mode is initiated. The user has some control over what is downdated and the subsequent output via the 'E' and 'Z' options. When neither of these is specified, only symbolic elements which have names common to both files are compared. Subsequent output will only occur if some difference is found between a compared pair of elements. The 'E' option is provided to force what may be called a minimum 'null output' even if both elements compared are identical.
This 'null output' may simply be in the form of a *ELEMENT card in the output stream or a null entry with corresponding element name in the output 'P' option program file (see NOTES ON 'P' OPTION USAGE). This may seem trivial at first glance but it is often desired in the creation of PCF or TCF program files to be used by SSG. The 'Z' option forces output for all entries entered in both program files. If an entry is encountered which has no counterpart in the opposing program file, under the 'Z' option, it will be effectively downdated against a null element. With the 'Z' option one can create a PCF or TCF which will literally map all the symbolic elements of into symbolic elements of . Output, regardless of options, is always in 'sort order', that is, the order of output sequences will always obey the natural FIELDATA collating sequence.
space 1
2Summary options:
space 1
The summary options 'D', 'A', 'O', 'R', and 'S' allow one to look for differences in the entries of program file TOCs. They are a natural extension of the tree handling mechanisms used to process and sort entries for program file downdates. Currently they supply a minimal amount of information regarding the differences of program file TOCs. In the future this facility may be extended if it is determined that other useful information can also be displayed.
space 1
2Listing control:
space 1
The 'N', 'L', and 'V' options provide the user with some degree of control over the PRINT$ listing of downdated output. The 'N' option is provided to suppress PRINT$ output of the actual downdate. It does not, however, affect the printing of error messages or summary options information which must be specifically asked for. If the 'N' option is specified and no is given, no actual symbolic downdate is attempted, but summary options are allowed. If at the same time no summary options have been given, the execution is determined as being useless and an error will be noted.
In fact, if only summary information is desired, it is recommended that the 'N' option with no also be specified, lest one take the chance of obtaining more output than one desires. The 'L' option is provided to give the user the maximum amount of information on a particular symbolic downdate. It supplies the differing images of both files along with their respective line numbers. The output is formatted to 72 or 132 characters depending on demand or batch execution, respectively. If one is executing from demand and wishes a batch listing, the 'V' option is supplied. If the 'L' option is not specified the output is printed, unformatted, up to 132 characters and truncated after that.
space 1
2PCF downdate:
space 1
The PCF downdate is triggered by the presence of the 'T' option. The output is in the form of a TCF capable, via SSG, of transforming the PCF of into the PCF of . There are no logical restrictions to the file types in or over what was previously stated. The 'P' option is also valid and if properly used will result in a TCF program file as output. If input is in the form of a PCF element or data file containing *ELEMENT sequences, these sequences are treated effectively identically to elements within a program file. With this mode of handling, the 'E' and 'Z' options have the same meaning as was described in NOTES ON PROGRAM FILE MODE, and will produce the same effect as if the PCF input streams had actually been PCF program files. The point should be made that the 'T' option is provided to downdate PCFs, not TCFs. The distinction is this: TCFs may contain 'relative' PCF correction images which are themselves uncorrectable. If @DOWN is in 'T' option mode and encounters such an image, a format error will be noted, the card will be treated as an effective *ELEMENT card or EOF delimiter and execution will attempt to continue.
In short, if one wishes, one may downdate TCFs in this manner, but close attention should be paid to any error messages printed because erroneous output may follow.
space 1
2The 'P' option:
space 1
In the most general case the 'P' option was meant to be used in the program file mode, thus the name for each output element would come from the name of the elements being compared within opposing program files. The 'P' option is not, however, restricted to the program file mode. The only restriction is that there must be some logical default name to be given to any potential output element. In the case of 'last resort', the element name is taken from the element of , if one exists. If it does not exist, an error is noted and the program is terminated. If the processor is not in program file mode but the 'T' option is specified and the input streams contain *ELEMENT cards, then these cards will name the output elements. The problem comes when there is a data file in with no 'T' option, or with a 'T' option but no *ELEMENT cards in the data stream. In this case there is no logical output element name to be used, thus an error will be noted. The 'P' option is meant to create PCF or TCF program files for subsequent use by SSG.
space 1
2The 'BQ' option, window spec, and translation:
space 1
As was mentioned earlier, input images are treated as if the input character mode was effectively transparent. In other words, if two records to be compared are of different modes, translation is automatically made. The direction of translation is always ASCII -> FIELDATA. This choice was made for internal convenience, but it has yet to be shown that the inverse translation is really any more meaningful. It should be noted that all character manipulation that is done, whether it be elimination of multiple blanks, invoking the window specification, or character translation, is done at the time of each required comparison.
This may superficially seem grossly inefficient, but there were several reasons for this approach. First, it was felt that these mechanisms denoted special purpose usage and as such would be used only infrequently. This implementation preserves the basic I/O structure which makes the downdater as efficient as it is and yet makes the actual input record forms transparent to the main line routines. Secondly, in the general application the density of differences is far smaller than the size of the total file. This method is probably more efficient in such cases than recreating both entire files in some modified form and downdating the result. Lastly, in many applications the downdater is still I/O bound, thus using more of the CPU during overlapped I/O does not appreciably degrade program performance.
space 1
2Optimizing parameters and the downdater algorithm:
space 1
As was previously mentioned, the factor and factor are optimizing parameters. The downdater is a sequentially oriented comparator and as such is not defined to provide the 'optimum' set of correction cards. The 'optimum' set is defined as the minimum possible number of correction cards needed to change one file into the other. Instead, the downdater produces 'a' set of correction cards sufficient to achieve the same end. It is the interaction of these parameters and the physical nature of the files being considered which determines the closeness of the product to that 'optimum' set. In fact, the downdater algorithm does a very good job at approximating this optimum set in the vast majority of cases and generally is relatively insensitive to changes in the optimizing parameters. There are, however, characteristics of input data which may tend to degrade the result and can be watched for. For example, if one is downdating two files which have repetitive blocks of the same images, the factor should be set greater than the size of a repeated block.
Analogously, if images have been moved within a file by a displacement greater than 400 records, the should also be increased. The changes would effectively 'optimize' the output but they would also degrade the speed of the processor. Processor speed can be improved by lowering the optimization parameters but at the probable cost of generating larger correction decks. Finally, a comment on input image length is required. In theory there is no limit to the input image length (within SDF defined format); however, an internal buffer must be maintained for images which span internal page boundaries. This internal buffer is now set at 60 words, that is 240 characters ASCII, or 360 characters FIELDATA. Images exceeding this length and falling totally within a page boundary (448 words) will be compared normally. If it is necessary to move such an image to the internal buffer it will be truncated to 60 words - NO error will be noted. If the user wishes to use the processor on files with records greater than 60 words in length, he does so at his own discretion.
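The interaction of the two optimizing parameters described in this section can be illustrated with a small sketch. It is not the @DOWN implementation, only one plausible reading of the description: after a mismatch, look ahead in both inputs for a point where a run of records agrees, and treat everything skipped as one correction. The names `match` and `search` (defaulting to 5 and 400) are hypothetical stand-ins for the two factors, the edit tuples are an illustrative format rather than SIR correction cards, and the rule that a shorter run counts when both inputs end together is an assumption.

```python
from typing import List, Optional, Sequence, Tuple

# An edit means: replace old[start:end] with the given lines (illustrative format, not SIR).
Edit = Tuple[int, int, List[str]]


def _confirmed(old: Sequence[str], new: Sequence[str], i: int, j: int, match: int) -> bool:
    """True if `match` records agree starting at old[i] and new[j].

    Assumption: a shorter agreeing run also counts when both inputs end together.
    """
    for k in range(match):
        old_done = i + k >= len(old)
        new_done = j + k >= len(new)
        if old_done or new_done:
            return old_done and new_done and k > 0
        if old[i + k] != new[j + k]:
            return False
    return True


def _find_resync(old: Sequence[str], new: Sequence[str], i: int, j: int,
                 match: int, search: int) -> Optional[Tuple[int, int]]:
    """Look ahead at most `search` records (combined) for a confirmed match."""
    if i >= len(old) or j >= len(new):
        return None
    limit = min(search, (len(old) - i) + (len(new) - j))
    for d in range(1, limit + 1):          # nearest resync points are tried first
        for di in range(d + 1):
            if _confirmed(old, new, i + di, j + (d - di), match):
                return i + di, j + (d - di)
    return None


def downdate(old: Sequence[str], new: Sequence[str],
             match: int = 5, search: int = 400) -> List[Edit]:
    """Return 'a' set of edits turning old into new; not necessarily the minimum set."""
    edits: List[Edit] = []
    i = j = 0
    while i < len(old) or j < len(new):
        if i < len(old) and j < len(new) and old[i] == new[j]:
            i += 1
            j += 1
            continue
        resync = _find_resync(old, new, i, j, match, search)
        # With no resync point in range, the whole remainder becomes one correction.
        ni, nj = resync if resync is not None else (len(old), len(new))
        edits.append((i, ni, list(new[j:nj])))
        i, j = ni, nj
    return edits


def apply_edits(old: Sequence[str], edits: Sequence[Edit]) -> List[str]:
    """Apply edits produced by downdate() to reconstruct the new input."""
    out: List[str] = []
    pos = 0
    for start, end, lines in edits:
        out.extend(old[pos:start])
        out.extend(lines)
        pos = end
    out.extend(old[pos:])
    return out
```

In this sketch, lowering `search` makes each look-ahead cheaper but can cause the rest of the input to be emitted as one large correction, which mirrors the trade-off between processor speed and correction deck size noted above. Raising `match` reduces the chance of resynchronizing inside a repeated block.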