Directory Files


 

A directory file is represented by an operating system directory, and the records within it by operating system files. The record key is the name of the operating system file holding the data for the record, except where this would be an invalid name, in which case QM performs automatic name mapping as described below.

 

Directory files do not offer the performance of the hashed file system. They are mainly used for data that is to be processed from outside of QM or for very large records (hundreds of kilobytes). Typical uses include storage of QMBasic programs, COMO (command output) files, and saved select lists.

 

Records in directory files may be read, written or deleted by applications in exactly the same way as records in hashed files. The QMBasic programming language provides some additional operations for directory file access. A record may be opened using the OPENSEQ statement and then processed as a sequential file on a line by line basis (READSEQ, WRITESEQ, etc) or as a simple binary item (READBLK, WRITEBLK, etc). In addition, programming statements are provided to simplify processing of comma separated variable format data (READCSV, WRITECSV).

 

On all systems except Windows, the DIR.DTM mode of the OPTION command causes a write to a directory file, or the closing of a sequential file that has been written to, to update the modification date and time of the directory.

 

The F-type VOC entry for a directory file has the pathname of the directory that represents the file in field 2.

 

Applications frequently open the same file multiple times with different file variables in the same process. The number of simultaneous opens of the same directory file in a single process cannot exceed 255. Copying a file variable for an open file does not count as a separate open.

 

 

Record Ids

 

Where a record id contains characters that are not valid in operating system file names, QM automatically replaces them with an alternative representation. This is totally invisible from inside QM but other software that accesses directory file records must allow for these translations. Rather than have a different set of translations for each platform, QM adopts a single set based on the most restrictive platform (Windows) so that data may be moved between environments without modification of record names. The translations performed are:

Character   Replacement
*           %A
%           %P
\           %B
"           %Q
,           %C
/           %S
=           %E
+           %V
>           %G
:           %X
|           %J
;           %Y
<           %L
?           %Z

 

In addition, use of a period or tilde character as the first character in the name will translate this character to %D or %T respectively. QM uses names with an untranslated leading percent character to represent the subfiles of a hashed file. External applications should avoid writing such items into directory files as they may cause confusion.
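
For external software that needs to find or create the operating system file for a given record id, the translations above can be sketched in Python as follows. This is an illustrative sketch rather than QM's own code: the function names are hypothetical, and it assumes each listed character is replaced wherever it occurs while the period and tilde rules apply only to the first character.

```python
# Sketch of the record id <-> file name translation described above,
# for use by external tools. The helper names are hypothetical, not part of QM.

# Replacements applied wherever the character occurs in the record id.
CHAR_TO_CODE = {
    "*": "%A", "%": "%P", "\\": "%B", '"': "%Q", ",": "%C",
    "/": "%S", "=": "%E", "+": "%V", ">": "%G", ":": "%X",
    "|": "%J", ";": "%Y", "<": "%L", "?": "%Z",
}
# Replacements applied only to the first character of the record id.
LEADING_TO_CODE = {".": "%D", "~": "%T"}

CODE_TO_CHAR = {code: ch for ch, code in CHAR_TO_CODE.items()}
LEADING_CODE_TO_CHAR = {code: ch for ch, code in LEADING_TO_CODE.items()}


def record_id_to_file_name(record_id: str) -> str:
    """Translate a record id to the file name that represents it."""
    parts = []
    for index, ch in enumerate(record_id):
        if index == 0 and ch in LEADING_TO_CODE:
            parts.append(LEADING_TO_CODE[ch])
        else:
            parts.append(CHAR_TO_CODE.get(ch, ch))
    return "".join(parts)


def file_name_to_record_id(file_name: str) -> str:
    """Reverse the translation; reject names with an untranslated percent."""
    parts = []
    i = 0
    while i < len(file_name):
        if file_name[i] != "%":
            parts.append(file_name[i])
            i += 1
            continue
        code = file_name[i:i + 2]
        if i == 0 and code in LEADING_CODE_TO_CHAR:
            parts.append(LEADING_CODE_TO_CHAR[code])
        elif code in CODE_TO_CHAR:
            parts.append(CODE_TO_CHAR[code])
        else:
            # Not a translated record id, e.g. a system item.
            raise ValueError(f"untranslated percent in {file_name!r}")
        i += 2
    return "".join(parts)
```

A name that contains a percent character not followed by one of the listed letters is rejected instead of guessed at, since such names are not translated record ids.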

 

Character mapping in record ids can be suppressed by use of a flag in field 6 of the F-type VOC entry that defines the file or by use of the NO.MAP option to the QMBasic OPEN, OPENPATH, OPENSEQ and DELETESEQ statements.

 

The length of a record id after translation of invalid characters as above must not exceed 255 bytes.

 

Depending on the operating system in use, record ids in directory files may be case insensitive.

 

Note that the Windows file system does not allow file names that clash with Windows device names such as AUX, COM1-COM9, CON, LPT1-LPT9, NUL or PRN. Also, Windows ignores a trailing period in a file name. Thus files named "ABC." and "ABC" are the same item. Use of any of these as a record id in a directory file will cause an illegal record id error.
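
An application that generates record ids might check them before writing to a directory file hosted on Windows. The sketch below covers only the names listed above. The function name is hypothetical, and matching device names without regard to case is an assumption based on Windows treating file names as case insensitive.

```python
# Hypothetical pre-check for record ids destined for a Windows-hosted
# directory file. Only the device names listed in the text are covered.
WINDOWS_DEVICE_NAMES = (
    {"AUX", "CON", "NUL", "PRN"}
    | {f"COM{n}" for n in range(1, 10)}
    | {f"LPT{n}" for n in range(1, 10)}
)


def is_unsafe_on_windows(record_id: str) -> bool:
    """Return True if the id clashes with a device name or ends in a period."""
    # Assumption: device names match regardless of letter case.
    if record_id.upper() in WINDOWS_DEVICE_NAMES:
        return True
    # Windows ignores a trailing period, so "ABC." would clash with "ABC".
    return record_id.endswith(".")
```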

 

 

Mark Mapping

 

When a record is written to a directory file, any field mark characters are converted to the operating system's representation of a newline (see "Line Terminator Selection" below). Each field therefore becomes a line of text, which allows the data to be processed by external software that does not understand the concept of field marks. Conversely, when data is read from a directory file, the newlines are translated to field marks. Value marks and subvalue marks in the data are not translated, as it is assumed that whatever software processes this data must understand multivalued data.

 

One common use of directory files is to store scanned documents, digital photographs, etc. In this case, the data is not text divided into fields using the field mark character but is simple binary data that may contain any sequence of bytes. The data will nearly always contain bytes that appear to be field marks and other bytes that are the ASCII linefeed character. On writing the data to disk, the field marks will be converted to newlines. On reading the record back again, all of the newlines get converted to field marks such that the record does not match the original data written. This is clearly unacceptable. Application developers using directory files to store binary data must suppress the translation of field marks by use of the QMBasic MARK.MAPPING statement.
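
The corruption described above can be reproduced with a small Python model of mark mapping. This is only an illustration under stated assumptions: the field mark is taken to be byte 0xFE (character 254), the newline is taken to be a single linefeed, and the function names are hypothetical.

```python
# Model of mark mapping for illustration only. The field mark is assumed
# to be byte 0xFE (character 254) and the newline is assumed to be LF.
FIELD_MARK = b"\xfe"
NEWLINE = b"\n"


def write_with_mark_mapping(record: bytes) -> bytes:
    """Bytes stored on disk: field marks become newlines."""
    return record.replace(FIELD_MARK, NEWLINE)


def read_with_mark_mapping(stored: bytes) -> bytes:
    """Record returned to the program: newlines become field marks."""
    return stored.replace(NEWLINE, FIELD_MARK)


def round_trip(record: bytes) -> bytes:
    """Write a record and read it back with mark mapping active."""
    return read_with_mark_mapping(write_with_mark_mapping(record))


text_record = b"Smith" + FIELD_MARK + b"London"
binary_record = bytes([0x89, 0x0A, 0xFE, 0x00, 0x0A])

print(round_trip(text_record) == text_record)      # True
print(round_trip(binary_record) == binary_record)  # False
```

The text record survives because it contains no linefeeds of its own. The binary record does not, because its original linefeed bytes cannot be told apart from converted field marks when the data is read back.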

 

 

Line Terminator Selection

 

To allow a directory file opened with OPEN or OPENSEQ to be processed on a different system type from where it was created, QM provides four alternative ways to handle the line terminator. The default mode is appropriate to the operating system in use but can be changed using the FCONTROL() function or by use of a mode flag in field 6 of the F-type VOC record that defines the file.

 
Mode 0 (LF)  When reading data, the line terminator is a line feed (LF) character. Carriage return (CR) characters are treated as part of the data. The WRITE statement converts field marks to LF characters. The WRITESEQ statement adds an LF after each line of text. This mode is the default on Linux/Unix systems. VOC mode flag L selects this mode.

 

Mode 1 (CRLF)   When reading data, the line terminator is a CR/LF pair. CR or LF characters that are not paired are treated as part of the data. The WRITE statement converts field marks to a CR/LF pair. The WRITESEQ statement adds a CR/LF pair after each line of text. VOC mode flag C selects this mode.

 

Mode 2 (BOTH)   When reading data, both a CR/LF pair and a lone LF are recognised as the line terminator. Lone CR characters are treated as part of the data. The WRITE statement converts field marks to the standard operating system representation of a new line (a CR/LF pair on Windows, LF on Linux/Unix). The WRITESEQ statement adds the operating system line terminator after each line of text. This is the default for sequential files opened with OPENSEQ on Windows. VOC mode flag B selects this mode.

 

Mode 3 (NOCR)   As for mode 2, when reading data, both a CR/LF pair and a lone LF are recognised as the line terminator but lone CR characters are discarded. The WRITE statement converts field marks to the standard operating system representation of a new line (a CR/LF pair on Windows, LF on Linux/Unix). The WRITESEQ statement adds the operating system line terminator after each line of text. For compatibility with earlier releases, this is the default for directory files opened with OPEN on Windows. VOC mode flag X selects this mode.

 

Tokens for the mode names are provided in the SYSCOM KEYS.H record as FC$CRLF.LF, FC$CRLF.CRLF, FC$CRLF.BOTH and FC$CRLF.NO.CR.
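
The read-side behaviour of the four modes can be compared with a short Python model. It is a sketch of the rules as described above, not QM's implementation: the function name is hypothetical, the field mark is assumed to be byte 0xFE, and any special handling of a terminator at the very end of the data is not modelled.

```python
import re

FIELD_MARK = b"\xfe"  # assumed byte value of the field mark


def to_fields(data: bytes, mode: int) -> bytes:
    """Convert line terminators in data to field marks for modes 0 to 3."""
    if mode == 0:
        # LF: a CR is ordinary data.
        lines = data.split(b"\n")
    elif mode == 1:
        # CRLF: an unpaired CR or LF is ordinary data.
        lines = data.split(b"\r\n")
    elif mode in (2, 3):
        # BOTH and NOCR: a CR/LF pair or a lone LF ends a line.
        lines = re.split(rb"\r\n|\n", data)
        if mode == 3:
            # NOCR: any CR still present is a lone CR and is discarded.
            lines = [line.replace(b"\r", b"") for line in lines]
    else:
        raise ValueError(f"unknown mode {mode}")
    return FIELD_MARK.join(lines)


sample = b"a\r\nb\nc\rd"
for mode in range(4):
    print(mode, to_fields(sample, mode))
```

With this sample, modes 0 and 1 each find a different single style of terminator, mode 2 finds both styles but keeps the lone CR between "c" and "d", and mode 3 drops it.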

 

 

Character Encoding

 

A character encoding can be applied to directory file data. This can be specified in three ways:

Field 7 of the VOC F-type entry sets the default encoding to be used.

The QMBasic OPEN or OPENPATH statement can specify an encoding that will override any encoding set in the VOC record.

QMBasic read and write operations can specify an encoding that will override any encoding set in the VOC record or on opening the file.

 

Note that specifying the encoding as a null string in a file open operation or a read/write operation is equivalent to omitting the ENCODING clause and will, therefore, retain any default set earlier. If it is necessary to disable an encoding, the encoding name "NULL" should be used.
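
The precedence rules above can be summarised as a small Python function. This is a sketch of the rules as stated, not QM code: the function name is hypothetical, the encoding names in the usage lines are placeholders, and the exact spelling "NULL" is assumed to be the only form that disables an encoding.

```python
from typing import Optional


def effective_encoding(voc_default: Optional[str],
                       open_encoding: Optional[str] = None,
                       operation_encoding: Optional[str] = None) -> Optional[str]:
    """Return the encoding in force, or None when no encoding applies."""
    current = voc_default or None
    for requested in (open_encoding, operation_encoding):
        if not requested:
            # Omitted or null string: retain whatever was set earlier.
            continue
        if requested == "NULL":
            # Explicit request to disable the encoding.
            current = None
        else:
            current = requested
    return current


# "UTF8" and "LATIN1" are placeholder names used only for illustration.
print(effective_encoding("UTF8", ""))            # UTF8
print(effective_encoding("UTF8", "NULL"))        # None
print(effective_encoding("UTF8", "", "LATIN1"))  # LATIN1
```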

 

On ECS mode systems, if no encoding is set, only the low order byte of each character will be written to the file.

 

 

System Related Items in Directory Files

 

The directory that represents a QM directory file may contain a file named %header%. If this is present, it contains control information used by QM and must not be modified or deleted. This item will not appear in the results of a select operation against the directory file and will not be deleted by CLEAR.FILE. If a directory file is copied to an alternative location using operating system level commands (e.g. as a backup), it may be necessary to delete the %header% file in the copy if the copy might be updated from within QM. This is necessary as the %header% file controls replication and triggers.

 

 

Careful Update

 

When writing a record to a directory file, QM normally opens the operating system file that will represent this record and writes to it, overwriting any existing data. There is a possibility of data loss when updating an existing record if the system fails during this write (e.g. a power outage) or if there is insufficient disk space. To prevent this, the SAFEDIR configuration parameter can be set to adopt a "safe update" technique where the data is written to a temporary file, the original is deleted and the temporary item is renamed to replace the original. This removes nearly all possibility of losing the record but degrades performance of the write.
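
External programs that update records in a directory file face the same risk and can use a similar technique. The Python sketch below writes to a temporary file in the same directory and then renames it over the original. It is not QM's implementation: the function name is hypothetical, and `os.replace` is used to combine the delete and rename steps into one call.

```python
import os
import tempfile


def safe_write(path: str, data: bytes) -> None:
    """Replace the file at path with data via a temporary file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file alongside the target so that the final
    # rename stays within one file system.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix="tmp_")
    try:
        with os.fdopen(fd, "wb") as temp_file:
            temp_file.write(data)
            temp_file.flush()
            os.fsync(temp_file.fileno())  # push the data to disk first
        os.replace(temp_path, path)       # swap the new file into place
    except BaseException:
        # Leave the original untouched and tidy up the temporary file.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

If the write fails part way through, for example because the disk is full, the original file is still intact. The cost is an extra file creation and rename for every update.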

 

 

Directory File Dictionaries

 

Like any other QM file, a directory file can have a dictionary containing definitions that are specific to this file. However, directory files frequently contain simple text records that do not need field definitions. A generic directory file dictionary named DIR_DICT is available from the QMSYS account. To use it, either create the directory file without a dictionary or use DELETE.FILE to delete the dictionary portion of an existing file. Then edit the VOC entry to set field 3 to be

@QMSYS\DIR_DICT

using the directory separator appropriate to your platform.

 

This dictionary contains some useful items:

DAYS    The number of days since this record was modified.
DTM     The date and time of the last modification to the record.
OWNER   The user name of the owner of the record (not Windows).