The Virtual File System



The Virtual File System (VFS) allows application designers to provide access to data that appears to an application as a file but may actually be something quite different. Possible uses of the VFS include:


Providing access to data in other database environments.

Accessing data transparently over a network where QMNet is not appropriate.

Implementing an alternative encryption layer on top of standard QM files.


There are two styles of VFS handler: internal, using a QMBasic class module to perform the file system operations, and external, using a C program. Development of an external handler requires skills beyond QMBasic but allows access to run-time libraries provided by other database vendors.


The role of the VFS handler is largely to make the external data appear to be a QM file. It may be possible to achieve very close compatibility, but often some features are difficult to map between the two environments. For example, the index scanning operations of QM may have no direct equivalent in the external data source. Where it is not possible to provide full compatibility, some features of QM may not be available when using the VFS.


Before using any of the VFS handlers available via the web site, take care to read the comment text at the head of the source code as this may include important issues to consider before using the VFS.



The Internal VFS Handler Class Module


An internal VFS handler is a globally catalogued QMBasic class module that intercepts all attempts to access the file (read, write, select, etc.). It processes requests, storing or retrieving data as appropriate.


A template class module named VFS.CLS is provided in the BP file of the QMSYS account. This includes a brief description of each of its component functions and subroutines.



The External VFS Handler C Program


A heavily commented skeleton program is available for download from the web site. There are also variants adapted to interface with other products, including (where redistribution rights permit) pre-built versions that require no additional programming.


The skeleton program contains all of the C coding necessary to interface with a parent QM process and dummy functions corresponding to each of the file system operations handled by the VFS. These functions need to be modified to interface with whatever file system is to be accessed by the VFS handler. Unlike the internal VFS where there is a separate instance for each file opened, an external handler may be shared between multiple files opened from the same QM process. The handler communicates with its parent QM process through a pipe, ensuring that there is complete isolation between the two processes which, in turn, protects the QM process from the effects of any programming errors in the VFS handler.



Creating a Virtual File System


There are three steps: creating the VFS handler, defining the VFS server and creating the VOC entries.


Step 1: Creation of the VFS handler involves modification and compilation as described above.


Step 2: The handler name and server connection information are stored as a VFS server definition, which is created using the SET.VFS.SERVER command. Some parts of this definition, such as the remote server connection details, may be irrelevant for a particular use. Once a VFS server definition has been created, the ADMIN.SERVER command can be used to provide more closely controlled user authentication.


For an internal VFS handler, the handler name corresponds to a globally catalogued VFS handler class module.


For an external VFS handler, the handler name has a case-insensitive prefix of EXT, separated from the rest of the name by a single space. The actual name of the VFS handler program is formed by adding a prefix of vfs_ to the rest of the handler name. For example, the VFS handler available from the web site to access U2 files would be vfs_u2 on Linux or vfs_u2.exe on Windows. The handler program is started when the first file using that VFS server is opened and remains active until the last file is closed.
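The naming rule for external handlers can be illustrated with a short C sketch. The function vfs_program_name is hypothetical and is not part of QM. It assumes that the text after the EXT prefix is used exactly as given and that the only platform difference is the .exe suffix mentioned in the example above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of QM: builds the handler program name
   from a handler name such as "EXT u2".
   Returns 0 on success, -1 if the name does not have the EXT prefix
   or the result does not fit in out. */
int vfs_program_name(const char *handler, int windows, char *out, size_t outlen)
{
    assert(handler != NULL && out != NULL);

    /* Need "EXT", one space, and at least one further character. */
    if (strlen(handler) < 5)
        return -1;

    /* Case-insensitive EXT prefix followed by exactly one space. */
    if (toupper((unsigned char)handler[0]) != 'E' ||
        toupper((unsigned char)handler[1]) != 'X' ||
        toupper((unsigned char)handler[2]) != 'T' ||
        handler[3] != ' ' || handler[4] == ' ')
        return -1;

    /* Assumption: the rest of the name is appended unchanged. */
    int n = snprintf(out, outlen, "vfs_%s%s",
                     handler + 4, windows ? ".exe" : "");
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this sketch, a handler name of "EXT u2" yields vfs_u2, or vfs_u2.exe when the windows flag is set, while a name without the EXT prefix is rejected and would be treated as an internal class module name.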


Step 3: Like all files, a VFS file is identified by an F-type VOC item. It is possible for only some parts of a file to be VFS items. Thus a file might have a VFS data part but a normal dictionary part. The components of a multifile can be a mix of VFS and normal items.


A VFS item is identified by the pathname in the F-type VOC entry being specified as "VFS:server" where server identifies the VFS server definition, which in turn identifies the VFS handler program and connection information. There is an optional third component to this syntax which will be passed to the handler on opening the VFS item. The full syntax of the pathname is thus:

VFS:server:qualifier


This third component could be used, for example, when a single VFS handler is used to access many files. The qualifier might be the pathname or other reference to the actual file to be opened by the handler. It may include further colons dividing it into sections in any way that the developer wishes.
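As an illustration of how a pathname of the form "VFS:server" with an optional third component might be split, the following C sketch separates the server name from the qualifier. The type vfs_path and the function vfs_parse_path are hypothetical, the buffer sizes are arbitrary, and treating the VFS prefix as case sensitive is an assumption. Only the first colon after the server name is treated as a separator, so any further colons remain part of the qualifier.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type; buffer sizes are arbitrary for the sketch. */
typedef struct {
    char server[64];
    char qualifier[256];   /* empty string when no qualifier is given */
} vfs_path;

/* Splits "VFS:server" or "VFS:server:qualifier".
   Returns 0 on success, -1 if the path is not a VFS path or is too long. */
int vfs_parse_path(const char *path, vfs_path *out)
{
    assert(path != NULL && out != NULL);

    /* Assumption: the prefix is matched exactly as "VFS:". */
    if (strncmp(path, "VFS:", 4) != 0)
        return -1;

    const char *server = path + 4;
    const char *colon = strchr(server, ':');
    size_t slen = colon ? (size_t)(colon - server) : strlen(server);
    if (slen == 0 || slen >= sizeof out->server)
        return -1;
    memcpy(out->server, server, slen);
    out->server[slen] = '\0';

    /* Everything after the second colon, further colons included,
       belongs to the qualifier. */
    const char *q = colon ? colon + 1 : "";
    if (strlen(q) >= sizeof out->qualifier)
        return -1;
    strcpy(out->qualifier, q);
    return 0;
}
```

For a made-up pathname such as "VFS:u2srv:/data/ORDERS:DICT", this gives a server of u2srv and a qualifier of /data/ORDERS:DICT, which the handler is then free to interpret in whatever way its developer chooses.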





It is important to note that using the VFS to access dictionaries on remote systems may have unexpected effects. For example, TRANS() functions or correlatives using T-conversions will be compiled in the context of the local QM system, and hence file names in dictionary items on the system accessed via the VFS may be meaningless.


As a general rule it is best to create local copies of the dictionaries with appropriate VFS file name references to avoid this problem.



Partial Select Lists


The QM file system optimises select list generation by arranging that the QMBasic SELECT statement (not the query processor equivalent) used against a hashed file actually performs the select group by group as the READNEXT statement is used to walk through the list. Anything that requires the list to be completed (e.g. using SELECTINFO() to determine the number of items in the list) will cause the remainder of the list to be constructed immediately.


A VFS handler can work in much the same way. The V$SELECT function can return the entire list or just the initial part of the list. When a program processing this list reaches the end of the list returned by V$SELECT, the V$CONTINUE.SELECT function is called to return the next part of the list. The V$COMPLETE.SELECT function will be called if the remainder of the list should be returned as a single item and the V$END.SELECT subroutine is called to terminate generation of a partial list. In an external handler, the function names referenced above have an underscore in place of the dollar sign.


Internal VFS handlers that do not use partial list construction can omit the V$CONTINUE.SELECT, V$COMPLETE.SELECT and V$END.SELECT entry points. External VFS handlers must include these functions, returning an empty list from the first two if not implemented.
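The division of work between the four select entry points can be sketched in C as follows. This is only an illustration of the partial list idea, not the interface of the skeleton program: the function names, the select_state structure, the chunk size and the use of a newline to separate record ids are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

#define CHUNK 2   /* ids per partial list; arbitrary for this sketch */

/* Hypothetical per-select state held by the handler. */
typedef struct {
    const char **keys;   /* all record ids in the demo data source */
    size_t count;
    size_t next;         /* index of the first id not yet returned */
    int active;          /* cleared once the select is terminated */
} select_state;

/* Copies up to max ids, newline separated, into buf. */
static void emit(select_state *st, size_t max, char *buf, size_t buflen)
{
    size_t used = 0;
    assert(buflen > 0);
    buf[0] = '\0';
    while (max-- > 0 && st->active && st->next < st->count) {
        const char *k = st->keys[st->next];
        size_t need = strlen(k) + (used ? 1 : 0);
        if (used + need + 1 > buflen)
            break;                    /* leave the rest for a later call */
        if (used)
            buf[used++] = '\n';
        strcpy(buf + used, k);
        used += strlen(k);
        st->next++;
    }
}

/* Start a select and return only the initial part of the list. */
void demo_select(select_state *st, const char **keys, size_t count,
                 char *buf, size_t buflen)
{
    st->keys = keys;
    st->count = count;
    st->next = 0;
    st->active = 1;
    emit(st, CHUNK, buf, buflen);
}

/* Return the next part; an empty list means there is nothing more. */
void demo_continue_select(select_state *st, char *buf, size_t buflen)
{
    emit(st, CHUNK, buf, buflen);
}

/* Return everything that remains as a single list. */
void demo_complete_select(select_state *st, char *buf, size_t buflen)
{
    emit(st, (size_t)-1, buf, buflen);
}

/* Abandon generation of the partial list. */
void demo_end_select(select_state *st)
{
    st->active = 0;
}
```

With five ids A to E, the first call returns A and B, a continue call returns C and D, and a complete call then returns the single remaining id E. A handler that does not implement partial lists would return the whole list from the first call and an empty list from the continue and complete functions.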



Alternate Key Indices


A VFS handler that cannot accurately provide the documented behaviour of the INDICES(), SELECTINDEX, SELECTLEFT, SELECTRIGHT, SETLEFT and SETRIGHT operations may lead to issues with, for example, using the query processor against the VFS file. It may be best simply to return a null string from the INDICES() function so that the file appears to have no indices.


Alternatively, if the VFS handler supports simple index lookup operations (SELECTINDEX) but not the index scanning operations, it should be written such that use of FILEINFO() with key FL$INDEX.SCAN returns False (0). This will cause the query processor to reject an index when using relational operators that require the unsupported functions.



Extended Filename Syntax


If enabled by including the additive value 8 in the FILERULE configuration parameter, a VFS file may be referenced using an extended filename syntax:


where server is the name of a VFS server defined as described above and qualifier is the optional qualifying data defining the file to be opened. Because some VFS handlers may need to process filenames that include spaces (e.g. DICT name), any tilde characters (~) in the qualifier will be replaced by spaces.
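The tilde substitution is simple enough to show as a few lines of C. The function name vfs_untilde is hypothetical, and the sketch assumes the replacement is applied to every tilde in the qualifier with no escape mechanism, as the description above implies.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: replaces each tilde in the qualifier with a
   space, in place. */
void vfs_untilde(char *qualifier)
{
    char *p = qualifier;
    assert(qualifier != NULL);
    while ((p = strchr(p, '~')) != NULL)
        *p++ = ' ';
}
```

A qualifier written as DICT~ORDERS would therefore reach the handler as "DICT ORDERS".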



The VFS Cache


The VFS includes an optional caching mechanism whereby a file that is closed by an application is actually held open in a cache, so that a further open of the same file executes faster by taking the file back from the cache. Use of TRANS() or Tfile conversions in a query is likely to benefit considerably from this cache. The size of the cache is set by the VFSCACHE private configuration parameter, in the range 0 to 50. This mechanism is very similar to the file cache used by hashed files. The two caches are cleared together on return to the command prompt, on use of the FLUSH.DH.CACHE QMBasic statement, and automatically at some points where a cached file could cause problems.
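The general idea of holding closed files in a cache can be sketched in C as shown here. This is not QM's implementation: the structure, the function names and the use of an integer handle are invented for the example, and the choice to refuse new entries when the cache is full (rather than discarding an older one) is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

#define VFSCACHE_MAX 50   /* upper bound of the documented range */

typedef struct {
    char name[64];        /* file name used as the cache key */
    int handle;           /* stands in for an open file */
} cache_entry;

typedef struct {
    cache_entry slot[VFSCACHE_MAX];
    int size;             /* configured capacity; 0 disables caching */
    int used;
} vfs_cache;

void cache_init(vfs_cache *c, int size)
{
    assert(size >= 0 && size <= VFSCACHE_MAX);
    c->size = size;
    c->used = 0;
}

/* Called on close. Returns 1 if the file was parked in the cache,
   0 if the caller must really close it. */
int cache_put(vfs_cache *c, const char *name, int handle)
{
    if (c->used >= c->size || strlen(name) >= sizeof c->slot[0].name)
        return 0;
    strcpy(c->slot[c->used].name, name);
    c->slot[c->used].handle = handle;
    c->used++;
    return 1;
}

/* Called on open. Returns the cached handle and removes it from the
   cache, or -1 if the file must be opened in the normal way. */
int cache_take(vfs_cache *c, const char *name)
{
    for (int i = 0; i < c->used; i++) {
        if (strcmp(c->slot[i].name, name) == 0) {
            int h = c->slot[i].handle;
            c->slot[i] = c->slot[c->used - 1];  /* fill the gap */
            c->used--;
            return h;
        }
    }
    return -1;
}

/* Empties the cache and returns how many entries were dropped.
   A real implementation would close each file here. */
int cache_flush(vfs_cache *c)
{
    int n = c->used;
    c->used = 0;
    return n;
}
```

In this sketch a repeated open and close of the same file, as a TRANS() in a query might cause, alternates between cache_put and cache_take without reopening the file each time, and a capacity of zero makes every close a real close.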