Replication

Data replication provides a mechanism whereby updates to selected files are automatically exported to one or more other files, usually on a different system in order to provide resilience to hardware failure. In normal usage, the intention is to maintain an identical copy of the original files, either so that the second system can take over or for use as a reporting server. It is, however, possible to use the replication facility to copy data from disparate sources into a single file, perhaps merging data from satellite offices into a single file at head office.

 

 

Basic Principles

 

The system that exports data is known as the publisher. The system receiving the data is known as the subscriber. Normally this would be a simple relationship with just one publisher/subscriber pair. However, QM imposes few restrictions on how replication is used.

 

Each published file may be exported to multiple subscribers. A subscriber system can publish these files onwards to further subscribers, cascading data through a series of servers. The only rule is that the sequence of exports must not contain a loop where a file is replicated back to itself. There is one exception to this rule, described below under "Symmetrical Replication".
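
The loop rule can be checked on paper before any PUBLISH commands are issued. The following Python sketch is one way to do that. It is not part of QM and does not reproduce any check that QM itself performs. It assumes the planned exports are written down as a mapping from each file, named here as server:account:filename, to its list of targets. The function name and the sample names are hypothetical. A direct two-way pair is treated as the permitted symmetrical case.

```python
from typing import Dict, List, Optional


def find_export_loop(exports: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one forbidden export loop as a list of names, or None.

    A direct pair (A exports to B and B exports to A) is treated as the
    permitted symmetrical case and is not reported.
    """

    def walk(start, path, seen):
        for target in exports.get(path[-1], []):
            if target == start:
                # A path of exactly two files that returns to the start
                # is a symmetrical pair; anything else is a loop.
                if len(path) != 2:
                    return path + [start]
                continue
            if target not in seen:
                found = walk(start, path + [target], seen | {target})
                if found:
                    return found
        return None

    for start in exports:
        loop = walk(start, [start], {start})
        if loop:
            return loop
    return None


# Hypothetical plans used only for illustration.
cascade = {"hq:SALES:ORDERS": ["dr:SALES:ORDERS"],
           "dr:SALES:ORDERS": ["rpt:SALES:ORDERS"]}
symmetrical = {"hq:SALES:ORDERS": ["dr:SALES:ORDERS"],
               "dr:SALES:ORDERS": ["hq:SALES:ORDERS"]}
looped = {"a:ACC:F": ["b:ACC:F"], "b:ACC:F": ["c:ACC:F"], "c:ACC:F": ["a:ACC:F"]}
```

The search enumerates simple paths, which is adequate for the handful of servers in a typical replication plan but would be slow for very large graphs.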

 

A subscriber may receive data from any number of publishers.

 

QM does not prevent other updates to the replicated files on the subscriber system. Although use as a standby system probably requires that updates are not allowed, some other uses of the replication mechanism can require the ability to apply local updates. It is the application designer's responsibility to ensure that appropriate rules are imposed.

 

Replication can be used with hashed files and directory files but, because QM is unaware of changes made to directory files from outside of QM, only updates made by QM applications will be replicated. The subscriber systems may import the data into any type of file. Also, changes made to directory files using sequential file processing (WRITESEQ, WRITEBLK, etc), OSWRITE or OSDELETE are not replicated. Some implications of this are that query processor CSV or delimited reports directed to a file and QMBasic compiler listing files are not replicated. As a general rule, replication should not be used to publish files from the QMSYS account.

 

Note that replication of a file that uses case insensitive record ids into a file that uses case sensitive record ids is unlikely to work correctly because the update applied at the subscriber will have the casing of the record id as used in the update on the publisher system. Thus updates to a record named "ABC" and a record named "abc" on the publisher in a file with case insensitive ids will reference the same record, but these updates replicated to a subscriber file with case sensitive ids will access different records. This is no different from the similar problem that can occur when copying records between case insensitive files and case sensitive files on the one system.
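
The effect of mixing the two id styles can be seen in a small model. The Python sketch below is illustrative only. It represents a file as a dictionary and assumes that a case insensitive file folds record ids to a single case, which is a modelling choice and not a statement about how QM stores ids internally. The function name is hypothetical.

```python
def apply_updates(updates, case_sensitive_ids):
    """Apply (record_id, data) writes in order and return the resulting file."""
    records = {}
    for record_id, data in updates:
        # Modelling assumption: a case insensitive file folds ids to one case.
        key = record_id if case_sensitive_ids else record_id.upper()
        records[key] = data
    return records


# The same two writes, as made on the publisher and as replayed on the subscriber.
updates = [("ABC", "first write"), ("abc", "second write")]
publisher_file = apply_updates(updates, case_sensitive_ids=False)
subscriber_file = apply_updates(updates, case_sensitive_ids=True)
```

In this model the publisher ends with one record holding the second write, while the subscriber ends with two separate records.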

 

The replication export is not instantaneous. The subscribers poll for updates at an interval that is configurable in the range 1 to 300 seconds, defaulting to five seconds. At this default rate, the subscriber system will be on average about three seconds behind the publisher though peak loads may increase this time due to network delays. For a true hot standby where the application does not continue until the update has been committed on the standby system, use QM triggers. The network delays associated with this approach will have a substantial effect on application performance.
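
The "about three seconds" figure is consistent with simple arithmetic on the polling interval. As a rough sketch, assuming updates arrive at evenly spread moments between polls and that transfer and apply time are negligible, the average wait is half the interval. The function name below is illustrative and is not part of QM.

```python
def average_lag_seconds(poll_interval: float) -> float:
    """Average wait before an update is picked up by the next poll.

    Assumes updates arrive evenly between polls and ignores network and
    apply time, so real lag will be at least this on average.
    """
    # The text gives 1 to 300 seconds as the configurable range.
    if not 1 <= poll_interval <= 300:
        raise ValueError("poll interval must be between 1 and 300 seconds")
    return poll_interval / 2
```

For the default interval of five seconds this gives 2.5 seconds, which rounds to the figure quoted above.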

 

A summary report of the replication system status can be produced by use of the RPL.STATUS command. This is normally only present in the VOC of the QMSYS account but can be copied to other accounts if needed.

 

 

Setting up the Publisher

 

The publisher side of replication can only be enabled if it is included in your QM licence.

 

There are four configuration parameters related to replication on the publisher system.

 

The mandatory REPLDIR parameter specifies the directory where QM will write the log files that contain data waiting for transfer to the subscribers. This directory will be created automatically when QM is started if it does not exist. There will be a separate subdirectory under this location for each subscriber.

 

The REPLMAX parameter specifies the maximum number of simultaneous replication targets that are allowed. Publishing a single file to a large number of targets may have a significant performance impact.

 

The REPLPORT parameter specifies the port on which QM will listen for incoming connections from subscriber systems. If omitted, this defaults to port 4244, which is also the default used by the subscriber.

 

The REPLSIZE parameter specifies the approximate maximum size for a replication log file in Mb before a new log file is started. The default value is 10 which should be appropriate for most systems.

 

A file is published by use of the PUBLISH command by a user with administrator rights:

PUBLISH {DICT} filename {DICT} server:account:filename

where the first filename references the file to be published and the server:account:filename references the file to which updates are to be exported. Multiple targets may be specified in a single PUBLISH command. The subscriber system identified by server must have been defined with SET.SERVER because the command will check for the existence of the target file. It is not, however, necessary to enable extended filename syntaxes (see the FILERULE configuration parameter).

 

The PUBLISH.ACCOUNT command can be used to simplify publishing a large number of files to the same target server/account.

PUBLISH.ACCOUNT server:account

This command will show a list of files in the current account that are not already published to the specified target. The user can then select files from this list that are to be published.

 

From the point when the PUBLISH command is executed, all updates to the published files will be logged in the directory specified by the REPLDIR parameter.

 
See Replication Example for a step by step example of setting up replication.

 

 

Setting up the Subscriber

 

The subscriber side of replication is not separately licensed.

 

When using replication to create a standby system, it is essential that the target files correctly match the corresponding files on the publisher system before enabling subscription. The easiest way to do this is to copy the file at the operating system level while not in use and before setting it up for publishing. Alternatively, copy the file using the QM COPY command after setting it up for publishing. In the latter case, updates to the publisher system that occur during or after the copy will be applied when subscription is enabled.

 

The REPLSRVR configuration parameter must be set to specify the name of the subscriber server. This must match the server name used in the PUBLISH commands but does not need to be the same as the network name of the subscriber system.

 

The NUMFILES configuration parameter must have a value that allows all subscribed files on the system to be open simultaneously.

 

The SUBSCRIBE command is used by a user with administrator rights to apply updates from the publisher to the subscriber. This command starts a phantom process that manages the actual data transfer and is probably best initiated from a STARTUP configuration parameter.

SUBSCRIBE server{:port} username password {options}

The server element of this command may be the network name or IP address of the publisher. The port number defaults to 4244 if omitted. The user name and password must be valid for connection to the publisher system. The password should preferably be encrypted using the AUTHKEY command for security and specified with a prefix of ENCR: in the command. A server process running as this user will be established on the publisher.

 

See the SUBSCRIBE command for details of the options. See Replication Example for a step by step example of setting up replication.

 

 

Important Note: Because information about the replication targets is stored in the file, copying the directory that represents a QM file that uses replication will result in the new file also replicating to the same targets.

 

 

Transactions

 

Updates that were part of a transaction on the publisher will be applied to the subscriber as a group but not strictly as a transaction. There should be no impact on the replication process or data integrity.

 

 

Encryption

 

Data traffic between the publisher and the subscriber is encrypted though there is an option to suppress this, perhaps when the data path is entirely within a secure network.

 

If files that use record or field level data encryption are published, the data is exported in encrypted form and imported at the subscriber without change. The implication of this is that the target files on the subscriber system must use the same encryption level and key strings as the source files. This is a logical step if replication is not to weaken system security. Also, the user name of the subscriber phantom must have full access to all relevant encryption keys. Encryption and replication cannot be used together on a directory file.

 

 

Alternate Key Indices

 

Because the SUBSCRIBE command writes data in the same way as any other QM process, indices on the subscriber system will be maintained automatically. The indices defined on the subscriber can be different from those on the publisher.

 

 

Triggers

 

Trigger functions are not applied on updates at the subscriber. Where triggers are used for data validation on the publisher, the data does not require revalidation on the subscriber. Where triggers are used to trigger other events on the publisher, any updates from these events to published files will also be replicated.

 

 

Permissions

 

All QM users who may update exported files must have write access to the directory specified by the REPLDIR parameter on the publisher system. The replication system will set the access rights automatically as each new log file is created.

 

The user name under which the subscription server process runs on the publisher must have read access to all files that are to be published. Because encrypted data is transferred in encrypted form, this user does not need to have access to encryption keys.

 

The user name under which the SUBSCRIBE command runs on the subscriber system must have full access to all subscribed files. When replicating to directory files on the subscriber on a Linux system, new data records will be created with the default umask setting for the system (not one set by the Linux profile script as this does not run for phantom processes). It may be useful to add a UMASK command to the LOGIN paragraph of the QMSYS account.

 

 
Cancelling Replication

 

Replication of a single file can be cancelled by using the CANCEL keyword of the PUBLISH command. Replication of an entire account can be cancelled using the CANCEL keyword with PUBLISH.ACCOUNT.

 

 

Copying Replicated Files

 

The details of the replication targets for a published file are stored within the file. If a published file is copied using operating system tools or restored from a backup to a new location, it will usually be necessary to cancel replication of the new file.

 

 

Failure Situations

 

Where replication is used to provide resilience, it is important to understand the failure situations for which it provides protection and how to recover from them.

 

1. Failure of the publisher

 

If the publisher system fails, the subscriber system can take over. Updates that are currently logged on the publisher but have not been exported to the subscriber will not be present.

 

The recovery procedure depends on how the subscriber system has been used during the failure period.

 

If no changes have been made to replicated files on the subscriber system, no action is necessary. The subscriber system will periodically attempt to re-establish connection with the publisher and, when this is successful, the pending updates will be applied.

 

If users have logged into the subscriber as a standby system and made updates to replicated files, recovery requires a little work.

 

a) Terminate the subscriber process before restarting the publisher so that it does not attempt to reconnect.

 

b) Consider whether any updates in the replication log files on the publisher are to be allowed to be applied, possibly overwriting changes made on the subscriber, or whether they should be discarded. If updates are to be discarded, log in on the publisher when it is restarted, shut down any QM processes that may be accessing published files (e.g. phantoms created using the STARTUP configuration parameter), and delete the content of the relevant log file directory. This is a subdirectory of the same name as the subscriber system under the directory identified by the REPLDIR configuration parameter. Delete only the files, not the directory.

 

c) With no QM users logged in on the subscriber system, copy the relevant files back to the publisher using the QM COPY command, not an operating system level copy.

 

d) Allow users back into QM on the publisher.

 

e) Restart the subscriber process.

 

 

2. Failure of the subscriber

 

If the subscriber system fails, the publisher will continue to run with updates queued in the replication log files. When the subscriber comes back online, these updates will be applied automatically.

 

Updates are queued in log files that are limited to approximately 10Mb, switching to a new log file when this limit is reached. Log files are discarded when they have been completely processed. If the subscriber fails (or the subscription process is not running), the publisher system will generate a message in the error log file if the backlog of updates has three or more files (approximately 30Mb). This message will be repeated at hourly intervals until the situation is resolved.
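
The backlog arithmetic above can be written out as a small sketch. The Python below is illustrative only: the function name and the returned field names are hypothetical, and it simply restates the figures in the text (the default log size of 10Mb and the three file threshold). It does not read QM's log directory or error log.

```python
def backlog_status(pending_log_files: int, log_size_mb: int = 10) -> dict:
    """Estimate the queued backlog and whether the error log message applies.

    log_size_mb stands in for the REPLSIZE setting. The size is an upper
    estimate because the newest log file is usually only partly filled.
    """
    if pending_log_files < 0:
        raise ValueError("number of pending log files cannot be negative")
    return {
        "approx_backlog_mb": pending_log_files * log_size_mb,
        # The text states that a message is logged at three or more files.
        "warning_logged": pending_log_files >= 3,
    }
```

With the default size, three pending files correspond to the figure of approximately 30Mb quoted above.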

 

 

3. Failure of the network

 

Network failure is essentially no different from subscriber failure except that the data is still available for reporting purposes. Updates will restart automatically shortly after the network connection is restored.

 

 

Symmetrical Replication

 

Where replication is used in a simple one publisher / one subscriber configuration, symmetrical replication can significantly simplify the failure and recovery process.

 

The two systems are configured to publish files to each other. Updates made on either system will be replicated on the other but the replication process will not copy an update from the subscriber back to the publisher from which it came, thus avoiding an endless cycle of updates between the two systems.

 

If the primary publisher system fails, users log in to the subscriber standby system and continue work. Updates that they make will be logged for replication back to the primary system and will be applied when it comes back online. The same considerations apply as described above regarding updates that were logged on the primary system but did not get sent to the standby system immediately prior to the failure.

 

On return of the primary system, users can be moved back at a convenient moment. No data files need to be copied manually.

 

Whilst symmetrical replication provides a simple recovery strategy, it is very important to ensure that updates are not made to the same records on both systems at the same time as there is no locking between the two systems to control the sequence in which updates would get applied.
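
The risk can be illustrated with a small model of one possible interleaving. The Python sketch below is not QM code. It represents each system's file as a dictionary and assumes that both systems write the same record before either has polled the other. The function name and record id are hypothetical.

```python
def concurrent_updates(record_id, value_on_a, value_on_b):
    """Model both systems writing the same record before either has polled."""
    system_a = {record_id: value_on_a}  # local write on system A
    system_b = {record_id: value_on_b}  # local write on system B at the same time

    # Each system then applies the update logged by the other. An applied
    # update is not sent back to where it came from, so nothing reconciles
    # the two copies afterwards.
    system_a[record_id] = value_on_b
    system_b[record_id] = value_on_a
    return system_a, system_b


a_file, b_file = concurrent_updates("CUST1", "written on A", "written on B")
```

In this interleaving each system ends up holding the other's value, so the two copies differ and remain different until the record is written again.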

 

 
Disabling Publishing

 

The DISABLE.PUBLISHING command can be used to suspend recording of updates to published files in the replication log area. It is intended for use when synchronising the publisher and subscriber systems. Updates applied to published files while publishing is disabled will not be replicated. Account replication (see below) is also disabled by this command. Publishing can be resumed using the ENABLE.PUBLISHING command and resumes automatically if QM is restarted.

 

 

Moving the Replication Directory

 

If it is necessary to move the directory that stores replication log files to a new location, perhaps for load balancing across disk drives, this can be accomplished simply by disabling publishing, modifying the REPLDIR configuration parameter, moving the directory structure, and re-enabling publishing.

 

 

Identifying Replicated Files

 

The files for which replication is enabled can be shown using the LIST.REPLICATED command. The LISTF, LISTFL and LISTFR commands also report files with replication but do not show dictionaries.

 

 

Account Replication

 

When using replication to maintain a standby system, it may be useful for the action of some commands executed on the publisher to be replicated on one or more subscriber systems. This capability is provided by an extension to the replication system that is largely independent of the mechanisms discussed above.

 

To activate account replication, create an X-type VOC item named $REPLICATE in the relevant account(s) on the publisher system. Field 2 of this record should hold the name of the account in which command actions are to be replicated in the form

servername:account

where servername is an item previously set up with the CREATE.SERVER command and account is an account on this server. Multiple targets may be specified separated by value marks. The QMCLIENT configuration parameter must be set to zero (its default value) on the remote system. Beware that this may weaken system security.

 

Fields 3 onwards contain a list of the commands that are to be replicated on the remote system(s) from the list below.

ADD.DF              Add an element to a distributed file
BUILD.INDEX         Build an alternate key index
CATALOGUE           Add program to the system catalogue. The American spelling may be used.
CREATE.FILE         Create a file
CREATE.INDEX        Create an alternate key index
DELETE.CATALOGUE    Delete a program from the system catalogue. The American spelling may be used.
DELETE.FILE         Delete a file
DELETE.INDEX        Delete an alternate key index
GRANT.KEY           Grant access to an encryption key
MAKE.INDEX          Create and build an alternate key index
REMOVE.DF           Remove an element of a distributed file
REVOKE.KEY          Remove access to an encryption key
SET.FILE            Set a Q-pointer to a remote file

 

With the exception of the two alternative spellings referenced above and the interchangeability of hyphens and periods in the names, the account replication system replicates only the commands as listed in the $REPLICATE record. Use of a command that is a user added synonym for one of the above list will not be replicated unless the synonym is also added to the $REPLICATE record. In this case, the synonym must also be present on the remote system.

 

Because synonyms not in the above list are not replicated, it is possible to create synonyms for replicated commands that will themselves not be replicated. For example, copying the CREATE.FILE VOC record to CREATE.LOCAL.FILE would allow a choice between CREATE.FILE for file creation that is to be replicated and CREATE.LOCAL.FILE to be used for file creation that is not to be replicated.

 

Any command name in the $REPLICATE record may be followed by a list of space separated mode options. Currently, the only option supported is QUERY, which causes the user to be prompted for confirmation that the command is to be replicated. Beware that this option would cause the command to fail in a phantom process and may cause other problems in a QMClient session.
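
The layout of the $REPLICATE record and the name matching rules described above can be modelled as follows. This Python sketch is illustrative and is not how QM implements account replication. It assumes the conventional multivalue delimiters (character 254 as the field mark and character 253 as the value mark), a type code of X alone in field 1, and exact comparison of names after normalisation. The function names, server name and account name are hypothetical.

```python
FM = chr(254)  # field mark (conventional multivalue delimiter, assumed here)
VM = chr(253)  # value mark

# The two American spellings that the text says are accepted.
SPELLINGS = {"CATALOG": "CATALOGUE", "DELETE.CATALOG": "DELETE.CATALOGUE"}


def build_replicate_record(targets, commands):
    """Field 1: type code X. Field 2: targets. Fields 3 onwards: commands."""
    return FM.join(["X", VM.join(targets)] + list(commands))


def normalise(name):
    """Treat hyphens as periods and map the American spellings."""
    name = name.replace("-", ".")
    return SPELLINGS.get(name, name)


def is_replicated(command, record):
    """True if the command name appears in fields 3 onwards of the record."""
    entries = record.split(FM)[2:]
    # Each entry is a command name optionally followed by mode options.
    listed = {normalise(entry.split()[0]) for entry in entries if entry.strip()}
    return normalise(command) in listed


# Hypothetical record: one target, CREATE.FILE with the QUERY option.
record = build_replicate_record(
    ["standby:SALES"], ["CREATE.FILE QUERY", "CATALOGUE", "AUTO.PUBLISH"]
)
```

In this model CREATE-FILE and CATALOG both match entries in the record, while a local synonym such as CREATE.LOCAL.FILE does not because it is not listed.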

 

Replication of the CATALOGUE command also updates the object code in the appropriate file on the target system.

 

In addition, presence of

AUTO.PUBLISH

in the command list will cause files created using a replicated CREATE.FILE command to be published automatically to a file of the same name in the remote account(s).

 

Changes to the $REPLICATE record take effect only on next entry to the account, either as a result of a new login or by use of LOGTO.