5. Record and Replay Messages with the AMPS Transaction Log

AMPS can record the messages published to a topic and replay those messages at a later time. This capability is provided by the transaction log.

The AMPS transaction log fully supports topic and content filtering. You configure the transaction log to keep a journal of incoming messages for one or more topics, and then you can replay those messages, in order, from any point in time. With the (optional) high-availability features in the AMPS client libraries, this also provides a way to ensure that in case of failure, an application can resume the subscription without missing messages or receiving duplicate messages.

The AMPS transaction log is most often used for:

  • Fully Resumable Subscriptions - With the transaction log, you can ensure that an application receives all messages of interest, even in the event of a failure.
  • Backtesting and Audit - The transaction log allows you to replay the precise messages published, in order across all topics in the instance, at a configurable maximum rate. You can use this feature to easily audit the flow of messages, perform backtesting, or replay a sequence of events.
  • Capacity Planning and Stress Testing - Because the transaction log lets you set the maximum replay rate to a multiple of the original publish rate, you can replay recorded traffic at various rates to measure the load on a system, the capacity of the system, and the ability of your application to correctly handle increased volumes.

The transaction log is also the source of messages for AMPS replication and AMPS message queues.

AMPS can typically record messages at the maximum throughput of the underlying storage device. 60East recommends storing the transaction log on a device that supports fast sequential writes, and ensuring that the device has the speed and capacity necessary to support the expected throughput. (The operations chapter of the User Guide includes guidance on capacity planning.)

How Does the Transaction Log Work?

The AMPS transaction log records messages that are published to the topics specified in the configuration file. Every publish is stored, in the order in which the AMPS instance processed the message.

For ease of maintenance, the AMPS server writes multiple sequential files, called journal files, for the transaction log rather than writing a single large file. The journals contain the full content of each message, as well as information on the topic, the publisher, the time at which the message was processed, and so on. The AMPS configuration sets the maximum size of a journal file. When a file reaches that size, AMPS begins writing to the next file.
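
The rollover behavior described above can be sketched with a small in-memory model. This is a conceptual illustration only, not how the AMPS server is implemented: the class names (JournalFile, ToyTransactionLog), the byte-counting rule, and the 100-byte limit are all hypothetical and chosen to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class JournalFile:
    """One journal file in the toy model: an append-only list of records."""
    index: int
    records: list = field(default_factory=list)
    size: int = 0  # bytes of payload written so far


class ToyTransactionLog:
    """Conceptual model only: appends records and rolls over by size."""

    def __init__(self, max_journal_bytes):
        self.max_journal_bytes = max_journal_bytes
        self.journals = [JournalFile(index=0)]

    def append(self, topic, payload):
        current = self.journals[-1]
        # Once the current file has reached the limit, start the next file.
        if current.size >= self.max_journal_bytes:
            current = JournalFile(index=current.index + 1)
            self.journals.append(current)
        current.records.append((topic, payload))
        current.size += len(payload)


log = ToyTransactionLog(max_journal_bytes=100)
for i in range(10):
    # Ten 40-byte messages, each tagged with its publish order.
    log.append("some-topic", bytes([i]) * 40)

journal_sizes = [j.size for j in log.journals]
print(journal_sizes)  # [120, 120, 120, 40]
```

Reading the journal files in index order returns the messages in the order they were appended, which is the property that makes in-order replay possible.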

When AMPS records a message into the transaction log, it assigns the message a bookmark. The bookmark identifies a single message and, therefore, a specific point in the transaction log of the local instance.
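
To illustrate how a bookmark lets a subscriber resume without gaps or duplicates, here is a minimal sketch using a toy append-only log. It assumes a simple integer counter as the bookmark, which is purely hypothetical: real AMPS bookmarks are assigned by the server and should be treated as opaque values. The class and method names (ToyBookmarkLog, publish, replay_after) are also invented for this example.

```python
class ToyBookmarkLog:
    """Conceptual model: an append-only log whose positions act as bookmarks."""

    def __init__(self):
        self._entries = []  # entries are never modified after they are appended

    def publish(self, topic, data):
        # Hypothetical bookmark format: a counter that increases with each message.
        bookmark = len(self._entries) + 1
        self._entries.append((bookmark, topic, data))
        return bookmark

    def replay_after(self, bookmark, topic=None):
        """Yield entries recorded after `bookmark`, in order, optionally for one topic."""
        for entry_bookmark, entry_topic, data in self._entries:
            if entry_bookmark > bookmark and (topic is None or entry_topic == topic):
                yield entry_bookmark, entry_topic, data


log = ToyBookmarkLog()
for n in range(1, 6):
    log.publish("orders", {"id": n})

# A subscriber processes the first two messages, then fails.
last_processed = 0
for bookmark, topic, data in log.replay_after(0):
    last_processed = bookmark
    if bookmark == 2:
        break

# On restart, it resumes after the last bookmark it processed.
resumed = [data["id"] for _, _, data in log.replay_after(last_processed)]
print(resumed)  # [3, 4, 5]
```

The subscriber only needs to remember the last bookmark it fully processed. Everything after that point is replayed in order, and nothing before it is delivered again.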

The AMPS server does not modify the contents of journal files. Once a message is written to a journal file, it is part of the transaction log and is considered to be immutable.

Since journal files form part of the persistent state of the server, those files should not be modified or removed while the AMPS process is running except by the AMPS process itself. The AMPS server provides a set of maintenance actions for managing journal files (see the AMPS User Guide for details).

Configuration

To create a transaction log, you add the TransactionLog configuration element to your AMPS configuration file. You then specify a location where AMPS creates journal files, and specify the topics that you want recorded.

The following configuration is the minimum configuration to create a transaction log and record a single topic:

<TransactionLog>
    <JournalDirectory>./journals</JournalDirectory>
    <Topic>
       <Name>some-topic</Name>
       <MessageType>json</MessageType>
    </Topic>
</TransactionLog>

The configuration above writes journal files to the journals directory underneath the AMPS server’s current working directory. The configuration records a single topic, some-topic, of message type json to the transaction log. The Name option of the Topic configuration element can be either a literal topic name, or a regular expression that matches the names of a set of topics to be recorded.

Although this configuration works perfectly well, AMPS provides a number of additional options that are useful for managing transaction logs in production. AMPS also provides a set of administrative actions for setting the archival and retention policy for journal files.

A more complete configuration might include options along the following lines:

<TransactionLog>
    <JournalDirectory>./journals</JournalDirectory>
    <JournalArchiveDirectory>/mnt/high-capacity/journals</JournalArchiveDirectory>
    <JournalSize>100MB</JournalSize>
    <Topic>
       <Name>^/orders</Name>
       <MessageType>json</MessageType>
    </Topic>
    <Topic>
       <Name>^/status/customer</Name>
       <MessageType>fix</MessageType>
    </Topic>
    <Topic>
       <Name>/audit/events</Name>
       <MessageType>binary</MessageType>
    </Topic>
</TransactionLog>

<Actions>
   <Action>
     <On>
         <Module>amps-action-on-schedule</Module>
         <Options>
           <Every>21:30</Every>
           <Name>Daily Journal Maintenance Plan</Name>
         </Options>
     </On>
     <Do>
         <Module>amps-action-do-archive-journals</Module>
         <Options>
           <Age>3d</Age>
         </Options>
     </Do>
     <Do>
         <Module>amps-action-do-remove-journals</Module>
         <Options>
           <Age>7d</Age>
         </Options>
     </Do>
   </Action>
</Actions>

In this configuration, journals are created in the journals directory underneath the AMPS server’s current working directory, as before. This configuration records two sets of topics and one individual topic. Taking these in the order in which they appear in the configuration file, this instance of AMPS will record:

  • Any topic that begins with /orders and is of message type JSON.
  • Any topic that begins with /status/customer and is of message type FIX.
  • The topic /audit/events of message type binary.
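
As a rough way to check which topics a set of Name values would capture, the sketch below applies the three Name values from the configuration above to a few sample topic names. It rests on two assumptions that should be verified against the AMPS documentation: that a Name containing regular expression characters is matched anywhere in the topic unless anchored, and that a Name without such characters is compared as a literal. It also uses Python's re module, which may differ from the AMPS regular expression engine for more complex patterns. The helper names are hypothetical.

```python
import re

# Characters that, for this sketch, mark a Name as a regular expression (assumption).
REGEX_CHARS = set("^$.*+?[](){}|\\")

# Name and MessageType pairs taken from the configuration above.
RECORDED_TOPICS = [
    ("^/orders", "json"),
    ("^/status/customer", "fix"),
    ("/audit/events", "binary"),
]


def name_matches(name, topic):
    """Match as a regex if the Name has regex characters, else as a literal."""
    if REGEX_CHARS.intersection(name):
        return re.search(name, topic) is not None
    return name == topic


def is_recorded(topic, message_type):
    """Return True if any configured Topic entry covers this topic and type."""
    return any(
        configured_type == message_type and name_matches(name, topic)
        for name, configured_type in RECORDED_TOPICS
    )


print(is_recorded("/orders/new", "json"))          # True
print(is_recorded("/orders/new", "fix"))           # False
print(is_recorded("/archive/orders", "json"))      # False
print(is_recorded("/status/customer/42", "fix"))   # True
print(is_recorded("/audit/events", "binary"))      # True
```

The second call shows that the message type is part of the match: a topic that begins with /orders is recorded only when it is published as JSON.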

The sample above also includes a basic journal maintenance configuration. Configuring journal maintenance is strongly recommended for any instance of AMPS that will be running on a regular basis.

For this AMPS installation, the size of the journal files has been reduced from the default of 1GB to 100MB. This typically indicates that the instance stores less than 1GB of messages during a day, so a journal file of the default size would contain multiple days' worth of messages.
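
The arithmetic behind this choice is simple enough to sketch. The daily volume of 400MB below is a hypothetical figure, and 1GB is treated as 1000MB to keep the numbers round.

```python
def days_per_journal(journal_size_mb, daily_volume_mb):
    """Approximate number of days of messages that fit in one journal file."""
    return journal_size_mb / daily_volume_mb


daily_volume_mb = 400  # hypothetical daily message volume

default_span = days_per_journal(1000, daily_volume_mb)
reduced_span = days_per_journal(100, daily_volume_mb)

print(default_span)  # 2.5
print(reduced_span)  # 0.25
```

With these assumed numbers, a journal file of the default size spans about two and a half days of messages, while a 100MB journal file spans about a quarter of a day.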

This configuration specifies a two-step maintenance process:

  • After 3 days, journal files will be archived to the /mnt/high-capacity/journals directory (that is, the directory specified in the JournalArchiveDirectory for the transaction log).
  • After 7 days, journal files will be deleted.

AMPS will run this maintenance plan every day at 21:30 (9:30 PM) local time.
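
The effect of the two-step policy can be sketched as a simple classification by age. This is only an illustration of the policy, not of the AMPS maintenance actions themselves: the function name, the journal names, the use of the last write time as the basis for a journal's age, and the inclusive thresholds are all assumptions.

```python
from datetime import datetime, timedelta

ARCHIVE_AGE = timedelta(days=3)  # matches <Age>3d</Age> on the archive action
REMOVE_AGE = timedelta(days=7)   # matches <Age>7d</Age> on the remove action


def plan_maintenance(journals, now):
    """Map each journal name to 'keep', 'archive', or 'remove' based on its age.

    `journals` maps a journal name to the time it was last written
    (an assumed basis for age in this sketch).
    """
    plan = {}
    for name, last_written in journals.items():
        age = now - last_written
        if age >= REMOVE_AGE:
            plan[name] = "remove"
        elif age >= ARCHIVE_AGE:
            plan[name] = "archive"
        else:
            plan[name] = "keep"
    return plan


now = datetime(2024, 1, 10, 21, 30)  # a hypothetical 21:30 maintenance run
journals = {
    "journal-1": now - timedelta(days=8),
    "journal-2": now - timedelta(days=4),
    "journal-3": now - timedelta(hours=6),
}

plan = plan_maintenance(journals, now)
print(plan)  # {'journal-1': 'remove', 'journal-2': 'archive', 'journal-3': 'keep'}
```

Under this policy, a journal file spends roughly its first three days in the JournalDirectory, the next four in the archive directory, and is then deleted.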

When journal files are moved to the archive directory, they continue to be part of the transaction log, but they do not have to be on the same device as the JournalDirectory. Most often, a production installation of AMPS will keep journal files that are very active on fast storage, and keep a longer period of history on storage that is higher capacity and lower cost. Since higher-capacity devices typically also have lower throughput, they are best suited to files that must still be retained but are infrequently used.

Full details on these options are available in the AMPS User Guide and the AMPS Configuration Reference.

Further Reading

See Transactional Messaging and Bookmark Subscriptions in the AMPS User Guide for a more complete discussion of the transaction log and message replay.

The AMPS client libraries include samples for publishing messages and replaying messages from the transaction log. See the client library distribution for those samples. (Note that some clients are distributed as pre-built binaries. For those clients, the full distribution contains the samples.)