4. State of the World (SOW)

One of the core features of AMPS is the ability to persist the most recent update for each message matching a topic. The State of the World can be thought of as a database where messages published to AMPS are filtered into topics, and where the topics store the latest update to a message. Since AMPS subscriptions are based on the combination of topics and filters, the State of the World (SOW) gives subscribers the ability to quickly resolve any differences between their data and updated data in the SOW by querying the current state of a topic, or a set of messages inside a topic.

How Does the State of the World Work?

Much like tables in a relational database, topics in the AMPS State of the World persist the most recent update for each message. AMPS identifies a message by using a unique key for the message. The SOW key for a given message is similar to the primary key in a relational database: each value of the key identifies a unique message. The first time a message is received with a particular SOW key, AMPS adds the message to the SOW. Subsequent messages with the same SOW key value update the message.

There are several ways to create a SOW key for a message:

  • Most applications specify that AMPS assigns a SOW key based on the content of the message. The fields to use for the key are specified in the SOW topic definition, and consist of one or more XPath expressions. AMPS finds the specified fields in the message and computes a SOW key based on the name of the topic and the values in these fields. 60East recommends this approach unless an application has a specific need for a different approach.
  • A topic can also be configured to require that a publisher provide a SOW key for each message when publishing the message to AMPS.
  • AMPS also supports the ability for custom SOW key generation logic to be defined in an AMPS module, which will be invoked to generate the SOW key for each message. While these SOW keys are generated automatically by AMPS, rather than being provided by the publisher, the logic to generate these keys is provided by the module, and the configuration required (if any) is determined by the module.
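The first approach can be pictured with a small sketch. This is a toy model rather than the algorithm AMPS uses: the hash function (SHA-256 truncated to 64 bits), the `compute_sow_key` and `lookup` names, and the simplified path lookup are all assumptions made for illustration. The only property taken from the text above is that the key depends on the topic name and on the values of the configured key fields.

```python
import hashlib

def lookup(message, path):
    """Resolve a simple /a/b style path against a nested dict (toy XPath subset)."""
    value = message
    for part in path.strip("/").split("/"):
        value = value[part]
    return value

def compute_sow_key(topic, key_paths, message):
    """Hypothetical key derivation: hash the topic name plus each key field value."""
    parts = [topic] + [f"{path}={lookup(message, path)}" for path in key_paths]
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8")).digest()
    # Truncated to 64 bits for readability; the real key format is not modeled.
    return int.from_bytes(digest[:8], "big")

# Same topic and same /orderId value: same key, even though the price differs.
first = compute_sow_key("ORDERS", ["/orderId"], {"orderId": 2, "price": 120})
update = compute_sow_key("ORDERS", ["/orderId"], {"orderId": 2, "price": 95})
# A different /orderId value, or a different topic, gives a different key.
other = compute_sow_key("ORDERS", ["/orderId"], {"orderId": 1, "price": 100})
other_topic = compute_sow_key("TRADES", ["/orderId"], {"orderId": 2, "price": 120})
```

Fields that are not part of the key definition, such as the price here, have no effect on the computed key.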

The following diagrams demonstrate how the SOW works, using a SOW topic that is configured to have AMPS determine the SOW key based on the /orderId field within the message. As each message comes in, AMPS uses the contents of the /orderId field to generate a SOW key for the message. The SOW key is used to identify unique records in the SOW, so AMPS will store a distinct record for each distinct /orderId value published to this topic. The calculated SOW key will be returned in the SowKey header of messages received from the topic in the SOW.

Figure 4.1: A SOW topic named ORDERS with a key definition of /orderId

In Figure 4.1, two messages are published. Neither message has a key that matches an existing record in the ORDERS topic, so both are inserted as new messages. Some time after these messages are processed, an update arrives for the order with an orderId of 2. This message changes the price from 120 to 95. Because the incoming message has an orderId of 2, it matches an existing record and overwrites the stored message for that SOW key, as seen in Figure 4.2. AMPS replaces the entire record with the contents of the update.

Figure 4.2: Updating the IBM record by matching incoming message keys

Although the SOW key is derived from the content of the message in many cases, the SOW key is distinct from the content of the message. Each record in a SOW topic has a distinct SOW key, which is stored with the record.
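The insert-then-update behavior described above can be modeled in a few lines. The sketch below is a hypothetical in-memory stand-in, not AMPS code: the `SowTopic` class, the use of a tuple of field values as the key, and the sample values for order 1 are assumptions. Only the price change from 120 to 95 for order 2 comes from the example in the text.

```python
class SowTopic:
    """Toy in-memory model of a SOW topic keyed on top-level message fields."""

    def __init__(self, name, key_fields):
        self.name = name
        self.key_fields = key_fields
        self.records = {}  # SOW key -> latest full message

    def publish(self, message):
        # Stand-in key: the tuple of key field values (AMPS computes its own key).
        sow_key = tuple(message[field] for field in self.key_fields)
        is_new = sow_key not in self.records
        # The stored record is replaced as a whole; fields are not merged.
        self.records[sow_key] = dict(message)
        return "insert" if is_new else "update"

orders = SowTopic("ORDERS", ["orderId"])
results = [
    orders.publish({"orderId": 1, "symbol": "MSFT", "price": 30}),   # made-up values
    orders.publish({"orderId": 2, "symbol": "IBM", "price": 120}),
    orders.publish({"orderId": 2, "symbol": "IBM", "price": 95}),    # same key: update
]
```

After the three publishes the topic holds two records, and the record for order 2 carries the price of 95. The key is kept alongside the record rather than inside the message body, which mirrors the distinction drawn above.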

By default, a topic recorded in the State of the World is persistent. For these topics, AMPS stores the contents of the state of the world for that topic in a dedicated, memory-mapped file. This means that the total state of the world does not need to fit into memory, and that the contents of the state of the world database are maintained across server restarts. You can also define a transient state of the world topic, which does not store the contents of the SOW to a persisted file.

The state of the world file is separate from the transaction log, and you do not need to configure a transaction log to use a SOW. When a transaction log is present that covers the SOW topic, on restart AMPS uses the transaction log to keep the SOW up to date. When the latest transaction in the SOW is more recent than the last transaction in the transaction log (for example, if the transaction log has been deleted), AMPS takes no action. If the transaction log has newer transactions than the SOW, AMPS replays those transactions into the SOW to bring the SOW file up to date. If the SOW file is missing or damaged, AMPS rebuilds the state of the world by replaying the transaction log from the beginning of the log.
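The three restart cases in the paragraph above can be summarized as a small decision procedure. This is a simplified sketch under stated assumptions: transactions are represented by increasing integer ids starting at 1, the function name `recover_sow` is hypothetical, and a missing or damaged SOW file is represented by `None`. It does not reflect how AMPS stores or compares transactions internally.

```python
def recover_sow(sow_records, sow_last_txn, log):
    """Toy model of restart recovery for a persistent SOW topic.

    sow_records:  dict of key -> message, or None if the file is missing or damaged
    sow_last_txn: id of the last transaction applied to the SOW
    log:          list of (txn_id, key, message) in ascending txn_id order
    """
    if sow_records is None:
        # Missing or damaged SOW: rebuild from the beginning of the log.
        records, last_txn = {}, 0
    else:
        records, last_txn = dict(sow_records), sow_last_txn
    for txn_id, key, message in log:
        if txn_id > last_txn:
            # The log is ahead of the SOW: replay the newer transaction.
            records[key] = message
            last_txn = txn_id
    # If the SOW was already ahead of the log, nothing was replayed.
    return records, last_txn

log = [(1, "a", {"v": 1}), (2, "b", {"v": 2}), (3, "a", {"v": 3})]
behind = recover_sow({"a": {"v": 1}}, 1, log)   # log has newer transactions
ahead = recover_sow({"a": {"v": 9}}, 5, log)    # SOW is newer than the log
rebuilt = recover_sow(None, 0, log)             # SOW file missing or damaged
```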

When the State of the World for a topic is transient, AMPS does not store the state of the world for this topic across restarts. In this case, by default, AMPS synchronizes the state of the world with the transaction log when the server starts. You can use the RecoveryPoint configuration option to specify that the topic should contain only new publishes, or should recover from a specific point in time. For example, you could use an environment variable to provide a timestamp to the RecoveryPoint so that AMPS recovers only the last day’s worth of messages.

Queries

At any point in time, applications can issue SOW queries to retrieve all of the messages that match a given topic and content filter. When a query is executed, AMPS will test each message in the SOW against the content filter specified and all messages matching the filter will be returned to the client. The topic can be a literal topic name or a regular expression pattern. For more information on issuing queries, please see SOW Queries in the AMPS User Guide.
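Conceptually, a query selects topics by name or pattern and then tests each stored message against the filter. The sketch below models that with plain Python and is not the AMPS client API: the `sow_query` function is hypothetical, a Python callable stands in for an AMPS content filter expression, and the pattern is applied with `re.fullmatch` for simplicity (how AMPS anchors topic patterns is not modeled here). The sample data is made up.

```python
import re

def sow_query(sow, topic_pattern, content_filter=lambda message: True):
    """Toy query: return stored messages whose topic matches and whose content passes."""
    pattern = re.compile(topic_pattern)
    return [
        message
        for topic, records in sow.items()
        if pattern.fullmatch(topic)
        for message in records.values()
        if content_filter(message)
    ]

# sow maps topic name -> {SOW key -> latest message}
sow = {
    "ORDERS": {1: {"orderId": 1, "price": 30}, 2: {"orderId": 2, "price": 95}},
    "ORDERS_ARCHIVE": {7: {"orderId": 7, "price": 200}},
}

# Literal topic name plus a content filter.
expensive = sow_query(sow, "ORDERS", lambda m: m["price"] > 50)
# Regular expression covering both topics, with no content filter.
everything = sow_query(sow, "ORDERS.*")
```

Because the SOW holds only the latest message per key, a query returns the current state of each matching record, not its history.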

Configuration

Topics where SOW persistence is desired are individually configured within the SOW section of the configuration file. Each topic will be defined with a Topic section enclosed within SOW. The AMPS Configuration Reference contains a description of the attributes that can be configured per topic. TopicMetaData is a synonym for SOW provided for compatibility with previous versions of AMPS. Likewise, TopicDefinition is a synonym for the Topic element of the SOW section, provided for compatibility with versions of AMPS prior to 5.0.

For the set of configuration options available in a SOW topic, see SOW/Topic in the AMPS Configuration Reference.

The listing in Example 4.1 shows how to use Topic to add SOW topics to the AMPS configuration. One topic named ORDERS is defined with the keys /invoice and /customerId and a MessageType of json. The persistence file for this topic will be saved in the sow/ORDERS.json.sow file. For messages published to the ORDERS topic, AMPS stores a distinct record, with its own SOW key, for each unique combination of the /invoice and /customerId fields. A second topic named ALERTS is also defined, with a MessageType of xml and a key of /alert/id. The SOW persistence file for ALERTS is saved in the sow/ALERTS.xml.sow file.

<SOW>
    <Topic>
        <Name>ORDERS</Name>
        <FileName>sow/%n.sow</FileName>
        <Key>/invoice</Key>
        <Key>/customerId</Key>
        <MessageType>json</MessageType>
        <SlabSize>1MB</SlabSize>
        <HashIndex>
            <Key>/region</Key>
        </HashIndex>
    </Topic>

    <Topic>
        <Name>ALERTS</Name>
        <FileName>sow/%n.sow</FileName>
        <Key>/alert/id</Key>
        <MessageType>xml</MessageType>
        <!-- Pregenerate an index for the /alert/type element. This is seldom necessary,
             since AMPS will generate the index when it is needed, but the directive is included here
             for example purposes. -->
        <Index>/alert/type</Index>
    </Topic>
</SOW>

Example 4.1: Sample SOW Configuration

tip

Topics are scoped by their message type.

For example, two topics named Orders can be created: one that supports a MessageType of json and another that supports a MessageType of xml.

Each MessageType defined for the Orders topic requires a Transport in the configuration file that accepts messages of that type. Otherwise, there is no way for a publisher to publish messages of that type to this instance, or for a subscriber to receive messages of that type from this instance.

This means that a publisher sending messages to the Orders topic must know the type of message it is sending (json or xml) and the port defined by the matching transport.
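One way to picture this scoping is a registry keyed by the pair of topic name and message type. The sketch below is a toy model with hypothetical names (`topics`, `define_topic`, `publish`) and made-up payloads. It ignores transports and ports entirely and shows only that the two Orders topics hold separate records.

```python
topics = {}  # (topic name, message type) -> {key -> message}

def define_topic(name, message_type):
    topics[(name, message_type)] = {}

def publish(name, message_type, key, message):
    # A publish must identify both the name and the message type to reach a topic.
    topics[(name, message_type)][key] = message

define_topic("Orders", "json")
define_topic("Orders", "xml")
publish("Orders", "json", 1, '{"orderId": 1}')
publish("Orders", "xml", 1, "<order><orderId>1</orderId></order>")
```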