25. Replicating Messages Between Instances

AMPS has the ability to replicate messages to downstream AMPS instances once those messages are stored to a transaction log. Replication in AMPS involves the configuration of two or more instances designed to share some or all of the published messages. With AMPS replication, an upstream (or source) instance delivers messages to a downstream instance.

Replication is typically used to improve the availability of a set of AMPS instances by creating a set of servers that hold the same messages, where each server can take over for the others in the event of a network or server failure.

Since AMPS replication operates on individual messages, replication is also an efficient way to split and share message streams between multiple sites where each downstream site may only want a subset of the messages from the upstream instances.

AMPS replication uses a leaderless, “all nodes hot” model. Any instance in a replication fabric can accept publishes and updates at any time, and there is no need for a message to be replicated downstream before it is delivered to subscribers (unless a subscriber explicitly requests otherwise using the fully_durable option on a bookmark replay).

Tip

The only communication between instances of AMPS is through replication. AMPS instances do not share state through the filesystem or any out-of-band communication.

AMPS high availability and replication do not rely on a quorum or a controller instance. Each instance of AMPS processes messages independently. Each instance of AMPS manages connections and subscriptions locally.

AMPS supports two forms of message acknowledgment for replication links: synchronous and asynchronous. These settings control when AMPS considers a message persisted: that is, when AMPS sends publishers persisted acknowledgments, and the point at which AMPS considers a message to be fully persisted for bookmark subscribers. These settings do not affect when or how messages are replicated, or when or how messages are delivered to subscribers (unless a subscriber explicitly requests otherwise). They do not affect the speed or priority of replication, nor the guarantee that all replicated messages must be acknowledged by the downstream instance before they can be removed from the transaction log.

AMPS replication consists of a message stream (or, more precisely, a command stream) provided to downstream instances. AMPS replicates the messages produced as a result of publish and delta_publish and replicates sow_delete commands. AMPS does not replicate messages produced internally by AMPS, such as the results of Views or updates sent to a ConflatedTopic. When replicating queues, AMPS also uses the replication connection to send and receive administrative commands related to queues, as described in the section on Replicated Queues.

Important

60East recommends that any server that participates in replication and that accepts publishes, SOW deletes, or queue acknowledgments directly from applications be configured with at least one sync replication destination.

Without at least one destination that uses sync acknowledgment, an AMPS instance could be a single point of failure, resulting in possible message loss if the instance has an unrecoverable failure (such as a hardware failure) between the time that the server acknowledges a command to a client, and the time that a replication destination receives and persists the command.

Notice that when replicating the results of a delta_publish command, or of a publish to a topic that provides preprocessing and enrichment, AMPS replicates the fully processed and merged message – exactly the information that is written to the local transaction log and sent to subscribers. AMPS does not save the original command to the transaction log or replicate the original command; instead, AMPS stores the message produced by the command and replicates that message.

Replication Basics

Before planning an AMPS replication topology, it can be helpful to understand the basics of how AMPS replication works. This section presents a general overview of the concepts that are discussed in more detail in the following sections.

  • Replication is point-to-point. Each replication connection involves exactly two AMPS instances: a source (which provides messages) and a destination (which receives messages).

  • Replication is always “push” replication. In AMPS, the source configures a destination and pushes messages to that destination. (Notice that it is possible to configure the source to wait for the destination to connect rather than actively making an outgoing connection, but replication is still a “push” from the source to the destination once that connection is made.) The source must be configured to push messages to the destination, and the source guarantees that all replicated messages are acknowledged by the destination before they can be removed from the transaction log.

  • Replication is one hop by default. By default, an instance of AMPS only replicates messages that are published directly to that instance by a client. Optionally, an instance of AMPS can be configured to also replicate messages that arrive over replication. This configuration is typically required when there are more than two instances in a replicated set of AMPS instances. See PassThrough Replication for details.

  • Replication relies on the transaction log. AMPS replicates the commands as preserved in the transaction log. This means, for example, that the results of delta publishes are replicated as fully-merged messages, since fully-merged messages are stored in the transaction log. Likewise, if duplicate messages arrive over different paths, only the first message to arrive will be stored in the transaction log, and that message is the one that will be replicated.

  • Replication provides a command stream. In AMPS replication, the server replicates the results of publish, delta_publish and sow_delete commands once those results are written to the transaction log. Each individual command is replicated, for low latency and fine-grained control of what is replicated. If a command is not in the transaction log (for example, a maintenance action has removed the journal that contains that command, or the command is for a topic that is not recorded in the transaction log), that command will not be replicated.

    Replication is intended to guarantee that the command stream for a set of topics on one instance is present on the other instance, with the ordering of each message source preserved. This means that there can be only one connection from a given upstream instance to a given downstream instance.

  • Replication is customizable by topic, message type, and content. AMPS can be configured to replicate the entire transaction log, or any subset of the transaction log. This makes it easy to use replication to populate view servers, test environments, or similar instances that require only partial views of the source data.

  • Replication guarantees delivery. AMPS will not remove a journal file until all messages in that journal file have been replicated to, and acknowledged by, the destination.

  • Replication is composable. AMPS is capable of building a sophisticated replication topology by composing connections. For example, full replication between two servers is two point-to-point connections, one in each direction. The basic point-to-point nature of connections makes it easy to reason about a single connection, and the composable nature of AMPS replication allows you to build replication networks that provide data distribution and high availability for applications across data centers and around the globe.

  • Replication acknowledgment is configurable. The acknowledgment mode provides different guarantees: async acknowledgment provides durability guarantees for the local instance, whereas sync acknowledgment provides durability guarantees for the local instance and the downstream instance.

  • The Group configuration item identifies a set of instances that are intended to be fully equivalent for the purposes of message contents, application failover, and AMPS replication failover. Instances that are not intended to be fully equivalent for all of these purposes should be given different Group names, even if they are in the same data center or geographic location, or would be treated as equivalent for some, but not all, purposes.

More details on each of these points are provided in the sections that follow.

Benefits of Replication

Replication can serve two purposes in AMPS:

  1. It can increase the fault tolerance of AMPS by creating a spare instance to cut over to when the primary instance fails.
  2. It can deliver messages to a remote site.

To provide fault tolerance and reliable remote-site message delivery, AMPS implements a number of guarantees and features, which are discussed below.

Replication in AMPS supports filtering by both topic and by message content. This granularity in filtering allows replication sources to have complete control over what messages are sent to their downstream replication instances.
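
For example, a Topic block in a replication Destination can include an optional Filter element so that only matching messages are replicated. In this sketch, the topic name and the filter expression are hypothetical:

<Topic>
    <MessageType>json</MessageType>
    <Name>^/orders/</Name>

    <!-- Hypothetical content filter: replicate to this
         destination only messages whose /region field
         is EMEA. -->
    <Filter>/region = 'EMEA'</Filter>
</Topic>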

Additionally, replication can improve the availability of AMPS by creating a redundant instance of an AMPS server. Using replication, all of the messages that flow into a primary instance of AMPS can be replicated to a secondary spare instance. This way, if the primary instance becomes unresponsive for any reason, the secondary AMPS instance can be swapped in to begin processing message streams and requests.

Tip

When an AMPS instance is a replication source, that instance guarantees that messages will not be removed from the transaction log until all destinations have acknowledged the message.

Configuring Replication

Replication configuration involves the configuration of two or more instances of AMPS. For testing purposes, both instances of AMPS can reside on the same physical host before deployment into a production environment. When both instances run on one machine, the performance characteristics will differ from production, so this arrangement is more useful for testing configuration correctness than for testing overall performance.

Any instance that is intended to receive messages via replication must define an incoming replication transport as one of the Transports for the instance. An instance may have only one incoming replication transport.

Any instance that is intended to replicate messages to another instance must specify a Replication stanza in the configuration file with at least one Destination. An instance can have multiple Destination declarations: each one defines a single outgoing replication connection.

In AMPS replication, instances should only be configured as part of the same Group if they are fully equivalent. That is, not only should they contain the same messages, but they should be considered failover alternatives for applications and other AMPS servers. If two servers are not intended to be fully replicated (for example, if there is one-way replication between a production server and a test server), they should have different Group values.

Caution

It’s important to make sure that when running multiple AMPS instances on the same host there are no conflicting ports. AMPS will emit an error message and will not start properly if it detects that a port specified in the configuration file is already in use.

For the purposes of this example, we assume a simple hot-hot replication case with two instances of AMPS: the first host is named amps-1 and the second host is named amps-2. Each instance is configured to replicate data to the other. That is, all messages published to amps-1 are replicated to amps-2 and vice versa. This configuration ensures that a message published to one instance is available on the other instance in the case of a failover (although, of course, the publishers and subscribers should also be configured for failover).

Caution

Every instance of AMPS that will participate in replication must have a unique Name among all of the instances that are part of replication.

All instances that have the same Group must be able to be treated as equivalent by AMPS replication and AMPS clients.

If two instances of AMPS should be treated differently (for example, one instance receives publishes while the other is a read-only instance that receives one-way replication), those instances should be in different groups.

We will first show the relevant portion of the configuration used in amps-1, and then we will show the relevant configuration for amps-2.

Tip

All topics to be replicated must be recorded in the transaction log. The examples below omit the transaction log configuration for brevity. Please reference the Transaction Log chapter for information on how to configure a transaction log for a topic.
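
For reference, a minimal sketch of a transaction log that covers the topics replicated in these examples is shown below (the journal directory path is hypothetical; see the Transaction Log chapter for the full set of options):

<TransactionLog>
    <!-- Hypothetical location for journal files. -->
    <JournalDirectory>./amps/journal</JournalDirectory>

    <!-- Record the topics that these examples replicate. -->
    <Topic>
        <Name>topic</Name>
        <MessageType>fix</MessageType>
    </Topic>
    <Topic>
        <Name>^/orders/</Name>
        <MessageType>json</MessageType>
    </Topic>
</TransactionLog>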

<AMPSConfig>
    <Name>amps-1</Name>
    <Group>DataCenter-NYC-1</Group>
    ...

    <Transports>
        <Transport>

            <!-- An incoming amps-replication transport is required.
                 The amps-replication type is a proprietary message
                 format used by AMPS to replicate messages between
                 instances.
                 This AMPS instance will receive replication messages
                 on this transport. The instance can receive
                 messages from any number of upstream instances
                 on this transport.

                 An instance of AMPS may only define one incoming replication
                 transport.  -->

            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>10004</InetAddr>
        </Transport>

        <!-- Transports for application use also need to
             be defined. An amps-replication transport can
             only be used for replication. -->

   ... transports for client use here ...

    </Transports>

    ...

    <!-- All replication destinations are defined inside the Replication block. -->

    <Replication>

        <!--
             Each individual replication destination defines outgoing
             replication, that is, messages being replicated from
             this instance of AMPS to another instance of AMPS.
          -->

        <Destination>

            <!-- The replicated topics and their respective message types are defined here. AMPS
                allows any number of Topic definitions in a Destination. -->
            <Topic>
                <MessageType>fix</MessageType>

                <!-- The Name definition specifies the name of the topic or topics to be replicated.
                    The Name option can be either a specific topic name or a regular expression that
                    matches a set of topic names. -->
                <Name>topic</Name>

            </Topic>
            <Topic>
              <!-- Replicate any topic that uses the JSON message type
                   and that starts with /orders -->
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>

            <Name>amps-2</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <!-- The group name of the destination instance (or instances). The name specified
                here must match the Group defined for the remote AMPS instance, or AMPS reports
                an error and refuses to connect to the remote instance. -->
            <Group>DataCenter-NYC-1</Group>

            <!-- Replication acknowledgment can be either synchronous or
                 asynchronous. This does not affect the speed or priority of
                 the connection, but does control when this instance will
                 acknowledge the message as safely persisted. -->

            <SyncType>sync</SyncType>

            <!-- The Transport definition defines the location to which this AMPS instance will
                replicate messages. The InetAddr points to the hostname and port of the
                downstream replication instance. The Type for a replication instance should
                always be amps-replication. -->
            <Transport>

                <!-- The address, or list of addresses, for the replication destination. -->
                <InetAddr>amps-2-server.example.com:10005</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

Configuration used for amps-1

For the configuration of amps-2, we will use the following example. Since this example is similar, only the differences from the amps-1 configuration are called out.

<AMPSConfig>
    <Name>amps-2</Name>
    <Group>DataCenter-NYC-1</Group>

    ...


    <Transports>

        <!-- The amps-replication transport is required.
             This is a proprietary message format used by
             AMPS to replicate messages between instances.
             This AMPS instance will receive replication messages
             on this transport. The instance can receive
             messages from any number of upstream instances
             on this transport.

             An instance of AMPS may only define one incoming replication
             transport. -->

        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>

            <!-- The port where amps-2 listens for replication messages matches the port where
                amps-1 is configured to send its replication messages. This AMPS instance will
                receive replication messages on this transport. The instance can receive
                messages from any number of upstream instances on this transport. -->
            <InetAddr>10005</InetAddr>
        </Transport>
    </Transports>

    ...

    <Replication>
        <Destination>

            <!-- The Topic definitions for amps-2 match
                 the definitions for amps-1 so that these
                 topics contain the same messages in the
                 transaction log on both instances. -->

            <Topic>
                <MessageType>fix</MessageType>
                <Name>topic</Name>
            </Topic>
            <Topic>
              <MessageType>json</MessageType>
              <Name>^/orders/</Name>
            </Topic>
            <Name>amps-1</Name>

            <!-- Fully synchronize messages, including messages that
                 were not originally published to this instance. -->
            <PassThrough>.*</PassThrough>

            <Group>DataCenter-NYC-1</Group>

            <SyncType>sync</SyncType>
            <Transport>

                <!-- The replication destination port for amps-2 is configured to send replication
                    messages to the same port on which amps-1 is configured to listen for them. -->
                <InetAddr>amps-1-server.example.com:10004</InetAddr>
                <Type>amps-replication</Type>
            </Transport>
        </Destination>
    </Replication>

    ...

</AMPSConfig>

Configuration for amps-2

These example configurations replicate the topic named topic of the message type fix, and any topic of the message type json that begins with /orders/, between the two instances of AMPS. To replicate more topics, these instances could add additional Topic blocks, as shown in the sketch below.
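
For example, to also replicate every json topic whose name begins with /trades, each instance's Destination could add a Topic block like this sketch (the topic name is hypothetical):

<Topic>
    <MessageType>json</MessageType>
    <Name>^/trades/</Name>
</Topic>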

Sync vs Async Acknowledgment

When publishing to a topic that is recorded in the transaction log, it is recommended that publishers request a persisted acknowledgment. The persisted acknowledgment message is how AMPS notifies the publisher that a message received by AMPS is considered to be safely persisted, as specified in the configuration. (The AMPS client libraries automatically request this acknowledgment on each publish command when a publish store is present for the client – that is, any time that the client is configured to ensure that the publish is received by the AMPS server.)

Depending on the replication destination configuration for the AMPS instance that receives the message, that persisted acknowledgment message will be delivered to the publisher at different times in the replication process.

There are two options: synchronous or asynchronous acknowledgment. The SyncType setting for a destination selects between these options, and thereby controls when publishers of messages are sent persisted acknowledgments.

For synchronous replication acknowledgment, AMPS will not return a persisted acknowledgment to the publisher for a message until the message has been stored in the local transaction log and the SOW, and all downstream synchronous replication destinations have acknowledged the message. The figure below shows the cycle of a message being published in a replicated instance, and the persisted acknowledgment message being returned to the publisher. Notice that, with this configuration, the publisher will not receive an acknowledgment if the remote destination is unavailable. 60East recommends that when you use sync replication, you consider setting a policy for downgrading the link when a destination is offline, as described in Downgrading Acknowledgments for a Destination.

Sequence for a Connection Using Sync Acknowledgment (no acknowledgment is returned to the publisher until the downstream instance acknowledges the message)

For destinations configured with async replication acknowledgment, an AMPS instance does not require that the destination acknowledge that the message is persisted before acknowledging the message to the publisher. That is, for async acknowledgment, the AMPS instance replicating the message sends the persisted acknowledgment message back to the publisher as soon as the message is stored in the local transaction log and SOW stores.

The acknowledgment type (SyncType) has no effect on how an instance of AMPS replicates the message to other instances of AMPS. The acknowledgment type only affects whether an instance of AMPS must receive an acknowledgment from that Destination before it will acknowledge a message as having been persisted.

The figure below shows the cycle of a message being published with a SyncType configuration set to async acknowledgment.

Sequence for a Connection Using Async Acknowledgment (the acknowledgment is returned to the publisher immediately, independent of downstream replication)

By default, replication destinations do not affect when a message is delivered to a subscription. Optionally, a subscriber can request the fully_durable option on a bookmark subscription (that is, a replay from the transaction log). When the fully_durable option is specified, AMPS does not deliver a message to that subscriber until all replication destinations configured for sync acknowledgment have acknowledged the message.

Caution

Every instance of AMPS that accepts publish commands or SOW delete commands, or that allows consumption of messages from queues, should specify at least one destination that uses sync acknowledgment. If a publisher or queue consumer may fail over between two (or more) instances of AMPS, those instances should specify sync acknowledgment between them to prevent a situation where a message could be lost if an instance fails immediately after acknowledging a message to a publisher.

A destination configured for sync acknowledgment can be downgraded to async acknowledgment while AMPS is running. This can be useful in cases where a server is offline for an extended period of time due to hardware failure or persistent network issues.

Downgrading Acknowledgments for a Destination

The AMPS administrative console provides the ability to downgrade a replication link from synchronous to asynchronous acknowledgment. This feature is useful for relieving memory or storage pressure on publishers should a downstream AMPS instance become unstable or unresponsive, or experience latency so excessive that the instance should be considered offline.

Downgrading a replication link to asynchronous means that a persisted acknowledgment that a publisher is waiting on no longer waits for the downstream instance to confirm that it has committed the message to its transaction log or SOW store. For messages already awaiting acknowledgment, AMPS immediately considers the downstream instance to have acknowledged them: if AMPS was waiting for an acknowledgment from that instance before sending a persisted acknowledgment, AMPS sends the persisted acknowledgment as soon as the instance is downgraded.

The results of downgrading a link are:

  • The number of messages that the publisher must retain is reduced, but
  • The downgraded destination is unsafe for the publisher to fail over to, and
  • The downgraded destination is unsafe for a bookmark subscriber to fail over to

Automatic downgrade is most suitable for a situation where an instance should be considered offline or unavailable.

Caution

Downgrading a destination means that this instance will not wait for that destination to acknowledge a message before acknowledging that message to publishers or upstream instances. It does not affect any other behavior of the instance.

A publisher or queue consumer must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause message loss, since the upstream instance may have acknowledged a message that the downstream instance has not yet processed.

A bookmark subscriber must not fail over from this instance to a destination that has been downgraded to async acknowledgment. This can cause replay gaps, since that destination is no longer considered when determining whether a message is persisted.

AMPS can be configured to automatically downgrade a replication link to asynchronous if the remote side of the link cannot keep up with persisting messages or becomes unresponsive. This option prevents unreliable links from holding up publishers, but increases the chances of a single instance failure resulting in message loss, as described above. AMPS can also be configured to automatically upgrade a replication link that has previously been downgraded. If an instance is configured to downgrade acknowledgment, it should also be configured to upgrade acknowledgment.

Automatic downgrade is implemented as an AMPS action. To configure automatic downgrade, add the appropriate action to the configuration file as shown below:

<AMPSConfig>
    ...
    <Actions>
        <Action>
            <On>
                <Module>amps-action-on-schedule</Module>
                <Options>

                    <!-- This option determines how often AMPS checks whether destinations have
                         fallen behind. In this example, AMPS checks destinations every 15 seconds.
                         In most cases, 60East recommends setting this to half of the Age setting
                         used in the downgrade action below. -->
                    <Every>15s</Every>
                </Options>
            </On>
            <Do>
                <Module>amps-action-do-downgrade-replication</Module>
                <Options>

                    <!-- The maximum amount of time for a destination to fall behind. If AMPS has
                         been waiting for an acknowledgment from the destination for longer than
                         this Age, AMPS downgrades the destination. In this example, AMPS
                         downgrades any destination for which an acknowledgment has taken longer
                         than 300 seconds. -->
                    <Age>300s</Age>
                </Options>
            </Do>
            <Do>
                <Module>amps-action-do-upgrade-replication</Module>
                <Options>

                   <!-- The threshold for upgrading the replication link back to sync
                        acknowledgment. If the destination is behind by less than this
                        amount, and was previously downgraded to async acknowledgment,
                        AMPS will upgrade to sync acknowledgment.

                        -->

                    <Age>10s</Age>
                </Options>
            </Do>
        </Action>
    </Actions>
   ...
</AMPSConfig>

In this configuration file, AMPS checks every 15 seconds to see if a destination has fallen behind by 300 seconds. If a destination has fallen behind by more than 300 seconds, that destination should no longer be considered reliable. AMPS downgrades the destination to async acknowledgment. That destination will no longer be considered when acknowledging messages to publishers, and publishers should not consider that destination to be an option for failover until the destination catches up.

Tip

All publishers using a publish store should be able to hold the number of messages published, at peak message volume, over a time period equal to the periodicity of the downgrade check plus the threshold for downgrade. With the configuration above, a publisher that publishes at a peak rate of 10,000 messages per second should, at a minimum, be able to allocate a publish store that holds (15 seconds + 300 seconds) × 10,000 messages per second = 3,150,000 messages.

In some cases, it is important that an instance maintain a minimum number of destinations that use sync acknowledgment. For those cases, an instance-level Tuning parameter is available that prevents the action from downgrading a connection if doing so would reduce the number of destinations that use sync acknowledgment below the configured limit. This parameter does not guarantee that a specific destination will continue using sync acknowledgment; it only limits whether AMPS will downgrade a destination that meets the downgrade criteria. AMPS will not upgrade a destination that has previously been downgraded when a connection is lost, even if this means that the number of currently connected destinations that use sync acknowledgment falls below the configured minimum. See the AMPS Configuration Guide section on instance-level parameters for details.

Replication Configuration Validation

Replication configuration validation helps to ensure that any configuration that could result in message loss or inconsistent message contents between two instances of AMPS is explicitly designed into the replication topology and not the result of accidental misconfiguration.

Replication can involve coordinating configuration among a large number of AMPS instances. It can sometimes be difficult to ensure that all of the instances are configured correctly, and to ensure that a configuration change for one instance is also made at the replication destinations. For example, if a high-availability pair replicates the topics ORDERS, INVENTORY, and CUSTOMERS to a remote disaster recovery site, but the disaster recovery site only replicates ORDERS and INVENTORY back to the high-availability pair, disaster recovery may not occur as planned. Likewise, if only one member of the HA pair replicates ORDERS to the other member of the pair, the two instances will contain different messages, which could cause problems for the system.

Starting in the 5.0 release, AMPS automatic replication configuration validation makes it easier to keep configuration items consistent across a replication fabric.

Replication configuration validation happens when a replication connection is made between two instances. The validation compares the configuration of those two instances. By default, any difference in configuration that could result in message loss, different behavior between the source instance and the destination instance, or different replication guarantees between the source instance and the destination instance is reported as an error.

Important

When replication validation reports an error, the reason for the error is logged to the event log on the instance that detects the problem, and the connection is closed.

Automatic configuration validation is enabled for all elements of the replication configuration by default. You can turn off validation for specific elements of the configuration, as described below.

AMPS replication uses a leaderless, “all nodes hot” model. This means that no single AMPS instance has a view of the entire replication fabric, and a single AMPS instance will always assume that there are instances in the replication fabric that it is not aware of. The replication validation rules are designed with this assumption. The advantage of this assumption is that if instances are added to the replication fabric, it is typically only necessary to change configuration on the instances that they directly communicate with for replication to function as expected. The tradeoff, however, is that it is sometimes necessary to configure an instance as though it were part of a larger fabric (or exclude a validation rule) even in a case where the instance is part of a much simpler replication design.

Each Topic in a replication Destination can configure a unique set of validation checks. By default, all of the checks apply to all topics in the Destination.

When troubleshooting a configuration validation error, it is important to look at the AMPS logs on both sides of the connection. Typically, the AMPS instance that detects the error will log complete information on the part of validation that failed and the changes required for the connection to succeed, while the other side of the connection will simply note that the connection failed validation. This means that if a validation error is reported on one instance, but details are not present, the other side of the connection detected the error and will have logged relevant details.

Warning

Excluding a validation check directs AMPS to make a replication connection that could result in inconsistent state or data loss. Use caution when excluding validation checks. See the table below for details on each validation check.

The table below lists aspects of replication that AMPS validates. By default, replication validation treats the downstream instance as though it is intended to be a full highly-available failover partner for any topic that is replicated. For situations where that is not the case, many validation rules can be excluded. For example, if the downstream instance is a view server that does not accept publishes and, therefore, is not configured to replicate a particular topic back to this instance, the replicate validation check might need to be excluded.

AMPS performs the following validation checks:

Replication validation checks: each entry below names the check and describes what it validates.

txlog

The topic is contained in the transaction log of the remote instance.

An error on this validation check indicates that this instance is replicating a topic that is not in the transaction log on the downstream instance. This means that the downstream instance is not persisting the messages in a way that can be used for replication, replay, or used as the basis for a queue.

replicate

The topic is replicated from the remote instance back to this instance.

An error on this validation check indicates that this instance is replicating a topic to the downstream instance that is not being replicated back to this instance. This means that any publishes or updates to the topic on the downstream instance are not replicated back to this instance.

sow

If the topic is a SOW topic in this instance, it must also be a SOW topic in the remote instance.

An error on this validation check indicates that this instance is replicating a topic to the downstream instance that is a SOW/Topic on this instance, but is not a SOW/Topic on the downstream instance. This means that the topic has different behavior on the downstream instance, and does not maintain the current value of records in the topic in the SOW.

cascade

The remote instance must enforce the same set of validation checks for this topic as this instance does.

When relaxing validation rules for a topic that the downstream instance itself replicates, adding an exclusion for cascade is often necessary as well.

An error on this validation check indicates that this instance enforces a validation check for a topic that the downstream instance does not enforce when that instance replicates the topic.

To understand the impact of this validation check, consider the validation checks that the downstream instance enforces. If the downstream instance enforces the appropriate validation checks, this instance can exclude the cascade check.

Tip

It is sometimes necessary to exclude this check as part of a rolling upgrade, and then to leave this exclusion in place until all instances can be taken offline at the same time. If the cascade check is the only check being excluded on any instance, the topology can be considered to meet validation rules (and the cascade exclusion can be safely removed during a maintenance window when all of the instances can be updated simultaneously).

queue

If the topic is a queue in this instance, it must also be a queue in the remote instance.

A distributed queue will not function correctly if one of the instances it is replicated to does not define the topic as a queue.

This option cannot be excluded.

keys

If the topic is a SOW topic in this instance, it must also be a SOW topic in the remote instance and the SOW in the remote instance must use the same Key definitions.

An error on this validation check indicates that this instance is replicating a topic to the downstream instance that is a SOW/Topic on both instances, but that the definition of message identity (the Key configuration for the topic) does not match on the two instances. This means that the contents of this topic may be different on these two instances for the same set of messages published.

replicate_filter

If this topic uses a replication filter, the remote instance must use the same replication filter for replication back to this instance.

An error on this validation check indicates that this instance uses a replication filter for a topic that the downstream instance does not use when it replicates the topic. This means that, given the same set of publishes, the downstream instance may replicate a different set of messages than are replicated to that instance. This would produce inconsistent data across the set of replicated instances.

queue_passthrough

If the topic is a queue in this instance, the remote instance must support passthrough from this group.

An error on this validation check indicates that this instance does not pass through messages for one or more groups that the queue is replicated from. This could lead to a situation where a queue message is undeliverable if a network connection is unavailable or if additional instances are added to the set of instances that contain the queue.

queue_underlying

If the topic is a queue in this instance, it must use the same underlying topic definition and filters in the remote instance.

This option cannot be excluded.


Notice that, by default, all of these checks are applied.

The sample below shows how to exclude validation checks for a replication destination. In this sample, the Topic does not require the remote destination to replicate back to this instance, and does not require that the remote destination enforce the same configuration checks for any downstream replication of this topic.

<Destination>
    ...
    <Topic>
        <MessageType>json</MessageType>
        <Name>MyStuff-VIEW</Name>
        <ExcludeValidation>replicate,cascade</ExcludeValidation>
    </Topic>
    ...
</Destination>

Replication Resynchronization

When a replication connection is established between AMPS instances, the upstream instance sends any messages in its transaction log that the downstream instance may not have previously received. This process is called “replication resync”. During resync, the upstream instance replays from the transaction log, replicating messages that match the Topic (and, optionally, Filter) specification(s) for the downstream Destination.

When a replication connection is established between AMPS servers that are both version 5.3.3.0 or higher, the servers exchange information about the messages present in the transaction log to determine the earliest message in the transaction log on the upstream instance that is not present on the downstream instance. Replication resynchronization will begin at that point. This approach to finding the resynchronization point applies whether or not these two instances have had a replication connection before.

For replication between older versions of AMPS, replication resynchronization begins at the last point in the upstream instance’s transaction log that the downstream instance has received. For those versions of AMPS, messages from other instances are not considered when determining the resynchronization point.

Replication Compression

AMPS provides the ability to compress the replication connection. In typical use, replication compression can greatly reduce the bandwidth required between AMPS instances.

The precise amount of compression that AMPS can achieve depends on the content of the replicated messages.

Compression is configured at the replication source. No additional configuration is necessary at the instance that receives the replicated messages: AMPS automatically recognizes when an incoming replication connection uses compression.

See the AMPS Configuration Guide for details on enabling compression and choosing a compression algorithm.
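
As a sketch of what this might look like, assuming the Compression element and the zlib value described in the AMPS Configuration Guide, a destination Transport could enable compression as follows:

<Transport>
    <InetAddr>amps-2-server.example.com:10005</InetAddr>
    <Type>amps-replication</Type>

    <!-- Assumed setting: compress the outgoing replication
         connection with the zlib algorithm. See the AMPS
         Configuration Guide for the supported values. -->
    <Compression>zlib</Compression>
</Transport>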

Destination Server Failover

Your replication plan may include replication to a server that is part of a highly-available group.

There are two common approaches to destination server failover:

  1. Wide IP - AMPS replication works transparently with wide IP and many installations use wide IP for destination server failover. The advantage of this approach is that it requires no additional configuration in AMPS and redundant servers can be added or removed from the wide IP group without reconfiguring the instances that replicate to the group. A disadvantage to this approach is that failover can require several seconds and messages are not replicated during the time that it takes for failover to occur.
  2. AMPS Failover - AMPS allows you to specify multiple downstream servers in the InetAddr element of a destination. In this case, AMPS treats the defined list of servers as a list of equivalent servers, listed in order of priority.

When multiple addresses are specified for a destination, each time AMPS needs to make a connection to a destination and there is no incoming connection from a server in that destination, AMPS starts at the beginning of the list and attempts to connect to each address in the list. If AMPS is unable to connect to any address in the list, AMPS waits for a timeout period, then begins again with the first server on the list. Each time AMPS reaches the end of the list without establishing a connection, AMPS increases the timeout period. If an incoming connection from one of the servers on the list exists, AMPS will use that connection for outgoing replication. If multiple incoming connections from servers in the list exist, AMPS will choose one of the incoming connections to use for outgoing traffic.

This capability allows you to easily set up replication to a highly-available group. If the server you are replicating to fails over, AMPS uses the prioritized list of servers to re-establish a connection.
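
For example, a sketch of a destination Transport that lists multiple equivalent servers in priority order (the hostnames are hypothetical, and this sketch assumes one InetAddr element per address, following the "address, or list of addresses" convention shown in the examples above):

<Transport>
    <!-- AMPS attempts these addresses in order. If no address
         accepts a connection, AMPS waits for a timeout period
         (which increases on each full pass through the list)
         and starts over with the first address. -->
    <InetAddr>amps-2a-server.example.com:10005</InetAddr>
    <InetAddr>amps-2b-server.example.com:10005</InetAddr>
    <Type>amps-replication</Type>
</Transport>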

Two-Way Replication

Two-way replication, sometimes called back replication, describes a replication scenario with two instances of AMPS – termed AMPS-A and AMPS-B for this example.

In a two-way replication configuration, messages that are published to AMPS-A are replicated to AMPS-B. Likewise, messages which are published to AMPS-B are replicated to AMPS-A. This replication scheme is used when both instances of AMPS need to be in sync with each other to handle a failover scenario with no loss of messages between them. This way, if AMPS-A should fail at any point, applications can immediately fail over to the AMPS-B instance, allowing message flow to resume with as little downtime as possible.

To enable two-way replication, each instance of AMPS defines a replication Transport to receive incoming replication messages. Each instance also defines a replication Destination to deliver messages to the other instance.

Notice that these servers are intended to function as failover partners. Since a publisher may fail over between the two instances, the Destination on each instance that replicates to the other instance is configured to use sync message acknowledgment. This ensures that a publisher does not consider a message to be persisted until all of the failover partners have received and persisted the message.

Warning

When configuring a set of instances for failover, it is important that the instances use sync message acknowledgment among the set of instances that a given client will consider for failover. It should never be possible for a publisher to fail over from one instance to another instance if the replication link between those instances is configured for async acknowledgments.

Starting with the 5.0 release, when AMPS detects back replication between a pair of instances, AMPS prefers using a single network connection between the servers, replicating messages in both directions (that is, to both destinations) over that one connection. This can be particularly useful in situations where you need to have messages replicated, but only one server can initiate a connection: for example, when one of the servers is in a DMZ and cannot make a connection to a server within the company. AMPS also allows you to specify a replication destination with no InetAddr provided. In this case, the instance will replicate once the destination establishes a connection, but will not initiate a connection.

When both instances specify an InetAddr, AMPS may temporarily create two connections between the instances while replication is being established. In this case, after detecting that there are two active connections, AMPS closes one of them and allows both AMPS instances to use the remaining network connection to publish messages to the other instance. Notice that using a single network connection is simply an optimization to more efficiently use the available sockets. It does not change the way messages are replicated or the replication protocol, nor does it change the requirement that all messages in a journal are replicated to all destinations before the journal can be removed.
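
For the DMZ scenario described above, a minimal sketch of a destination that omits the InetAddr, so that this instance replicates over the connection that the remote instance initiates (the destination name, group, and topic shown are hypothetical):

<Destination>
    <Name>amps-dmz</Name>
    <Group>DMZ-1</Group>
    <Topic>
        <MessageType>json</MessageType>
        <Name>^/orders/</Name>
    </Topic>
    <SyncType>sync</SyncType>

    <!-- No InetAddr: this instance does not initiate the
         connection. Replication to this destination begins
         once the remote instance connects to this instance's
         incoming replication transport. -->
    <Transport>
        <Type>amps-replication</Type>
    </Transport>
</Destination>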

PassThrough Replication

PassThrough Replication is a term used to describe the ability of an AMPS instance to pass along replicated messages to another AMPS instance. This allows you to easily keep multiple failover or DR destinations in sync from a single AMPS instance. Unless passthrough replication is configured, an AMPS instance only replicates messages directly published to that instance from an application. By default, an instance does not re-replicate messages received over replication.

PassThrough replication uses the name of the originating AMPS group to indicate that messages arriving at this instance directly from that group are to be replicated to the specified destination. PassThrough replication supports regular expressions for specifying groups, and allows multiple server groups per destination. Notice that if an instance does not specify a Group in its configuration, its group name defaults to the Name of the instance.

To ensure that an instance replicates a full copy of its transaction log downstream (which is typically the intended result), include a PassThrough configuration item that matches any group name.

Some topologies can use a PassThrough configuration that replicates only the messages directly published to the instance plus a subset of the messages received over replication. This can significantly reduce bandwidth in some topologies, but it must be configured with care to ensure that messages do not fail to reach all of the instances, since in this case AMPS replicates only part of the transaction log of the local instance.

<Replication>
    <Destination>
        <Name>AMPS2-HKG</Name>
        <!-- No group specified: this destination is for
             a server at the same site, and is responsible for
             populating the specific replication partner. -->
        <Transport>
            <Name>amps-replication</Name>
            <Type>amps-replication</Type>
            <InetAddr>secondaryhost:10010</InetAddr>
        </Transport>
        <Topic>
            <Name>/rep_topic</Name>
            <MessageType>fix</MessageType>
        </Topic>
        <Topic>
            <Name>/rep_topic2</Name>
            <MessageType>fix</MessageType>
        </Topic>
        <SyncType>sync</SyncType>

        <!-- Specify which messages received via replication will be replicated
             to this destination (provided that the Topic and MessageType also match).

             This destination will receive messages that arrive via replication from
             AMPS instances with a group name that does not contain HKG. Replicated
             messages from an instance that has a group name that matches HKG will not be
             sent to this destination.

             Regardless of the PassThrough configuration, all messages published directly
             to this instance by an AMPS client will be replicated to this destination
             if the Topic and MessageType match.
        -->
        <PassThrough>^((?!HKG).)*$</PassThrough>
     </Destination>
</Replication>

When a message is eligible for passthrough replication, topic and content filters in the replication destination still apply. The passthrough directive simply means that the message is eligible for replication from this instance if it comes from an instance in the specified group.

AMPS protects against loops in passthrough replication by tracking the instance names or group names that a message has passed through. AMPS does not allow a message to travel through the same instance and group name more than once.

Caution

When using passthrough, to protect against replication loops, AMPS will not send a message to an instance or group that the message has already traveled through.

If an instance replicates a queue (distributed queue) or a group local queue, it must also provide passthrough for any incoming replication group that replicates that topic (even if the incoming replication connection is from the same group that this instance belongs to). The reason for this is simple: AMPS must ensure that messages for a replicated queue, including acknowledgments and transfer messages, are able to reach every instance that hosts the queue if possible, even if a network connection fails or an instance goes offline. Therefore, this instance must pass through messages received from other instances that affect the queue.
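
For instance, a sketch of a destination on an instance that hosts a replicated queue, with a PassThrough that matches any group so that queue publishes, acknowledgments, and transfer messages can reach every instance that hosts the queue (the names shown are hypothetical):

<Destination>
    <Name>amps-queue-peer</Name>
    <Group>DataCenter-NYC-1</Group>
    <Topic>
        <MessageType>json</MessageType>
        <Name>/orders/queue</Name>
    </Topic>
    <SyncType>sync</SyncType>

    <!-- Pass through queue traffic received from any group,
         including the group that this instance belongs to. -->
    <PassThrough>.*</PassThrough>

    <Transport>
        <InetAddr>queue-peer.example.com:10004</InetAddr>
        <Type>amps-replication</Type>
    </Transport>
</Destination>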

Guarantees on Ordering

For each publisher, on a single topic, AMPS is guaranteed to deliver messages to subscribers in the same order that the messages were published by the original publisher. This guarantee holds true regardless of how many publishers or how many subscribers are connected to AMPS at any one time.

For each instance, AMPS is guaranteed to deliver messages in the order in which the messages were received by the instance, regardless of whether a message is received directly from a publisher or indirectly via replication. The message order for the instance is recorded in the transaction log, and is guaranteed to remain consistent across server restarts.

These guarantees mean that subscribers will not spend unnecessary CPU cycles checking timestamps or other message content to verify which message is the most recent, or reordering messages during playback. This frees up subscriber resources to do more important work.

AMPS preserves an absolute order across topics for a single subscription for all topics except views, queues, and conflated topics. Applications often rely on this behavior to correlate the times at which messages to different topics were processed by AMPS. See Message Ordering for more information.

Replication Security

AMPS allows authorization and entitlement to be configured on replication destinations. For the instance that receives connections, you simply configure Authentication and Entitlement for the transport definition for the destination, as shown below:

<Transports>
    <Transport>
        <Name>amps-replication</Name>
        <Type>amps-replication</Type>
        <InetAddr>10005</InetAddr>

        <!-- Specifies the entitlement module to use to check permissions for incoming
            connections. The module specified must be defined in the Modules section of the
            config file, or be one of the default modules provided by AMPS. This snippet
            uses the default module provided by AMPS for example purposes. -->
        <Entitlement>
            <Module>amps-default-entitlement-module</Module>
        </Entitlement>

        <!-- Specifies the authentication module to use to verify identity for incoming
            connections. The module specified must be defined in the Modules section of the
            config file, or be one of the default modules provided by AMPS. This snippet
            uses the default module provided by AMPS for example purposes. -->
        <Authentication>
            <Module>amps-default-authentication-module</Module>
        </Authentication>
    </Transport>
 ...
</Transports>

For incoming connections, configuration is the same as for other types of transports.

For connections from AMPS to replication destinations, you can configure an Authenticator module for the destination transport. Authenticator modules provide credentials for outgoing connections from AMPS. For authentication protocols that require a challenge and response, the Authenticator module handles the responses for the instance requesting access.

<Replication>
    <Destination>
        <Topic>
            <MessageType>fix</MessageType>
            <Name>topic</Name>
        </Topic>
        <Name>amps-1</Name>
        <SyncType>async</SyncType>
        <Transport>
            <InetAddr>amps-1-server.example.com:10004</InetAddr>
            <Type>amps-replication</Type>

            <!-- Specifies the authenticator module to use to provide credentials for the
                outgoing connection. The module specified must be defined in the Modules section
                of the config file, or be one of the default modules provided by AMPS. This
                snippet uses the default module provided by AMPS for example purposes. -->

            <Authenticator>
                <Module>amps-default-authenticator-module</Module>
            </Authenticator>
        </Transport>
    </Destination>
</Replication>

Understanding Replication Message Routing

An instance of AMPS will replicate a message to a given Destination when all of the following conditions are met (and there is an active replication connection to the Destination):

  1. The message must be recorded in the transaction log for this instance (that is, the topic that the message is published to must be recorded in the transaction log with the appropriate message type).
  2. The AMPS instance must be configured to replicate messages with that topic and message type to the Destination. (If a content filter is included in the replication configuration, the message must also match the content filter.)
  3. The message must either have been directly published to this instance or, if the message was received via replication, the Destination must specify a PassThrough rule that matches the group from which this instance received the message.
  4. The message must not have previously passed through the Destination being replicated to; replication loops are not permitted.

Each instance that receives a message evaluates the conditions above for each Destination. The same process is followed when replaying messages from the transaction log while resynchronizing the downstream replication destination.

To verify that a given set of configurations replicates the appropriate messages, start at each instance that will receive publishes and trace the possible replication paths, applying these rules. If, at any time, applying these rules creates a situation where a message does not reach an instance of AMPS that is intended to receive the message, revise the rules (typically, by adjusting the topics replicated to a Destination or the PassThrough configuration for a Destination) until messages reach the intended instances regardless of route.

Maximum Downstream Destinations

AMPS supports up to 64 downstream destinations that use synchronous acknowledgment. There is no explicit limit on the number of downstream destinations that use asynchronous (immediate) acknowledgment.

Replicated Queues

AMPS provides a unique approach to replicating queues. This approach is designed to offer high performance in the most common cases while continuing to provide delivery model guarantees, resilience, and failover in the event that one of the replicated instances goes offline.

When a queue is replicated, AMPS replicates the publish commands to the underlying topic, the sow_delete commands that acknowledge messages, and special queue management commands that are internal to AMPS.

Queue Message Ownership

To guarantee that no message is processed more than once, AMPS tracks ownership of the message within the network of replicated instances.

For a distributed queue (that is, a queue defined with the Queue configuration element), the instance that first receives the publish command from a client owns the message. Although all downstream instances record the publish command in their transaction logs, an instance does not deliver the message to queue subscribers unless it owns the message.

For a group local queue (that is, a queue defined with the GroupLocalQueue tag), the instance specified in the InitialOwner element for the queue owns a message when the message first enters the queue, regardless of where the message was originally published.
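
For reference, a minimal sketch of the two kinds of queue definitions described above follows. The topic names, message type, and the group name used for InitialOwner are hypothetical, and other queue options (including the underlying topic and the transaction log configuration the queues require) are omitted.

<SOW>
    <!-- Distributed queue: the instance that first receives a publish
         from a client owns the message. -->
    <Queue>
        <Name>orders-queue</Name>
        <MessageType>json</MessageType>
    </Queue>

    <!-- Group local queue: the group named in InitialOwner owns each
         message when it first enters the queue. -->
    <GroupLocalQueue>
        <Name>regional-queue</Name>
        <MessageType>json</MessageType>
        <InitialOwner>group-ny</InitialOwner>
    </GroupLocalQueue>
</SOW>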

Only one instance can own a message at any given time. Other instances can request that the current owner transfer ownership of a message.

To transfer ownership, an instance that does not currently own the message makes a request to the current message owner. The owning instance makes an explicit decision to transfer ownership, and replicates the transfer notification to all instances to which the queue topic is replicated.

The instance that owns a message will always deliver the message to a local subscriber if possible. This means that performance for local subscribers is unaffected by the number of downstream instances. However, this also means that if the local subscribers are keeping up with the message volume being published to the queue, the owning instance will never need to grant a transfer of ownership.

A downstream instance will request ownership transfer if:

  1. The downstream instance has subscriptions for that topic with available backlog, and
  2. The amount of time since the message arrived at the instance is greater than the typical time between the replicated message arriving and the replicated acknowledgment arriving.

Notice that this approach is intended to minimize ungranted transfer requests. In normal circumstances, the typical processing time reflects the speed at which the local processors consume messages at a steady state. Downstream instances will only request messages that have exceeded that time, which indicates that the processors are not keeping up with the incoming message rate.

The instance that owns the message will grant ownership to a requesting instance if:

  1. The request is the first request received for this message, and
  2. There are no subscribers on the owning instance that can accept the message.

When the owning instance grants the request, it logs the transfer in its transaction log and sends the transfer of ownership to all instances that are receiving replicated messages for the queue. When the owning instance does not grant the transfer of ownership, it takes no action.

Notice that your replication topology must be able to replicate acknowledgments to all instances that receive messages for the queue. Otherwise, an instance that does not receive the acknowledgments will not consider the messages to be processed. Replication validation can help to identify topologies that do not meet this requirement.

Tip

A barrier message is delivered immediately when there are no unacknowledged messages ahead of the barrier message in the queue on this instance, regardless of which instance owns the message. This means that, for a distributed queue or group local queue, every instance that contains the barrier message will deliver it once all previous messages on that instance have been acknowledged.

Failover and Queue Message Ownership

When an instance that contains a queue fails or is shut down, that instance is no longer able to grant ownership requests for the messages that it owns. By default, those messages become unavailable for delivery, since there is no longer a central coordination point at which to ensure that the messages are only delivered once.

AMPS provides a way to make those messages available. Through the admin console, you can enable proxied transfer (the enable_proxied_transfer action), which allows an instance to act as an ownership proxy for an instance that has gone offline. In this mode, the local instance can assume ownership of messages that are owned by the offline instance.

Use this setting with care: when active, it is possible for messages to be delivered twice if the instance that had previously owned the message comes back online, or if multiple instances have proxied transfer enabled for the same queue.

In general, enable proxied transfer as a temporary recovery step while one of the instances is offline, then disable it when the instance comes back online or when all of the messages owned by that instance have been processed.

Configuration for Queue Replication

To provide replication for a distributed queue, AMPS requires that the replication configuration meet the following requirements:

  1. Provide bidirectional replication between the instances. In other words, if instance A replicates a queue to instance B, instance B must also replicate that queue to instance A.

  2. If the topic is a queue on one instance, it must be a queue on all replicated instances.

  3. On all replicated instances, the queue must use the same underlying topic definition and filters. For queues that use a regular expression as the topic definition, this means that the regular expression must be the same. For a GroupLocalQueue, the InitialOwner must be the same on all instances that contain the queue.

  4. The underlying topics must be replicated to all replicated instances (since this is where the messages for the queue are stored).

  5. Replicated instances must provide passthrough for instances that replicate queues. For example, consider the following replication topology: Instance A in group One replicates a queue to instance B in group Two. Instance B in group Two replicates the queue to instance C in group Three.

    For this configuration, instance B must provide passthrough for group Three to instance A, and must also provide passthrough for group One to instance C. This ensures that ownership transfer and acknowledgment messages can reach all instances that maintain a copy of the queue. (A configuration sketch for instance B appears after this list.)

    Likewise, consider a topology where Instance X in GroupOne replicates a queue to Instance Y in GroupOne. Instance X must provide passthrough for GroupOne, since any incoming replication messages for the queue (for example, from Instance Z) that arrive at Instance X must be guaranteed to reach Instance Y. Otherwise, it would be possible for the queue on Instance Y to have different messages than Instance X and Instance Z if Instance Z does not replicate to Instance Y (or if the network connection between Instance Z and Instance Y fails).
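
The following sketch shows how instance B, from the first topology in requirement 5, might satisfy these passthrough requirements. Instance names, group names, the queue topic, and the addresses are hypothetical, and the rest of the queue configuration is omitted.

<!-- Configuration on instance B (a member of group Two). -->
<Replication>
    <!-- Destination for instance A in group One: pass through messages
         that arrived via replication from group Three. -->
    <Destination>
        <Name>instance-A</Name>
        <Group>One</Group>
        <SyncType>sync</SyncType>
        <PassThrough>Three</PassThrough>
        <Topic>
            <Name>orders-queue</Name>
            <MessageType>json</MessageType>
        </Topic>
        <Transport>
            <InetAddr>instance-a.example.com:10004</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>

    <!-- Destination for instance C in group Three: pass through messages
         that arrived via replication from group One. -->
    <Destination>
        <Name>instance-C</Name>
        <Group>Three</Group>
        <SyncType>sync</SyncType>
        <PassThrough>One</PassThrough>
        <Topic>
            <Name>orders-queue</Name>
            <MessageType>json</MessageType>
        </Topic>
        <Transport>
            <InetAddr>instance-c.example.com:10004</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>
</Replication>

Per requirement 1, instance A and instance C would each define a corresponding Destination that replicates the queue topic back to instance B.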

Notice that the requirements above apply only to queue topics. If the underlying topic uses a different name than the queue topic, it is possible to replicate the messages from the underlying topic without replicating the queue itself. This approach can be convenient for simply recording and storing the messages provided to the queue on an archival or auditing instance. When only the underlying topic (or topics) are replicated, the requirements above do not apply, since AMPS does not provide queuing behavior for the underlying topics.

A queue defined with LocalQueue cannot be replicated: AMPS reports an error if a LocalQueue topic is included in a replication configuration. The data in the underlying topics for the queue, however, can be replicated without special restrictions.
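
As an illustration, the following sketch replicates only the underlying topic of a queue to an archival instance, without replicating the queue topic itself. The topic name orders-underlying, the destination name, and the address are hypothetical.

<Replication>
    <Destination>
        <Name>archive</Name>
        <SyncType>async</SyncType>
        <!-- Replicate only the underlying topic, not the queue topic;
             the archival instance records the messages without providing
             queuing behavior. -->
        <Topic>
            <Name>orders-underlying</Name>
            <MessageType>json</MessageType>
        </Topic>
        <Transport>
            <InetAddr>archive.example.com:10004</InetAddr>
            <Type>amps-replication</Type>
        </Transport>
    </Destination>
</Replication>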

Bootstrap Initialization of AMPS Replication

AMPS offers the ability to initialize replication by recovering the full current set of messages in an AMPS instance from another instance.

Bootstrap initialization restores messages from the transaction log and SOW of the source instance. Unlike replication, bootstrap initialization restores messages that would not normally be replicated, including:

  • The full current contents of SOW topics (including messages that are present in the SOW, but are no longer in the transaction log).

  • Messages that would not be replicated to the instance being bootstrapped due to protection against replication loops.

    That is, if the AMPS-Instance-A instance is being recovered, bootstrap recovery will copy messages that were originally published to AMPS-Instance-A.

    Replication does not return messages to the instance that the message was originally received from (to prevent loops, and because that instance is already known to have the message).

Bootstrap replication initialization uses the AMPS replication transport to retrieve messages from the source. However, the bootstrap initialization process itself is distinct from replication and happens as a part of AMPS recovery, before replication begins.

Caution

Since an instance must finish bootstrap initialization before providing any other service, two instances of AMPS should not try to bootstrap from each other in a situation where both transaction logs are empty. In this case, both instances will wait for the other instance, and neither instance will recover.

If two instances of AMPS will be recovery partners with each other, start one instance without a bootstrap configuration the first time, then add the bootstrap configuration on a later restart.

Bootstrap initialization only takes place when the AMPS instance is starting, and the only function of bootstrap initialization is to restore the initial state of AMPS from another instance. To keep the current state of messages synchronized between instances, use AMPS replication.

When to Use Bootstrap Initialization

Bootstrap initialization is most commonly used for:

  • Disaster Recovery - In a situation where the entire state of the instance has been lost, bootstrap replication initialization can effectively recover the state of the instance, including the SOW topics and messages originally published to the instance.

  • Scale-out Scenarios - When an instance of AMPS joins an existing set of instances that maintain data in SOW topics for longer than the instances maintain the transaction log.

  • Offline Replica Scenarios - For example, bootstrap initialization might be a good option for repopulating a development or reporting environment that periodically needs a full copy of the data in an instance, but does not require active message flow.

    In this case, the state for the instance is typically removed on a regular basis. The message contents are recovered using bootstrap initialization, but the upstream instance may not continue to replicate messages while the instance is running, since active message flow is not necessary.

Starting Bootstrap Initialization

An instance of AMPS will begin bootstrap initialization when both of the following are true:

  • The instance is configured to bootstrap its initial state.
  • The instance does not detect persistent state when it starts (or AMPS detects that it was shut down while bootstrap was underway).

An AMPS instance will only use bootstrap initialization when that instance starts with no messages in the transaction log. Bootstrap initialization will not overwrite or modify existing messages on an instance. If there are messages in the transaction log for the AMPS instance, the instance will not perform bootstrap initialization.

An AMPS instance does not use bootstrap initialization once the instance starts. To keep message state consistent between two instances of AMPS while AMPS is running, use AMPS replication.

Configuring Bootstrap Initialization

To configure bootstrap initialization, the configuration on the instance being initialized must include:

  • A replication transport (that is, a Transport of type amps-replication or amps-replication-secure),
  • A transaction log, and
  • A bootstrap directive

Most often, the instance that is the source of bootstrap messages will also replicate to the instance that is being initialized. However, this is not required: the instance being bootstrapped can receive replication publishes from a completely different instance or set of instances.

The instance that will serve as the source of bootstrap initialization must define a replication transport, and must also have recorded the information that will be used for initialization (that is, the topics that will be initialized must be in the transaction log and/or SOW).

For example, the following configuration will initialize the message state of the instance from the source AMPS instance located at source.example.com or, if that server is unavailable, at the AMPS instance located at backup.example.com. In both cases, those instances must have configured a replication transport listening on port 4100.

This instance will initialize the topic important-topic and any topic whose name begins with the string audit-topic, both of message type json, from the source server. If either of these topics is recorded in the SOW, the current messages from the source instance will be copied to this instance of AMPS.

<Bootstrap>

  <Topic>
    <Name>important-topic</Name>
    <MessageType>json</MessageType>
  </Topic>

  <Topic>
    <Name>^audit-topic</Name>
    <MessageType>json</MessageType>
  </Topic>

  <DataType>sow,txlog</DataType>

  <Transport>
    <InetAddr>source.example.com:4100</InetAddr>
    <InetAddr>backup.example.com:4100</InetAddr>
    <Type>amps-replication</Type>
  </Transport>

</Bootstrap>

This instance must also define a replication transport and a transaction log. If the topics specified are to be stored in the SOW, the instance must also include definitions for those topics in the configuration.

If an instance with this configuration starts and there is no message data in the transaction log, the instance will perform bootstrap initialization of message contents as configured above.
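
For completeness, a minimal sketch of the companion configuration described above (the instance's own replication transport and transaction log) might look like the following. The transport name, port, and journal directory path are hypothetical.

<Transports>
    <!-- Replication transport for this instance. -->
    <Transport>
        <Name>replication</Name>
        <Type>amps-replication</Type>
        <InetAddr>4100</InetAddr>
    </Transport>
</Transports>

<TransactionLog>
    <JournalDirectory>./journal</JournalDirectory>
    <!-- Record the bootstrapped topics in the transaction log. -->
    <Topic>
        <Name>important-topic</Name>
        <MessageType>json</MessageType>
    </Topic>
    <Topic>
        <Name>^audit-topic</Name>
        <MessageType>json</MessageType>
    </Topic>
</TransactionLog>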

Replication Best Practices

For your application to work properly in an environment that uses AMPS replication, it is important to follow these best practices:

  • Every client that changes the state of AMPS must have a distinct client name - Although AMPS only enforces this requirement for an individual instance, if two clients with the same name are connected to two different instances, and both clients publish messages, delete messages, or acknowledge messages from a queue, the messages present on each instance of AMPS can become inconsistent.
  • Use replication filters with caution, especially for queue topics - Using a replication filter will create different topic contents on each instance. In addition, using a replication filter for a message queue topic (or the underlying topic for a message queue) can create different queue contents on different instances, and messages that are not replicated must be consumed from the instance where they were originally published.
  • Do not manually set client sequence numbers on published messages - The publish store classes in the AMPS client libraries manage sequence numbers for published messages to ensure that there is no message loss or duplication in a high availability environment. 60East recommends using those publish stores to manage sequence numbers rather than setting them manually. Since AMPS uses the sequence number to identify duplicate messages, setting sequence numbers manually can cause AMPS to discard messages or lead to inconsistent state across instances.
  • Default to PassThrough for every group - PassThrough for every group guarantees that each upstream instance will provide the full set of messages that it has to downstream groups. For some topologies, it is possible to reduce traffic to downstream instances by using the PassThrough configuration to avoid replicating messages that are guaranteed to arrive via another route (even in cases of network or server failure), but this should be done with caution.
  • Do not allow a publisher or queue consumer to fail over between two instances that are replicating using async acknowledgment - The async acknowledgment mode means that messages are acknowledged to a publisher before the downstream replication instance has acknowledged that it has received the message. Allowing a publisher to fail over between two instances that are using async acknowledgment runs the risk of creating inconsistent state or message loss.