6. Message Queues

AMPS includes high performance queuing built on the AMPS messaging engine and the transaction log. AMPS queues combine elements of classic message queuing with the advanced messaging features of AMPS, including content filtering, aggregation and projection, and so on.

AMPS queues help you easily solve some common messaging problems:

  • Ensuring that a message is only processed once.
  • Distributing tasks across workers in a fair manner.
  • Ensuring that a message that has been delivered is processed.
  • Ensuring that when a worker fails to process a message, that message is re-delivered.

These uses of messaging require different behavior than the scenarios discussed in AMPS Basics: Topics, Publish and Subscribe. For basic publish and subscribe, each message is delivered to any number of subscribers. With queues, each message is fully processed by only one subscriber.

While it’s possible to create applications with these properties by using the other features of AMPS, message queues provide these functions built into the AMPS server for additional performance, simple administration, and ease of development.

AMPS queues also allow you to:

  • Replicate messages between AMPS instances while preserving delivery guarantees.
  • Create views and aggregates based on the current contents of a queue.
  • Filter messages with specific content into specific queues.
  • Provide a subscriber only messages that contain specific content.
  • Provide a single published message to multiple queues.
  • Aggregate multiple topics into a single queue.
  • Provide content aware entitlement for security.
  • Provide prioritization of messages within a queue, so higher-priority messages are processed first.
  • Provide a synchronization point that guarantees that all messages prior to that point have been processed before messages after that point are delivered.

How Do Queues Work?

When an application needs to receive messages, there is little difference between subscribing to a queue and subscribing to a pub/sub topic. Both delivery models use the subscribe command, and both delivery models can provide a filter to specify messages of interest. Both types of topics provide the same message objects in the AMPS Client interfaces.

Once a message is received from a queue, however, the application must let AMPS know when the message is successfully processed. This acknowledgment lets AMPS know that the application is finished with the message, and has capacity to receive another message. In addition, if the queue is configured to retry messages if an application fails to process the message (at-least-once delivery), acknowledging the message indicates to AMPS that the message has been processed successfully and can be removed from the queue.

Within AMPS, the server maintains an in-memory list of all of the messages currently available for delivery in a given queue and a list of all of the messages awaiting acknowledgment from subscribers. The messages themselves are stored in the transaction log for the instance. When a message has been successfully processed, the acknowledgment for that message is also stored in the transaction log.

There is no separate storage required for a queue, since messages are recorded in the transaction log. Likewise, even when a message is removed from the queue, AMPS maintains a persisted record of that message and the acknowledgment in the transaction log. Given that the transaction log contains a full record of messages and acknowledgments, AMPS queues are persistent across server restarts, and can be replicated to other instances. (For details on replicating queues, see the AMPS User Guide.)

Keeping the delivery state – that is, the queue itself – independent of the topic in the transaction log has several other advantages. Since the set of messages in the queue is maintained separately from the physical storage for those messages, a queue in AMPS can hold messages from any number of underlying topics. Content filtering can be applied to the queue to selectively add messages to the queue: in fact, the same topic can easily be split into independent queues using content filtering. Messages to a single topic can also be included in multiple, independent queues (for example, one queue for immediate processing, and another queue for end-of-day auditing and reconciliation).

AMPS includes the ability for a given consumer to declare the capacity of that consumer, using the max_backlog option on a queue subscription. This option declares the number of messages that the consumer is willing to have delivered at a given time. Using this option can improve throughput, since AMPS can ensure that a consumer is never idle waiting for a new message. This also helps AMPS to balance message delivery across consumers in the most efficient manner, as measured by the current available capacity of each consumer. For example, a consumer on a small VM might take 200 milliseconds, on average, to process a message, and might declare a max_backlog of 2. A consumer running on a larger physical server, in contrast, might take 50 ms on average to process a message, and might therefore declare a max_backlog of 8 or more. The maximum allowed backlog for a subscriber is configured for each queue, so that queues that hold large units of work can set a smaller maximum value than queues that provide smaller tasks.

When Should I Use Queues?

Queues are intended to guarantee delivery of each message to a single consumer that processes the messages. Use queues when the problem you are solving requires a message to be processed once. When you need to distribute messages to a large number of consumers, use the AMPS pub/sub delivery model.

For example, a queue is a natural fit for a system that allocates work, such as a system that runs software builds or that executes financial transactions. A system that provides notifications to a large number of systems (for example, a system that distributes bids to sellers or a system that communicates status to a user interface) is a more natural fit for the pub/sub delivery model.

Queues are often used to solve problems like:

  • Guaranteeing that a given set of work is distributed fairly across a set of workers, while each unit of work is only performed once:

    A system that performs CPU-intensive calculations needs to ensure that any time a request comes into the system, it is serviced by the next available worker.

    A distributed compute grid has workers that vary widely in capacity. Each worker declares its capacity to AMPS. Workers with more capacity free receive work before workers with less free capacity, improving overall throughput for the compute grid.

  • Guaranteeing that a specific message is fully processed once, regardless of the number of subscribers:

    A system that processes refunds enters the refund orders into a queue. Each message is delivered to one, and only one, worker. If the worker successfully processes the message, the worker removes the message from the queue. If the worker fails, AMPS automatically delivers the message to another worker, ensuring the message is processed.

The AMPS client libraries, starting in version 5.0, are queue-aware and contain features to make it easier to work with queues and create the application behavior that you need. See the Developer Guide for the client library of your choice for details on how to use these features.

For further details on message queues and how they function, the chapter on Message Queues in the AMPS User Guide presents a more complete discussion.

Configuration

As described above, both the topic that holds the messages for the queue and the queue topic itself must be recorded in the AMPS transaction log.

In addition, the queue itself must be declared in the SOW element of the AMPS configuration.

For example, the configuration below records the topics Work and WorkToDo in the AMPS transaction log:

<TransactionLog>
  <JournalDirectory>/fast-storage/journals</JournalDirectory>
  <Topic>
      <Name>Work</Name>
      <MessageType>json</MessageType>
  </Topic>
  <Topic>
      <Name>WorkToDo</Name>
      <MessageType>json</MessageType>
  </Topic>
</TransactionLog>

With these topics added to the transaction log, we can configure a WorkToDo queue that provides queuing for the messages in the Work topic.

<SOW>
  <Queue>

     <Name>WorkToDo</Name>
     <MessageType>json</MessageType>
     <Semantics>at-least-once</Semantics>
     <UnderlyingTopic>Work</UnderlyingTopic>

     <!-- recommended: set a MaxPerSubscriptionBacklog
          to allow more efficient delivery -->

     <MaxPerSubscriptionBacklog>10</MaxPerSubscriptionBacklog>


     <!-- recommended: set lifetime limit
          and delivery limits -->

     <Expiration>12h</Expiration>
     <MaxDeliveries>5</MaxDeliveries>
     <LeasePeriod>10m</LeasePeriod>

   </Queue>
</SOW>

This declares a queue named WorkToDo that provides queuing for the messages in the Work topic. By setting the semantics to at-least-once, the topic is configured to redeliver a queue message to another subscriber in the event that a subscriber fails to process that message successfully.

This example also provides an example of recommended configuration options to help manage message lifetime in cases where a message cannot be processed successfully, or where no consumers are available to process the message.

For this queue, we provide the following options:

  • Expiration specifies that a message will be available for, at most, 12 hours from the time it enters the queue.
  • MaxDeliveries specifies that a given message can be delivered from the queue at most 5 times: after that, the message will be considered to be unable to be processed and removed from the queue.
  • LeasePeriod specifies that a consumer has 10 minutes from the time the message is sent to acknowledge the message or the message will be automatically returned to the queue.

AMPS also provides a mechanism for publishing expired messages to a dead-letter queue, as well as a wide variety of options for controlling delivery.

Further Reading

See Message Queues in the AMPS User Guide for a more complete discussion of message queues, including discussions of advanced features, replicated queues, and so on.

The AMPS client libraries include samples for working with message queues. See the client library distribution for those samples. (Notice that some clients are distributed as pre-built binaries. For those clients, the full distribution contains the samples.)