7. Further Reading

This guide covers only part of what AMPS provides. The following topics are good next steps after reading it.

Event Logging

AMPS provides a rich logging framework that supports logging to many different targets including the console, syslog, and files. Every error and event message within AMPS is uniquely identified and can be filtered out or explicitly included in the logger output. The AMPS User Guide describes the AMPS logger configuration and the unique settings for each logging target.

Conflation for Topics and Subscriptions

Another challenge that faces developers working with high-volume data flows is the fact that not every consumer can keep up with the rate at which data arrives.

For example, an application may display a view of data that is updated hundreds or thousands of times a second. The update rate for some data can be faster than the UI framework can redraw the grid that holds the data. Without a strategy for managing these updates, the application can be unresponsive, show outdated data, consume a large amount of memory – or have all of those problems at the same time.

To help in this situation, AMPS provides built-in support for limiting the volume of updates. This feature is called conflation.

With AMPS conflation, an application receives an update for a particular record at most once within a specified interval. When an update is sent, it contains the most current state of the record at that time, so the application always receives current data. No matter how many times the record changes during the interval, the application receives only the single most current update at the end of the interval.
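The behavior described above can be pictured with a small plain-Python sketch. This is only an illustration of the idea, not the AMPS implementation or client API; the `Conflator` class and its method names are hypothetical.

```python
class Conflator:
    """Keeps only the latest pending update per record key within an interval."""

    def __init__(self):
        self._pending = {}  # record key -> most recent message seen this interval

    def update(self, key, message):
        # A newer update replaces any update already pending for the same key.
        self._pending[key] = message

    def flush(self):
        """Called at the end of each interval: emit the current state once per key."""
        batch, self._pending = self._pending, {}
        return batch


conflator = Conflator()

# Three rapid updates to the same record arrive within one interval.
for price in (100.0, 100.5, 101.25):
    conflator.update("IBM", {"symbol": "IBM", "price": price})
conflator.update("MSFT", {"symbol": "MSFT", "price": 250.0})

# At the end of the interval, only the latest state of each record is delivered.
delivered = conflator.flush()
```

Here `delivered` holds one message per record, and the IBM message carries only the last price published in the interval.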

AMPS provides two forms of conflation:

  • Conflated topics are declared in the server configuration. The AMPS server keeps only a single copy of the current message state for all subscribers.
  • Conflated subscriptions are requested by an individual subscriber. AMPS keeps a copy of the current message state for each subscription, and that copy of the message state is not shared between subscriptions.

As an example of the value of conflation, imagine a SOW topic called PRICING that contains the current price for a set of instruments, and imagine that updates to the pricing are being published to the topic in real time. Several applications subscribe to this topic to display the latest prices for a subset of the instruments in a GUI front-end.

If this GUI front-end only needs updates in two second intervals from the PRICING topic, then more frequent updates would be wasteful of network and client-side processing resources. Likewise, if the GUI front end attempted to process and display every update to the prices, the incoming volume of updates might well outpace the ability of the grid to update. Using conflation in this case can both reduce network traffic and ease the load on the application.

In this case, every instance of the application is likely to have the same performance characteristics and benefit from the same interval for conflation. Therefore, configuring a conflated topic for the server would be a good approach. If there were only a single instance of this application or the application ran intermittently (for example, a monitoring or diagnostic tool), using a conflated subscription might be more appropriate.

The User Guide provides more information on conflation, conflated topics, and conflated subscriptions.

View Topics and Aggregation

AMPS contains a high-performance aggregation engine, which can be used to project one topic onto another, similar to the CREATE VIEW functionality found in most RDBMS software. Views can JOIN multiple topics together, including topics with different message types.

In addition to views configured by an administrator, individual subscriptions can create ad hoc aggregates and views on demand.

Paginated Subscriptions

For some use cases, in particular interactive applications that display large sets of records, it’s useful to be able to display a subset of all of the records of interest. This saves network bandwidth by only delivering records that the application intends to display to a user, and saves CPU time in the application by removing the requirement for the application to process and discard records that aren’t in the current result set.

For example, a web application may potentially show thousands of orders, but may only need to render a page of 20 records at any given time. With a paginated subscription, the application can request exactly the records it needs to render, and can be notified when those records change, are deleted, or if another record is inserted within the page.
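As a rough illustration of why a page needs change notifications, the sketch below computes a page over an ordered set of records and shows how a single insert shifts its contents. It is plain Python rather than the AMPS client API; the `page` function and its `offset` and `size` parameters are hypothetical names chosen for this example.

```python
def page(records, offset, size):
    """Return the ids on one page of records, ordered by order id."""
    ordered = sorted(records, key=lambda r: r["id"])
    return [r["id"] for r in ordered[offset:offset + size]]


# 100 orders with ids 10, 20, ..., 1000.
orders = [{"id": i * 10} for i in range(1, 101)]

# The application is rendering the second page of 20 records.
before = page(orders, offset=20, size=20)

# A new order arrives whose id sorts into the middle of that page.
orders.append({"id": 215})
after = page(orders, offset=20, size=20)

entered = set(after) - set(before)  # records that are now on the page
left = set(before) - set(after)     # records pushed off the page
```

The insert adds one record to the page and pushes the last record off it, which is the kind of change the application needs to hear about without reprocessing the full set of orders.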

Historical SOW Query

AMPS allows you to configure a SOW topic to retain the historical state of the SOW at a configurable granularity. You can then query for the state of the SOW at a point in time, and retrieve results from the saved state.
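One way to picture a point-in-time query is as a lookup into periodically saved snapshots. The sketch below is a plain-Python illustration under that assumption; `HistoricalStore`, `snapshot`, and `query_at` are hypothetical names, and the sketch does not describe how the AMPS server actually stores history.

```python
import bisect


class HistoricalStore:
    """Toy model: current record state plus snapshots taken at a fixed granularity."""

    def __init__(self):
        self._current = {}    # record key -> latest value
        self._times = []      # snapshot timestamps, in increasing order
        self._snapshots = []  # copy of the record state at each timestamp

    def publish(self, key, value):
        self._current[key] = value

    def snapshot(self, timestamp):
        # Timestamps are assumed to be supplied in increasing order.
        self._times.append(timestamp)
        self._snapshots.append(dict(self._current))

    def query_at(self, timestamp):
        """Return the most recent saved state at or before the timestamp."""
        i = bisect.bisect_right(self._times, timestamp)
        return self._snapshots[i - 1] if i else {}


store = HistoricalStore()
store.publish("order-1", "new")
store.snapshot(10)
store.publish("order-1", "filled")
store.snapshot(20)

as_of_15 = store.query_at(15)
as_of_25 = store.query_at(25)
```

In this model, the granularity is simply how often `snapshot` is called: a finer granularity gives more precise point-in-time answers at the cost of more saved state.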

Utilities

AMPS provides several utilities that are not essential to message processing, but can be helpful in troubleshooting or tuning an AMPS instance. The User Guide and Utility Reference describe these utilities in detail. The utilities include:

  • spark - a command-line client, which is a useful tool for diagnostics, such as checking the contents of a SOW topic. The spark client can also be used for simple scripting to run queries, place subscriptions and publish data.
  • ampserr - used to expand and examine error messages that may be observed in the logs. This utility allows a user to input a specific error code, or a class of error codes, examine the error message in more detail, and where applicable, view known solutions to similar issues.
  • amps-grep - used to search the AMPS errors and events log or AMPS journal files to quickly locate items of interest. The AMPS User Guide includes information on the utility, including command-line templates for common searches.
  • amps_sow_dump - used to inspect the contents of a SOW topic store.
  • amps_journal_dump - used to examine the contents of an AMPS journal file during debugging and program tuning.

More information about each of these utilities, including usage and examples, can be found in the AMPS Utilities User Guide.

Monitoring Interface

AMPS provides a monitoring interface that reports on the state of the host system (CPU, memory, disk, and network) and provides statistics about the AMPS instance itself (clients, SOW state, journal state, and more). AMPS provides this information through a RESTful interface for ease of integration into existing enterprise monitoring systems.

AMPS can also record statistics in a persistent SQLite database, which can be queried using the standard SQLite toolset.
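Because the statistics are stored in SQLite, any SQLite client can query them, including the `sqlite3` module in the Python standard library. The sketch below uses an in-memory database with a made-up table so that it runs on its own; the `cpu_samples` table and its columns are hypothetical and are not the actual AMPS statistics schema, which is described in the AMPS Monitoring Guide.

```python
import sqlite3

# An in-memory database stands in for the statistics file written by AMPS.
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns, for illustration only.
conn.execute("CREATE TABLE cpu_samples (ts INTEGER, user_percent REAL)")
conn.executemany(
    "INSERT INTO cpu_samples (ts, user_percent) VALUES (?, ?)",
    [(1, 10.0), (2, 30.0), (3, 20.0)],
)

# Average CPU usage across all samples.
average = conn.execute(
    "SELECT AVG(user_percent) FROM cpu_samples"
).fetchone()[0]

# The sample with the highest CPU usage.
peak_ts, peak_value = conn.execute(
    "SELECT ts, user_percent FROM cpu_samples "
    "ORDER BY user_percent DESC LIMIT 1"
).fetchone()

conn.close()
```

Against a real statistics database, the same approach applies with the path to the database file passed to `sqlite3.connect` and the table names taken from the Monitoring Guide.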

More information about the monitoring system provided in AMPS can be found in the AMPS Monitoring Guide. This guide also contains information about how the monitoring statistics are recorded in the statistics database.

High Availability

The High Availability chapter in the AMPS User Guide explains the powerful high availability features that AMPS provides. This chapter describes how to use the AMPS transaction log and AMPS replication to provide failover strategies and high availability guarantees.

To provide high availability and failover, AMPS provides replication of topics between instances. A set of features in the AMPS clients work with the AMPS server to provide reliable publishing and resumable subscriptions.

The transaction log, described earlier, is the foundation of AMPS replication. AMPS replication is designed to ensure that the messages in the transaction log of one AMPS instance are also in the transaction log of another AMPS instance.

The AMPS client libraries provide optional reliable publication, using a local store to retain each message until the AMPS server notifies the publisher that the message has met the persistence guarantee that the server is configured to provide. Typically, that guarantee is met once the message has been confirmed as written to both instances in a high availability pair, but stronger guarantees (such as also having been written to an offsite disaster recovery instance, or to an instance in another region) can also be configured.
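The idea of retaining messages until the server acknowledges them can be sketched in a few lines of plain Python. This is a simplified illustration, not the AMPS client library; the `PublishStore` class, its methods, and the assumption that the server acknowledges everything up to a given sequence number are all made for the purposes of this example.

```python
class PublishStore:
    """Toy local store: retains each message until it is acknowledged as persisted."""

    def __init__(self):
        self._next_seq = 1
        self._unacked = {}  # sequence number -> message not yet acknowledged

    def store(self, message):
        """Record a message before sending it, and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._unacked[seq] = message
        return seq

    def ack_through(self, seq):
        """Discard every message with a sequence number up to and including seq."""
        for stored_seq in [s for s in self._unacked if s <= seq]:
            del self._unacked[stored_seq]

    def replay(self):
        """Messages to resend after a reconnect, in original publish order."""
        return sorted(self._unacked.items())


store = PublishStore()
for payload in ("a", "b", "c"):
    store.store(payload)

# The server reports that messages 1 and 2 have met the persistence guarantee.
store.ack_through(2)

# After a failover, only the unacknowledged message needs to be published again.
to_resend = store.replay()
```

Because unacknowledged messages stay in the store, a publisher that fails over to another instance can resend them rather than lose them.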

The AMPS approach to high availability is based around the principle that each topic is a stream of messages. The basic concepts behind AMPS replication include:

  • AMPS replication is always a one-way connection: a source instance pushes messages to a destination instance, which receives them. For two-way replication, replication is configured in both directions.
  • The goal of AMPS replication is to ensure that every replicated message that the source instance is responsible for replicating has been recorded in the transaction log of the destination instance.
  • AMPS replicates sequences of commands (that is, each individual publish) rather than the cumulative state of a set of publishes.
  • For a given data source (that is, publisher), AMPS must preserve the order in which that data source provided messages. The order must be consistent both within an instance and between instances.
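The concepts above can be pictured with a toy model in which each transaction log is a list of individual publishes. The sketch is plain Python, not AMPS code; the `replicate` function, the tuple layout, and the use of a publisher name plus sequence number to recognize entries the destination already has are assumptions made for illustration.

```python
def replicate(source_log, destination_log):
    """Push every entry the destination has not yet recorded, in source order."""
    seen = {(publisher, seq) for publisher, seq, _ in destination_log}
    for entry in source_log:
        publisher, seq, _ = entry
        if (publisher, seq) not in seen:
            destination_log.append(entry)
            seen.add((publisher, seq))


# Each entry is one publish: (publisher, sequence number, message).
source = [
    ("pubA", 1, {"id": 1, "price": 10}),
    ("pubB", 1, {"id": 2, "price": 50}),
    ("pubA", 2, {"id": 1, "price": 11}),  # second publish to the same record
]

# The destination has already recorded the first publish from pubA.
destination = [("pubA", 1, {"id": 1, "price": 10})]

replicate(source, destination)

# Order of pubA's publishes as recorded on the destination.
pub_a_sequence = [seq for publisher, seq, _ in destination if publisher == "pubA"]
```

Both publishes to record 1 appear in the destination log, rather than only the final price, and the publishes from `pubA` arrive in the order in which `pubA` sent them.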

For full details on AMPS replication, including recommendations and best practice advice, see the AMPS User Guide.