Tips on Measuring Performance

One of the most common questions during evaluation of AMPS is how best to measure and quantify the overall performance of the application that uses AMPS.

There are several factors that are included in any meaningful discussion of performance:

Performance of the Underlying Hardware

AMPS is designed to use the underlying hardware as efficiently as possible, and does not suffer from artificial bottlenecks that limit performance.

The implications of this, though, are that the performance available from an installation of AMPS depends on the capacity of the underlying hardware.

In particular, pay attention to:

  • Storage device speed and bandwidth (for applications that persist data)
  • Memory speed
  • Network speed and capacity

Often, you can estimate the theoretical maximum performance of a system from the underlying hardware. For example, storage that can only write 80MB/s would be unsuitable for a system that needs to retain messages arriving at a sustained rate of 100MB/s.

Likewise, a system with 64GB of memory would see reduced performance for lookups on a 128GB data set, so benchmarking an application that retains 128GB of active data on a system with 64GB of memory will produce very different results than the same benchmark run on a system with 256GB of memory.
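
A back-of-the-envelope check along these lines can be captured in a few lines of code. The sketch below is illustrative only: the workload numbers are hypothetical, and the comparisons are simple capacity arithmetic rather than AMPS sizing guidance.

    # Rough capacity sanity check for a planned benchmark host.
    # All of the workload numbers below are hypothetical examples.

    sustained_publish_mb_per_sec = 100   # expected sustained inbound message volume
    storage_write_mb_per_sec = 80        # measured sequential write bandwidth of the device
    active_data_set_gb = 128             # data the application keeps active in AMPS
    host_memory_gb = 64                  # physical memory on the benchmark host

    if storage_write_mb_per_sec < sustained_publish_mb_per_sec:
        print("Storage cannot sustain the publish rate: the benchmark will "
              "measure the device, not AMPS.")

    if host_memory_gb < active_data_set_gb:
        print("Active data exceeds physical memory: expect paging to dominate "
              "lookup performance on this host.")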

Operating System Performance

Most Linux distributions and installations are, by default, tuned for interactive desktop usage. This is convenient when developing applications, but can reduce performance compared with a well-tuned server.

Linux Operating System Configuration in the AMPS User Guide discusses the Linux settings that are most often configured in a way that limits the performance of AMPS on a host. Before taking final benchmarks, tune the Linux host according to those guidelines.
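
When preparing a benchmark host, it can help to record the current values of the settings that this kind of tuning guidance typically covers. The short sketch below only reports values from standard Linux interfaces; which values are appropriate is a question for the User Guide, and the specific settings listed here are examples rather than a complete checklist.

    # Report a few Linux settings that commonly affect server workloads.
    # This script only prints current values; consult the AMPS User Guide
    # for the recommended configuration before changing anything.
    from pathlib import Path

    settings = {
        "vm.swappiness": "/proc/sys/vm/swappiness",
        "transparent hugepages": "/sys/kernel/mm/transparent_hugepage/enabled",
        "cpu0 scaling governor": "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
    }

    for name, path in settings.items():
        p = Path(path)
        value = p.read_text().strip() if p.exists() else "<not available>"
        print(f"{name}: {value}")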

Realistic Data Complexity and Volumes

AMPS is designed for high-throughput, low-latency messaging. This means that AMPS typically performs better with a realistic workload than with a very small number of messages. It is generally not useful to run a performance test with a small number of messages and then attempt to extrapolate the performance at scale.

As an example, imagine a test that deploys a Docker container from scratch, starts AMPS, sends and receives a single message, and then shuts down the container, using the elapsed time from the start of the test to the time that the container shuts down as the “single message throughput time”. That number will be orders of magnitude slower than the actual time that it takes for AMPS to deliver the message: most of the time in the test is consumed by overhead unrelated to delivering an individual message.

Although it would be unlikely that anyone would create a test with as much overhead as the scenario above, it is not uncommon to have hidden overhead in a test. A realistic test should avoid measuring overhead that would not be present in a production environment. If the requirement of the application is to have latency within a certain threshold when AMPS is processing messages at a sustained rate from a dozen publishers, the results of a test will be more accurate the more closely the test approximates that scenario.
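
One way to keep that kind of overhead out of the measurement is to time only the messaging path, after connections are established and the system has warmed up. The sketch below is a generic harness: publish_one and wait_for_delivery are hypothetical callables standing in for whatever client calls the application actually makes.

    import time

    def measure_steady_state(publish_one, wait_for_delivery,
                             warmup_messages=1000, measured_messages=10000):
        """Time message delivery only, excluding process and connection setup.

        publish_one and wait_for_delivery are hypothetical callables that
        wrap the application's actual client calls.
        """
        # Warm up so connection setup and caching are not part of the numbers.
        for _ in range(warmup_messages):
            publish_one()
            wait_for_delivery()

        latencies = []
        for _ in range(measured_messages):
            start = time.perf_counter()
            publish_one()
            wait_for_delivery()
            latencies.append(time.perf_counter() - start)

        latencies.sort()
        p50 = latencies[len(latencies) // 2]
        p99 = latencies[int(len(latencies) * 0.99)]
        return p50, p99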

In particular, as much as possible, build your tests to:

  • Have similar use of connections as the production application. If a given application will have multiple subscribers in production, do not benchmark with a single subscriber: doing so assumes that parallel processing offers no benefit.
  • Have similar message volumes as the production application. Do not assume that you can use a rate of 100 messages per second to predict latency or processing time of an application that will need to process 1000 or 10000 messages per second.
  • Have similar message sizes as the production application. Do not assume that a 1MB message in test will have the same performance characteristics as a 250KB message (or a 5MB message) in production. (The sketch after this list shows one way to parameterize a test along these lines.)
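
A harness that takes production characteristics as parameters makes it harder to benchmark an unrepresentative workload by accident. The sketch below is a simple pacing loop; publish is a hypothetical callable wrapping the application's real publisher, and the default size, rate, and duration are placeholders to be replaced with production values.

    import time

    def run_paced_publisher(publish, message_size_bytes=250_000,
                            messages_per_second=1000, duration_seconds=60):
        """Publish fixed-size payloads at a sustained target rate.

        publish(payload) is a hypothetical callable wrapping the real client;
        size, rate, and duration should match the production workload.
        """
        payload = b"x" * message_size_bytes
        interval = 1.0 / messages_per_second
        next_send = time.perf_counter()
        end = next_send + duration_seconds
        sent = 0

        while time.perf_counter() < end:
            publish(payload)
            sent += 1
            next_send += interval
            sleep_for = next_send - time.perf_counter()
            if sleep_for > 0:
                time.sleep(sleep_for)

        return sent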

Compare Equivalent Work

When benchmarking different implementation ideas, compare equivalent work. In some cases, having the AMPS server do additional work adds no noticeable latency, due to the efficiencies (and parallel processing) in AMPS. In other cases, having the server do additional work may add more latency. In either case, an accurate measurement of throughput and latency must account for the cost of doing the equivalent work in the application.

For example, suppose your application will use AMPS delta subscriptions (that is, have AMPS automatically calculate the differences between an update to a message and the current state of the message). Rather than comparing a subscription that uses that option to one that does not, based solely on when messages arrive at the client, compare having AMPS calculate the difference against having the application calculate the difference, and evaluate that comparison using the total throughput numbers for a realistic number of subscribers.
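
To make that comparison concrete, the client-side equivalent of the work has to be measured as well. The sketch below shows one hypothetical version of that work: parsing the full update and diffing it against the previous state, which is the processing a delta subscription would move to the server. The field names and values are made up for illustration.

    import json
    import time

    def client_side_delta(previous_json, update_json):
        """Compute the changed fields on the client -- the work that a delta
        subscription would instead perform on the server."""
        previous = json.loads(previous_json)
        update = json.loads(update_json)
        return {k: v for k, v in update.items() if previous.get(k) != v}

    # Hypothetical messages: time the client-side diff so its cost can be
    # charged to the configuration that does not use server-side deltas.
    prev = json.dumps({"id": 42, "px": 100.25, "qty": 500, "venue": "XNAS"})
    curr = json.dumps({"id": 42, "px": 100.30, "qty": 500, "venue": "XNAS"})

    start = time.perf_counter()
    changed = client_side_delta(prev, curr)
    elapsed = time.perf_counter() - start
    print(f"changed fields: {changed} (computed in {elapsed * 1e6:.1f} microseconds)")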

Use AMPS Capabilities

AMPS is carefully designed to include functionality that improves end-to-end throughput in the system, and to provide server-side capabilities where performing those functions on the server improves overall performance.

When evaluating performance, take advantage of those capabilities to get an accurate measure of how an application would perform in a production environment.

For example, if your application needs to append a calculated field to every published message, use message enrichment (or the AMPS delta publish functionality) rather than a process that extracts, rewrites, and updates the full message. Likewise, if your application will only process a subset of the messages published to a topic, use AMPS content filtering so that AMPS delivers only actionable messages, rather than oversubscribing and discarding messages in your application.
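
As a rough illustration of the filtering point, the sketch below compares a client that receives every message and discards most of them with a client that only receives the messages it will act on. The message layout and the 10% match rate are hypothetical; the filter expression mentioned in the comments uses AMPS-style content filter syntax.

    import json
    import time

    # Hypothetical stream: only about 10% of messages matter to this consumer.
    messages = [json.dumps({"region": "EMEA" if i % 10 == 0 else "APAC", "seq": i})
                for i in range(100_000)]

    # Oversubscribe and discard: the client parses every message and throws
    # most of them away.
    start = time.perf_counter()
    kept = [m for m in messages if json.loads(m)["region"] == "EMEA"]
    discard_path = time.perf_counter() - start

    # With server-side content filtering (for example, a subscription filter
    # such as /region = 'EMEA'), only matching messages reach the client, so
    # the untimed selection below stands in for work done by the server.
    matching_only = [m for m in messages if '"region": "EMEA"' in m]
    start = time.perf_counter()
    processed = [json.loads(m) for m in matching_only]
    filtered_path = time.perf_counter() - start

    print(f"discard path: {discard_path:.3f}s, filtered path: {filtered_path:.3f}s")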

If you have questions on whether your application is using the built-in capabilities of AMPS in the most effective way possible, contact 60East support for an engineer to review your design.