13. Aggregating and Analyzing Data in AMPS

AMPS contains a high-performance aggregation engine, which can be used to project one SOW topic onto another, similar to the CREATE VIEW functionality found in most RDBMS software. The aggregation engine can join input from multiple topics, of the same or different message types, and can produce output in different message types.

View topics are part of the AMPS State of the World, which means that views support delta subscriptions and out of focus (OOF) tracking. A view can also be used as the underlying topic for another view.

In addition, for the limited cases where a view is not practical, AMPS allows an individual subscription to request aggregation and projection of a single SOW topic.

Notice that the features described in this chapter are designed for cases where an application needs to aggregate data across messages or to perform a calculation on an individual message that should not be preserved as a part of that message.

To modify a message as it is published to AMPS, use preprocessing or enrichment. To simply retrieve a subset of the fields in a message, use select lists.

Understanding Views

Views allow you to aggregate messages from one or more SOW topics in AMPS and present the aggregation as a new SOW topic. AMPS stores the contents of the view in a user-configured file, similar to a materialized view in RDBMS software.

Views are often used to simplify subscriber implementation and can reduce the network traffic to subscribers. For example, if some clients will only process orders where the total cost of the order exceeds a certain value, you can both simplify subscriber code and reduce network traffic by creating a view that contains a calculated field for the total cost. Rather than receiving all messages and calculating the cost, subscribers can filter on the calculated field. You can also combine information from multiple topics. For example, you could create a view that contains orders from high-priority customers that exceed a certain dollar amount.

AMPS sends messages to view topics the same way that AMPS sends messages to SOW topics: when a message arrives that updates the value of a message in the view, AMPS sends a message on the view topic. Likewise, you can query a view the same way that you query a SOW topic.

Defining a view is straightforward. You set the name of the view and the SOW topic or topics from which messages originate, and describe how you want to aggregate, or project, the messages. AMPS creates a topic and projects the messages as requested.

caution All message types that you specify in a view must support view creation. The AMPS default message types all support views.

Because AMPS uses the SOW topics of the underlying messages to determine when to update the view, the underlying topics used in a view must have a SOW configured. In addition, the topics must be defined in the AMPS configuration file before the view is defined.

AMPS updates each view after a publish or delta publish to a message in an underlying topic. Updates are processed for each view in the order in which AMPS processed the updates to the underlying topic. AMPS processes these updates asynchronously, after each SOW update is persisted. For additional performance, AMPS provides the ability to conflate updates to views that process high velocity updates, as described in Inline Update Conflation.

Defining Views and Aggregations

Multiple topic aggregation creates a view using more than one topic as a data source. This allows you to enrich messages as they are processed by AMPS, or to do aggregate calculations using information published to more than one topic. You can combine messages from multiple topics and use filtered subscriptions to determine which messages are of interest. For example, you can set up a topic that contains orders from high-priority customers.

You can join topics of different message types, and you can project messages of a different type than the underlying topic.

To create an aggregate using multiple topics, each topic needs to maintain a SOW. Since views maintain an underlying SOW, you can create views from views.

To define an aggregate, you decide:

  • The topic, or topics, that contain the source for the aggregation
  • If the aggregation uses more than one topic, how those topics relate to each other
  • What messages to publish, or project, from the aggregation
  • How to group messages for aggregation
  • The message type of the aggregation

Message types provided with AMPS fully support views, with the following exceptions:

  • binary message types cannot be an underlying topic for a view or the type of a view
  • protobuf message types can be the underlying topic for a view, but cannot be the type of a view
  • composite-global message types can be the underlying topic for the view, but cannot be the type of the view

If you are using a custom message type, check with the message type developer as to whether that message type supports aggregation.

Single Topic Aggregation: UnderlyingTopic

For aggregations based on a single topic, use the UnderlyingTopic element to tell AMPS which topic to use. All messages from the UnderlyingTopic will appear in the aggregation.

<UnderlyingTopic>MyOriginalTopic</UnderlyingTopic>

Multiple Topic Aggregation: Join

Join definitions tell AMPS how to relate underlying topics to each other. You use a separate Join element for each relationship in the view. Most often, the join definition describes a relationship between topics:

[topic].[field]=[topic].[field]

The topics specified must be previously defined in the AMPS configuration file. The square brackets [] are optional. If they are omitted, AMPS uses the first / in the expression as the start of the field definition. You can use any number of join expressions to define a multiple topic aggregation.

A Join definition is an equality comparison between the values of two fields. The Join definition is not evaluated as an AMPS expression, so functions, operators (other than =) and so forth are not evaluated in these definitions.

Within a Join definition, values are always compared as strings. This means that values such as 12345, 12345.00, and 1.2345E+04 can be considered to be different values by the Join expression since these are different strings, even though these strings contain the same numeric value.
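To make the string comparison concrete, the following Python sketch models the behavior described above. It is not AMPS code: the function name `join_matches` is a hypothetical stand-in for the equality test that a Join definition performs.

```python
# Illustrative model only: a Join definition compares field values as strings.
candidates = ["12345", "12345.00", "1.2345E+04"]


def join_matches(left: str, right: str) -> bool:
    """Hypothetical stand-in for the Join equality test: exact string match."""
    return left == right


# All three spellings parse to the same number...
same_number = all(float(value) == 12345.0 for value in candidates)

# ...but only the identical spelling matches when compared as a string.
matches = {value: join_matches("12345", value) for value in candidates}

print(same_number)
print(matches)
```

If publishers may format the same key in different ways, one option is to normalize the field when the message is published (for example, with enrichment) so that the values used in the Join are comparable as strings.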

If your aggregation will join messages of different types, or produce messages of a different type than the underlying topics, you add message type specifiers to the join definition:

[messagetype].[topic].[field]=[messagetype].[topic].[field]

In this case, the square brackets [] around the messagetype are mandatory. AMPS creates a projection in the aggregation that combines the messages from each topic where the expression is true. In other words, for the expression:

<Join>[Orders].[/CustomerID]=[Addresses].[/CustomerID]</Join>

AMPS projects every message where the same CustomerID appears in both the Addresses topic and the Orders topic. If a CustomerID value appears in only the Addresses topic, AMPS does not create a projection for the message. If a CustomerID value appears in only the Orders topic, AMPS projects the message with NULL values for the Addresses topic. In database terms, this is equivalent to a LEFT OUTER JOIN.
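The outer join behavior described above can be sketched in Python. This is a model of which records are projected, not AMPS code: the sample data and the helper name `left_outer_join` are hypothetical.

```python
# Illustrative model only: which records the Orders/Addresses join projects.
orders = [
    {"OrderID": 1, "CustomerID": "A"},
    {"OrderID": 2, "CustomerID": "B"},  # no matching address
]
addresses = [
    {"CustomerID": "A", "ShippingAddress": "1 Main St"},
    {"CustomerID": "C", "ShippingAddress": "9 Elm St"},  # no matching order
]


def left_outer_join(left_rows, right_rows, key):
    """Keep every left row; fill right-hand fields with None when unmatched."""
    right_fields = sorted({f for row in right_rows for f in row if f != key})
    joined = []
    for left in left_rows:
        # Compare keys as strings, as the Join definition does.
        found = [r for r in right_rows if str(r[key]) == str(left[key])]
        if not found:
            found = [dict.fromkeys(right_fields)]  # all right-hand fields None
        for right in found:
            joined.append({**left, **{f: right.get(f) for f in right_fields}})
    return joined


projected = left_outer_join(orders, addresses, "CustomerID")
for row in projected:
    print(row)
```

Customer A is projected with an address, customer B is projected with a None (NULL) address, and customer C, which appears only in the right-hand topic, is not projected at all.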

You can use any number of Join definitions in an underlying topic:

<Join>[nvfix].[Orders].[/CustomerID]=[json].[Addresses].[/CustomerID]</Join>
<Join>[nvfix].[Orders].[/ItemID]=[nvfix].[Catalog].[/ItemID]</Join>

In this case, AMPS creates a projection that combines messages from the Orders, Addresses, and Catalog topics for any published message where matching messages are present in all three topics. Where there are no matching messages in the Catalog and Addresses topics, AMPS projects those values as NULL.

caution A Join element can also contain only one topic. In this case, all messages from that topic are included in the view.

Setting the Message Type

The MessageType element of the definition sets the type of the outgoing messages. The message type of the aggregation does not need to be the same as the message type of the topics used to create the aggregation. However, if the MessageType differs from the type of the topics used to produce the aggregation, you must explicitly specify the message type of the underlying topics.

For example, to produce JSON messages regardless of the types of the topics in the aggregation, you would use the following element:

<MessageType>json</MessageType>

Defining Projections

AMPS makes available all fields from matching messages in the join specification. You specify the fields that you want AMPS to project and how to project them.

To tell AMPS how to project a message, you specify each field to include in the projection. The specification provides a name for the projected field and one or more source fields to use for the projected field. The data can be projected as-is, or aggregated using one of the AMPS aggregation functions, as described in Chapter 4 Aggregate Functions.

You refer to source fields using the XPath-like expression for the field. You name projected fields by creating an XPath-like expression for the new field. AMPS uses this expression to name the new field.

<Projection>
    <Field>[Orders].[/CustomerID]</Field>
    <Field>[Addresses].[/ShippingAddress] AS /DestinationAddress</Field>
    <Field>SUM([Orders].[/TotalPrice]) AS /AccountTotal</Field>
</Projection>

The sample above uses the CustomerID from the Orders topic and the shipping address for that customer from the Addresses topic. The sample calculates the sum of all of the orders for that customer as the AccountTotal. The sample also renames the ShippingAddress field as DestinationAddress in the projected message.

For more information on constructing fields in a view, see Chapter 4 Constructing Fields.

Data Types and Projections

When projecting views, AMPS converts the original values into the AMPS internal type system and serializes those values into a new message. This approach allows AMPS to efficiently aggregate messages of different types and produce predictable results. The data type of the serialization is determined by the message type of the projected message: the message types provided by 60East in this release project the AMPS internal type.

This means that, for message types that rely on type markers to identify the type (such as bson), the type of the field in the projected message may reflect the AMPS internal type rather than the original type. This conversion is typically a widening conversion for numeric types (for example, input typed as a 32-bit integer will typically be widened to a 64-bit integer). For other types, the most common conversion is from a specific data type (such as regular expression) to a string type.

A projection is evaluated as projecting a single value in the AMPS type system. This means that complex or nested data types are typically projected as the string equivalent. For example, a nested set of XML elements could be projected as an empty string (the text value of the containing element), or an array could be projected as the first value in the array.

For details on the AMPS data types, see the section called “AMPS Data Types”.

Grouping

Use grouping statements to tell AMPS how to aggregate data across messages and generate projected messages.

For example, an Orders topic that contains messages for incoming orders could be used to calculate aggregates for each customer, or aggregates for each symbol ordered. The grouping statement tells AMPS which way to group messages for aggregation.

<Grouping>
    <Field>[Orders].[/CustomerID]</Field>
</Grouping>

The sample above groups and aggregates the projected messages by CustomerID. Because this statement tells AMPS to group by CustomerID, AMPS projects a message for each distinct CustomerID value. A message to the Orders topic will create an outgoing message with data aggregated over the CustomerID.

Each field in the projection should either be an aggregate or be specified in the Grouping element. Otherwise, AMPS returns the last processed value for the field.

caution Unlike ANSI SQL, AMPS allows you to include fields in the projection that are not included in the Grouping or used within the aggregate functions. In this case, AMPS uses the last value processed for the value of these fields. AMPS enforces a consistent order of updates to ensure that the value of the field is consistent across recovery and restart.
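The following Python sketch models the grouping rules above: one projected record per distinct grouping value, an aggregate computed across the group, and the last processed value for a field that is neither grouped nor aggregated. It is a model, not AMPS code. The /Region field and the function name `project_by_customer` are hypothetical.

```python
# Illustrative model only: aggregate per group, keep last value for other fields.
orders = [
    {"CustomerID": "A", "TotalPrice": 10.0, "Region": "east"},
    {"CustomerID": "B", "TotalPrice": 5.0, "Region": "west"},
    {"CustomerID": "A", "TotalPrice": 2.5, "Region": "north"},
]


def project_by_customer(messages):
    """Group by CustomerID, SUM TotalPrice, and carry the last Region seen."""
    view = {}
    for message in messages:  # processed in a fixed, consistent order
        record = view.setdefault(
            message["CustomerID"],
            {"CustomerID": message["CustomerID"], "AccountTotal": 0.0},
        )
        record["AccountTotal"] += message["TotalPrice"]  # aggregate field
        record["Region"] = message["Region"]  # neither grouped nor aggregated
    return view


view = project_by_customer(orders)
for record in view.values():
    print(record)
```

Customer A ends with the Region of the last order processed ("north"), which is why a consistent processing order matters for fields of this kind.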

Inline Update Conflation

AMPS has the ability to conflate updates to a view. Conflation is particularly useful when a view receives a high velocity of updates and subscribers to the view have no need to track every update, but instead want to see the current state of the view as quickly as possible. For applications that have a high update rate and relatively complicated view processing, inline conflation can significantly reduce the total number of updates processed for the view and increase overall throughput.

Inline conflation changes how AMPS manages pending updates for a view. Without inline conflation enabled for a view, AMPS processes all messages for a view strictly in the order in which those messages were published. Even if there are multiple updates to the same record pending, AMPS processes each of those messages in turn and updates the view for each message.

When inline conflation is enabled and a message arrives with the same Grouping value as a message waiting to be processed, AMPS replaces the pending message with the new message, and only processes the new message. Inline conflation does not cause AMPS to slow down the rate at which AMPS processes updates for a view. AMPS continues to process updates for the view as fast as possible, and makes no guarantees as to the number of updates to a view produced by a given set of updates to an underlying topic.

The diagram below shows a simplified representation of inline conflation for a view where the underlying SOW uses the id field of the message as the Key. With conflation set to none (the default for a view), each message is added to the end of the messages waiting to be processed, whether or not an update for that group is already waiting. Both updates are processed. By contrast, when conflation is set to inline, if there is an existing update waiting, the new update replaces the existing update, and only the new update is processed.

[Figure: inline view conflation, comparing conflation set to none with conflation set to inline]

Because inline conflation replaces messages while processing is pending, the following considerations apply to views that enable inline conflation:

  • Not every update to the underlying topic will produce an individual update to the view: when multiple updates occur to the same record in a short period of time, AMPS may only process the last update.
  • Updates to the view may be produced in an order different than the order in which the messages were published to the underlying topic, since AMPS replaces messages waiting to be processed.
  • The final state of the view will be exactly as if each update were processed, since it will be based on the latest values in the underlying topic (or topics)
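The considerations above can be illustrated with a small Python model of the pending-update queue. This is a sketch under stated assumptions, not a description of AMPS internals: it assumes every update arrives before any is processed, and that a replacement keeps the queue position of the update it replaces (the text does not spell out this detail). The function names are hypothetical.

```python
# Illustrative model only, not AMPS code.
updates = [("A", 1), ("B", 1), ("A", 2), ("A", 3)]  # (group, value), publish order


def pending_updates(arrivals, conflation="none"):
    """Return the work queue once every update has arrived and none has run."""
    if conflation == "none":
        return list(arrivals)  # each update waits its turn
    pending = {}  # dicts keep insertion order; reassigning a key keeps its slot
    for group, value in arrivals:
        pending[group] = value  # replace any waiting update for this group
    return list(pending.items())


def apply_to_view(queue):
    """Process the queue in order and return the final state of the view."""
    view = {}
    for group, value in queue:
        view[group] = value
    return view


queue_none = pending_updates(updates, "none")
queue_inline = pending_updates(updates, "inline")

print(len(queue_none), queue_inline)
print(apply_to_view(queue_none) == apply_to_view(queue_inline))
```

In this model, inline conflation processes two updates instead of four, the latest value for group A is processed ahead of the update for group B, and the final state of the view is the same in both modes.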

To enable inline conflation, add the Conflation element to the configuration for the View, as shown below:

<SOW>
    <View>
        ...
        <Conflation>inline</Conflation>
        ...
    </View>
</SOW>

Filtering Single Topic Aggregations

When a view aggregates a single topic, you can use a Filter element in the view definition to limit the messages included in the view to only those messages that match the filter. For example, to aggregate only messages from an underlying topic where the /status is complete, you could define your view as follows:

<SOW>
    ...

    <Topic>
        <Name>orders</Name>
        <MessageType>json</MessageType>
        <Key>/orderId</Key>
        <FileName>./sow/%n.sow</FileName>
    </Topic>
    <View>
        <Name>CompleteByRegion</Name>
        <UnderlyingTopic>orders</UnderlyingTopic>
        <MessageType>json</MessageType>
        <Projection>
            <Field>COUNT(/orderId) AS /completedOrders</Field>
            <Field>/region AS /region</Field>
        </Projection>
        <Grouping>
            <Field>/region</Field>
        </Grouping>
        <Filter>/status = 'complete'</Filter>
    </View>

    ...
</SOW>

The Filter element is not supported for multiple topic aggregation.
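As a rough model of what the CompleteByRegion view above computes, the following Python sketch applies the filter to the underlying messages first and then counts by region. The sample messages and the function name `complete_by_region` are hypothetical, and the code is an illustration rather than AMPS behavior in full.

```python
from collections import Counter

# Hypothetical contents of the underlying orders topic.
orders = [
    {"orderId": 1, "region": "east", "status": "complete"},
    {"orderId": 2, "region": "east", "status": "pending"},
    {"orderId": 3, "region": "west", "status": "complete"},
    {"orderId": 4, "region": "east", "status": "complete"},
]


def complete_by_region(messages):
    """Filter on /status = 'complete', then COUNT(/orderId) grouped by /region."""
    counts = Counter(m["region"] for m in messages if m["status"] == "complete")
    return [
        {"region": region, "completedOrders": count}
        for region, count in sorted(counts.items())
    ]


view = complete_by_region(orders)
print(view)
```

The pending order never reaches the aggregation, so it does not contribute to the count for its region.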

Constructing Fields

The AMPS expression language is used to construct fields in aggregates, as described in Chapter 4 Constructing Fields.

Examples

Simple Aggregate View Example

For a potential usage scenario, imagine the topic ORDERS which includes the following NVFIX message schema:

NVFIX Tag    Description
OrderID      unique order identifier
Tick         symbol
ClientId     unique client identifier
Shares       currently executed shares for the chain of orders
Price        average price for the chain of orders

Table 13.1: ORDERS Table Identifiers

This topic includes information on the current state of executed orders, but may not include all the information we want updated in real-time. For example, we may want to monitor the total value of all orders executed by a client at any moment. If ORDERS was a SQL Table within an RDBMS, the “view” we would want to create would be similar to:

CREATE VIEW TOTAL_VALUE AS
SELECT ClientId, SUM(Shares * Price) AS TotalCost
FROM ORDERS
GROUP BY ClientId

As defined above, the TOTAL_VALUE view would only have two fields:

  1. ClientId: the client identifier
  2. TotalCost: the summation of current order values by client
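To see what the SQL view above would return, the following Python sketch runs the same CREATE VIEW statement against an in-memory SQLite database. The column types and the sample rows are assumptions made for illustration; only the view definition comes from the text.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    -- Column types are assumed for this sketch.
    CREATE TABLE ORDERS (OrderID INTEGER PRIMARY KEY, Tick TEXT,
                         ClientId TEXT, Shares INTEGER, Price REAL);
    CREATE VIEW TOTAL_VALUE AS
        SELECT ClientId, SUM(Shares * Price) AS TotalCost
        FROM ORDERS
        GROUP BY ClientId;
    """
)

# Hypothetical sample orders.
connection.executemany(
    "INSERT INTO ORDERS VALUES (?, ?, ?, ?, ?)",
    [
        (1, "IBM", "client-1", 100, 150.0),
        (2, "MSFT", "client-1", 50, 300.0),
        (3, "IBM", "client-2", 10, 150.0),
    ],
)

rows = connection.execute(
    "SELECT ClientId, TotalCost FROM TOTAL_VALUE ORDER BY ClientId"
).fetchall()
print(rows)  # [('client-1', 30000.0), ('client-2', 1500.0)]
```

Each client produces one row, which corresponds to one record in the AMPS view.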

Views in AMPS are specified in the AMPS configuration file in View sections, which are defined in the SOW section. The example above would be defined as:

<SOW>
    <Topic>
        <Name>ORDERS</Name>
        <MessageType>nvfix</MessageType>
        <Key>/OrderID</Key>
        <FileName>./sow/%n.sow</FileName>
    </Topic>
    <View>
        <Name>TOTAL_VALUE</Name>
        <UnderlyingTopic>ORDERS</UnderlyingTopic>
        <MessageType>nvfix</MessageType>
        <Projection>
            <Field>/ClientId</Field>
            <Field>SUM(/Shares * /Price) AS /TotalCost</Field>
        </Projection>
        <Grouping>
            <Field>/ClientId</Field>
        </Grouping>
    </View>
</SOW>

caution Views require an underlying SOW topic. See State of the World (SOW) for more information on creating and configuring SOW topics.

The Name element sets the name of the new topic that is being defined. Clients use this name to subscribe for future updates or to perform SOW queries against the view.

The UnderlyingTopic is the SOW topic or topics that the view operates on. That is, the UnderlyingTopic is where the view gets its data from. All XPath references within the Projection fields are references to values within this underlying SOW topic (unless they appear on the right-hand side of the AS keyword).

The Projection section is a list of one or more Field elements that define what the view will contain. An expression can be a raw XPath value, as in /ClientId above, which is a straight copy of the value found in the underlying topic into the view topic using the same target XPath, or an expression that computes a value, as in the SUM above. If we had wanted to translate the ClientId tag into a different tag, such as CID, then we could have used the AS keyword to do the translation, as in /ClientId AS /CID.

caution Unlike ANSI SQL, AMPS allows you to include fields in the projection that are not included in the Grouping or used within the aggregate functions. In this case, AMPS uses the last value processed for the value of these fields. AMPS enforces a consistent order of updates to ensure that the value of the field is consistent across recovery and restart.

caution An unexpected 0 (zero) in an aggregate field within a view usually means that the value is either zero or NaN. AMPS defaults to using 0 instead of NaN. However, any numeric aggregate function will result in a NaN if the aggregation includes a field that is not a number.

Finally, the Grouping section is a list of one or more Field elements that define how the records in the underlying topic will be grouped to form the records in the view. In this example, we grouped by the tag holding the client identifier. However, we could just as easily have grouped by the "Symbol" tag, /Tick.

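In the example below, we group by /ClientId because we want a per-client aggregate over only the orders that have a value greater than 1,000,000. The IF expression yields the order value for an order above that threshold and NULL otherwise, so orders at or below the threshold contribute NULL to the sum: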

<SOW>
    ...

    <View>
        <Name>NUMBER_OF_ORDERS_OVER_ONEMILL</Name>
        <UnderlyingTopic>ORDERS</UnderlyingTopic>
        <Projection>
            <Field>/ClientId</Field>
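            <Field><![CDATA[SUM(IF(/Shares * /Price > 1000000, /Shares * /Price, NULL)) AS /AggregateValue]]></Field>
            <Field>SUM(IF(/Shares * /Price &gt; 1000000, /Shares * /Price, NULL)) AS /AggregateValue2</Field>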
        </Projection>
        <Grouping>
            <Field>/ClientId</Field>
        </Grouping>
        <FileName>./views/numOfOrdersOverOneMil.view</FileName>
        <MessageType>nvfix</MessageType>
    </View>

    ...
</SOW>

Notice that /AggregateValue and /AggregateValue2 will contain the same value; however, /AggregateValue was defined using an XML CDATA block, and /AggregateValue2 was defined using the XML &gt; entity reference.

caution Since the AMPS configuration is XML, special characters in projection expressions must either be escaped with XML entity references or wrapped in a CDATA section.
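The equivalence of the two escaping styles can be checked with any XML parser. The following Python sketch uses the standard library parser on a hypothetical Projection fragment. It shows only how XML treats the text, not how AMPS reads its configuration file.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the same expression written with CDATA and with &gt;.
fragment = """
<Projection>
    <Field><![CDATA[SUM(IF(/Shares * /Price > 1000000, 1, NULL)) AS /Big]]></Field>
    <Field>SUM(IF(/Shares * /Price &gt; 1000000, 1, NULL)) AS /Big</Field>
</Projection>
"""

cdata_text, entity_text = [
    field.text for field in ET.fromstring(fragment).findall("Field")
]
print(cdata_text == entity_text)  # True

# An unescaped < is not well-formed XML, so the parser rejects it.
try:
    ET.fromstring("<Field>/Shares < 100</Field>")
    unescaped_accepted = True
except ET.ParseError:
    unescaped_accepted = False
print(unescaped_accepted)  # False
```

Both forms give the parser the same expression text, so the choice between them is a matter of readability.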

Updates to underlying topics can potentially cause many more updates to downstream views, which can create stress on downstream clients subscribed to the view. If any underlying topic has frequent updates to the same records, or if a real-time view is not required (as in a GUI), then a conflated topic (a replica of the topic) may be a good solution to reduce the frequency of the updates and conserve bandwidth. For more on conflated topics, see Chapter 11 Conflated Topics.

Multiple Topic Aggregate Example

This example demonstrates how to create an aggregate view that uses more than one topic as a data source. For a potential usage scenario, imagine that another publisher provides a COMPANIES topic which includes the following NVFIX message schema:

NVFIX Tag    Description
CompanyId    unique identifier for the company
Tick         symbol
Name         company name

Table 13.2: COMPANIES Table Identifiers

This topic includes the name of the company, and an identifier used for internal record keeping in the trading system. Using this information, we want to provide a running total of orders for that company, including the company name.

If ORDERS and COMPANIES were SQL tables within an RDBMS, the “view” we would want to create would be similar to:

CREATE VIEW TOTAL_COMPANY_VOLUME AS
SELECT COMPANIES.CompanyId, COMPANIES.Tick, COMPANIES.Name, SUM(ORDERS.Shares) AS TotalVolume
FROM COMPANIES LEFT OUTER JOIN ORDERS
    ON COMPANIES.Tick = ORDERS.Tick
GROUP BY ORDERS.Tick

As defined above, the TOTAL_COMPANY_VOLUME view would have four fields:

  1. CompanyId: the identifier for the company
  2. Tick: The ticker symbol for the company
  3. Name: The name of the company
  4. TotalVolume: The total number of shares involved in orders

To create this view, use the following definition in the AMPS configuration file:

<SOW>
    <Topic>
        <Name>ORDERS</Name>
        <MessageType>nvfix</MessageType>
        <Key>/OrderID</Key>
        <FileName>./sow/%n.sow</FileName>
    </Topic>
    <Topic>
        <Name>COMPANIES</Name>
        <MessageType>nvfix</MessageType>
        <Key>/CompanyId</Key>
        <FileName>./sow/%n.sow</FileName>
    </Topic>
    <View>
        <Name>TOTAL_COMPANY_VOLUME</Name>
        <UnderlyingTopic>
            <Join>[ORDERS]./Tick = [COMPANIES]./Tick</Join>
        </UnderlyingTopic>
        <FileName>./views/totalVolume.view</FileName>
        <MessageType>nvfix</MessageType>
        <Projection>
            <Field>[COMPANIES]./CompanyId</Field>
            <Field>[COMPANIES]./Tick</Field>
            <Field>[COMPANIES]./Name</Field>
            <Field>SUM([ORDERS]./Shares) AS /TotalVolume</Field>
        </Projection>
        <Grouping>
            <Field>[ORDERS]./Tick</Field>
        </Grouping>
    </View>
</SOW>

As with the single topic example, first specify the underlying topics and ensure that they maintain a SOW database. Next, the view defines the underlying topic that is the source of the data. In this case, the underlying topic is a join between two topics in the instance. The definition next declares the file name where the view will be saved, and the message type of the projected messages. The message types that you join can be different types, and the projected messages can be a different type than the underlying message types. The projection uses three fields from the COMPANIES topic and one field that is aggregated from messages in the ORDERS topic. The projection groups results by the Tick symbols that appear in messages in the ORDERS topic.

View Projected Into Different Message Type

This example shows how to project an underlying topic of one message type into a topic of a different message type.

There is very little difference between this example and the single topic view in Simple Aggregate View Example. The main difference is that, because the destination view has a different message type than the underlying topic, every reference to a field from the underlying topic must be fully-qualified with the message type.

As before, imagine the topic ORDERS which includes the following NVFIX message schema:

NVFIX Tag    Description
OrderID      unique order identifier
Tick         symbol
ClientId     unique client identifier
Shares       currently executed shares for the chain of orders
Price        average price for the chain of orders

Table 13.3: ORDERS Table Identifiers

As before, we want to project the summation of current order values by client. The TOTAL_VALUE view will have two fields:

  1. ClientId: the client identifier
  2. TotalCost: the summation of current order values by client

However, in this case, we want to project the summary into a JSON document. To do this we simply specify that the final view will be in JSON format, and fully qualify all references to the underlying topic in the view definition.

The example above would be defined as:

<SOW>
    <Topic>
        <Name>ORDERS</Name>
        <MessageType>nvfix</MessageType>
        <Key>/OrderID</Key>
        <FileName>./sow/%n.sow</FileName>
    </Topic>
    <View>
        <Name>TOTAL_VALUE</Name>
        <UnderlyingTopic>[nvfix].[ORDERS]</UnderlyingTopic>
        <MessageType>json</MessageType>
        <Projection>
            <Field>[nvfix].[ORDERS]./ClientId AS /ClientId</Field>
            <Field>SUM([nvfix].[ORDERS]./Shares * [nvfix].[ORDERS]./Price) AS /TotalCost</Field>
        </Projection>
        <Grouping>
            <Field>[nvfix].[ORDERS]./ClientId</Field>
        </Grouping>
    </View>
</SOW>

This example uses an underlying topic in NVFIX format, computes an aggregation by ClientId, and then produces output in JSON format.
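The shape of this transformation can be sketched in Python: parse NVFIX-style input, aggregate by ClientId, and serialize each projected record as JSON. This is an illustration only. It assumes NVFIX fields are tag=value pairs separated by the SOH character (0x01), and the sample orders and function names are hypothetical.

```python
import json

SOH = "\x01"  # assumed NVFIX field separator


def parse_nvfix(message):
    """Split an NVFIX-style message into a dict of tag -> string value."""
    fields = {}
    for pair in message.strip(SOH).split(SOH):
        tag, _, value = pair.partition("=")
        fields[tag] = value
    return fields


def total_value_as_json(messages):
    """Group by ClientId, SUM(Shares * Price), and serialize each record as JSON."""
    totals = {}
    for order in map(parse_nvfix, messages):
        cost = float(order["Shares"]) * float(order["Price"])
        totals[order["ClientId"]] = totals.get(order["ClientId"], 0.0) + cost
    return [
        json.dumps({"ClientId": client, "TotalCost": total})
        for client, total in totals.items()
    ]


orders = [
    SOH.join(["OrderID=1", "Tick=IBM", "ClientId=c1", "Shares=100", "Price=150.5"]),
    SOH.join(["OrderID=2", "Tick=MSFT", "ClientId=c1", "Shares=10", "Price=20"]),
    SOH.join(["OrderID=3", "Tick=IBM", "ClientId=c2", "Shares=5", "Price=10"]),
]
view = total_value_as_json(orders)
for record in view:
    print(record)
```

The input values start as strings in the NVFIX messages and leave as JSON numbers, which mirrors the conversion through an internal type system described in Data Types and Projections.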

Aggregated Subscriptions

In addition to precomputed views and aggregates, AMPS provides the ability for the server to compute an aggregation for an individual subscription. When an application requests an aggregated subscription, rather than providing messages for the subscription verbatim, the AMPS server will calculate the requested aggregates and produce a message that contains the aggregated data.

Most of the time, AMPS applications use views to provide aggregation, as described in section Understanding Views. AMPS views are shared across subscriptions, and are calculated once, when a message updates a view, regardless of the number of subscribers that subscribe to the view. AMPS provides aggregated subscriptions as a way to do ad hoc aggregation in cases where a specific aggregate is only needed for a short period of time, will only be used by a single subscriber, or must be provided before the server can be restarted with a defined view. If the aggregation is frequently used, or if multiple subscribers will use the aggregation, consider using a view rather than an aggregated subscription.

To request an aggregated subscription, the subscriber provides a definition of the fields to project and the grouping to apply with each subscription. AMPS performs the aggregation and constructs the specified message before delivering the message.

For example, imagine a topic in the SOW that uses the /id field to create the SOW key. The topic contains the following messages:

{ "id":1, "tickerId" : "IBM", "price" : 150.34 }
{ "id":2, "tickerId" : "IBM", "price" : 149.76 }
{ "id":3, "tickerId" : "IBM", "price" : 149.32 }
{ "id":4, "tickerId" : "IBM", "price" : 151.10 }

A subscriber enters a SOW query with the following options:

projection=[MAX(/price) AS /max,/tickerId as /ticker],grouping=[/tickerId]

AMPS aggregates the messages in the SOW, and delivers the following projected record:

{ "ticker" : "IBM", "max" : 151.10 }

Aggregated subscriptions are supported for commands that use the State of the World: sow, sow_and_subscribe, and sow_and_delta_subscribe. However, there are limitations on some variants of the commands, as described in the following sections.

The memory consumed to maintain an aggregated subscription is counted as part of the total memory for the client that submitted the subscription when considering the MessageMemoryLimit for that client.

When to Use Aggregated Subscriptions

Aggregated subscriptions require AMPS to compute the aggregate for each subscription individually, at the time that messages are processed for the subscription. In addition, for aggregated subscriptions, the current state of the aggregation is retained for each subscription.

In cases where more than one subscriber is using the same aggregation, a View is more efficient: each record in the view is only computed once, saving CPU cycles, and ongoing updates for the record are only stored once, requiring less memory. Likewise, if the aggregation uses more than one topic or aggregates messages of a different type than the final result, you must configure a View on the server.

An aggregated subscription is most appropriate if one or more of the following is true:

  • A subscription has unique and unpredictable aggregation needs. For example, if no other subscription is computing a given aggregation, and it is not possible to predict in advance the aggregates to compute, then per-subscription aggregation is a good solution.
  • The application is under development and iterating quickly. It can be convenient to use aggregated subscriptions while developing aggregate definitions that will be eventually provided as view topics.
  • The aggregation is expensive and seldom needed. For example, if an aggregation is memory-intensive and only needed once a week at a time when the instance is otherwise lightly-used, the overall memory usage of the AMPS instance may be reduced during the rest of the week by using an aggregated subscription.

The considerations above are general guidance to help you consider options between per-subscription aggregation and a persistent view. In general, if it is possible to use an AMPS view for a given aggregation task and that view will be frequently used, a view is often the best option. If a view cannot be used (because the aggregation is not known in advance) or the view would seldom be used, an aggregated subscription may be a better option.

Requesting an Aggregated Subscription

To request an aggregated subscription, set the following options on the subscription:

Option Description
projection=[field specifications]

Specifies a comma-delimited set of fields to project, within brackets. Each entry has the format described in Constructing Fields.

This option must contain an entry for every field in the aggregated message. If there is no entry for a field in this option, that field will not appear in the aggregated message, even if the field is in the underlying message.

For example, to project the total value of orders for a specific item, you might take the sum of the /price multiplied by the /quantity for each item, along with the original /description, as follows:

projection=[SUM(/price * /quantity) AS /total, /description]

When a field appears in the projection option, but is not part of a grouping clause or used in an aggregation function, the message will have the value of that field in the last message processed by AMPS.

There is no default for this option. When this option is provided, a grouping must also be provided.

grouping=[keys]

For an aggregated subscription, the format of this option is a comma-delimited list of XPath identifiers within brackets. For example, to aggregate entries based on their /description (producing one record in the aggregation for each distinct value in /description), you would use the following option:

grouping=[/description]

There is no default for this option. When this option is provided, a projection must also be provided.

Table 13.4: Aggregated Subscription options

For example, to request a count, by customer, of the order records stored in a topic in the SOW, you could use the following options:

projection=[COUNT(/orderId) AS /orderCount, /customer AS /customer],grouping=[/customer]
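When these options are assembled in application code, a small helper can keep the bracketed, comma-delimited format consistent and enforce the rule that projection and grouping are provided together. The Python sketch below is a hypothetical helper, not part of any AMPS client library. It only builds the options string.

```python
def aggregation_options(projection, grouping):
    """Format projection and grouping lists as a subscription options string."""
    if not projection or not grouping:
        # The two options have no defaults and must be supplied together.
        raise ValueError("projection and grouping must be provided together")
    return "projection=[{}],grouping=[{}]".format(
        ",".join(projection), ",".join(grouping)
    )


options = aggregation_options(
    ["COUNT(/orderId) AS /orderCount", "/customer AS /customer"],
    ["/customer"],
)
print(options)
```

The resulting string would be supplied as the options for a sow, sow_and_subscribe, or sow_and_delta_subscribe command. How options are set on a command depends on the client library in use.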

Considerations for Aggregated Subscriptions

When planning to use an aggregated subscription, the following considerations apply:

  • The topic for the subscription must be a topic in the State of the World. This includes views and the SOW view of a queue.
  • The topic for the subscription must not be a regular expression.
  • When subscribing to a queue, an aggregated subscription does not remove messages from the queue. Like a view definition with the queue as an underlying topic, an aggregated subscription browses the queue without taking messages from the queue.
  • Filters for the subscription apply to the original messages, not the results of the projection. A filter for an aggregated subscription is equivalent to the Filter element in a View definition rather than a filter for a subscription that uses the view.
  • A subscription that uses per-subscription aggregation does not support the replace option except for changing pagination options.
  • An aggregated subscription cannot be a bookmark subscription. That is, replay from the transaction log does not support aggregated subscriptions.