13. Aggregating and Analyzing Data in AMPS¶
AMPS contains a high-performance aggregation engine, which can be used
to project one SOW topic onto another, similar to the CREATE VIEW
functionality found in most RDBMS software. The aggregation engine can
join input from multiple topics, of the same or different message types,
and can produce output in different message types.
View topics are part of the AMPS State of the World, which means that views support delta subscriptions and out of focus (OOF) tracking. A view can also be used as the underlying topic for another view.
In addition, for the limited cases where a view is not practical, AMPS allows an individual subscription to request aggregation and projection of a single SOW topic.
Notice that the features described in this chapter are designed for cases where an application needs to aggregate data across messages or to perform a calculation on an individual message that should not be preserved as a part of that message.
To modify a message as it is published to AMPS, use preprocessing or enrichment. To simply retrieve a subset of the fields in a message, use select lists.
Understanding Views¶
Views allow you to aggregate messages from one or more SOW topics in AMPS and present the aggregation as a new SOW topic. AMPS stores the contents of the view in a user-configured file, similar to a materialized view in RDBMS software.
Views are often used to simplify subscriber implementation and can reduce the network traffic to subscribers. For example, if some clients will only process orders where the total cost of the order exceeds a certain value, you can both simplify subscriber code and reduce network traffic by creating a view that contains a calculated field for the total cost. Rather than receiving all messages and calculating the cost, subscribers can filter on the calculated field. You can also combine information from multiple topics. For example, you could create a view that contains orders from high-priority customers that exceed a certain dollar amount.
AMPS sends messages to view topics the same way that AMPS sends messages to SOW topics: when a message arrives that updates the value of a message in the view, AMPS sends a message on the view topic. Likewise, you can query a view the same way that you query a SOW topic.
Defining a view is straightforward. You set the name of the view, the SOW topic or topics from which messages originate and describe how you want to aggregate, or project, the messages. AMPS creates a topic and projects the messages as requested.
All message types that you specify in a view must support view creation. The AMPS default message types all support views.
Because AMPS uses the SOW topics of the underlying messages to determine when to update the view, the underlying topics used in a view must have a SOW configured. In addition, the topics must be defined in the AMPS configuration file before the view is defined.
AMPS updates each view after a publish or delta publish to a message in an underlying topic. Updates are processed for each view in the order in which AMPS processed the updates to the underlying topic. AMPS processes these updates asynchronously, after each SOW update is persisted. For additional performance, AMPS provides the ability to conflate updates to views that process high velocity updates, as described in Inline Update Conflation.
Defining Views and Aggregations¶
Multiple topic aggregation creates a view using more than one topic as a data source. This allows you to enrich messages as they are processed by AMPS and to do aggregate calculations using information published to more than one topic. You can combine messages from multiple topics and use filtered subscriptions to determine which messages are of interest. For example, you can set up a topic that contains orders from high-priority customers.
You can join topics of different message types, and you can project messages of a different type than the underlying topic.
To create an aggregate using multiple topics, each topic needs to maintain a SOW. Since views maintain an underlying SOW, you can create views from views.
To define an aggregate, you decide:
- The topic, or topics, that contain the source for the aggregation
- If the aggregation uses more than one topic, how those topics relate to each other
- What messages to publish, or project, from the aggregation
- How to group messages for aggregation
- The message type of the aggregation
Message types provided with AMPS fully support views, with the following exceptions:
- binary message types cannot be an underlying topic for a view or the type of a view
- protobuf message types can be the underlying topic for a view, but cannot be the type of a view
- composite-global message types can be the underlying topic for a view, but cannot be the type of a view
If you are using a custom message type, check with the message type developer as to whether that message type supports aggregation.
Single Topic Aggregation: UnderlyingTopic¶
For aggregations based on a single topic, use the UnderlyingTopic
element to tell AMPS which topic to use. All messages from the
UnderlyingTopic will appear in the aggregation.
<UnderlyingTopic>MyOriginalTopic</UnderlyingTopic>
Multiple Topic Aggregation: Join¶
Join definitions tell AMPS how to relate underlying topics to each
other. You use a separate Join element for each relationship in the
view. Most often, the join definition describes a relationship between
topics:
[topic].[field]=[topic].[field]
The topics specified must be previously defined in the AMPS
configuration file. The square brackets [] are optional. If they are
omitted, AMPS uses the first / in the expression as the start of the
field definition. You can use any number of join expressions to define a
multiple topic aggregation.
A Join definition is an equality comparison between the values of
two fields. The Join definition is not evaluated as an AMPS expression,
so functions, operators (other than =), and so forth are not evaluated
in these definitions.
Within a Join definition, values are always compared as strings.
This means that values such as 12345, 12345.00, and 1.2345E+04 are
treated as different values by the Join expression, since these are
different strings, even though these strings contain the same numeric value.
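The effect of string comparison can be illustrated with a short sketch. This is plain Python that only models the rule stated above; it is not AMPS code, and the helper name `join_matches` is hypothetical.

```python
# Illustrative model only: a Join compares field values as strings,
# so numerically equal values can fail to match.
def join_matches(left_value: str, right_value: str) -> bool:
    """Hypothetical helper: plain string equality, as the Join rule describes."""
    return left_value == right_value


values = ["12345", "12345.00", "1.2345E+04"]

# All three spellings parse to the same number.
same_number = all(float(v) == float(values[0]) for v in values)

# Yet no two different spellings match when compared as strings.
pairs = [(values[i], values[j]) for i in range(3) for j in range(i + 1, 3)]
matching_pairs = [p for p in pairs if join_matches(*p)]
```

In this model, `same_number` is true while `matching_pairs` is empty, which is why publishers to joined topics generally need to agree on one spelling for key values.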
If your aggregation will join messages of different types, or produce messages of a different type than the underlying topics, you add message type specifiers to the join definition:
[messagetype].[topic].[field]=[messagetype].[topic].[field]
In this case, the square brackets [] around the messagetype are
mandatory. AMPS creates a projection in the aggregation that combines
the messages from each topic where the expression is true. In other
words, for the expression:
<Join>[Orders].[/CustomerID]=[Addresses].[/CustomerID]</Join>
AMPS projects every message where the same CustomerID appears in
both the Addresses topic and the Orders topic. If a CustomerID value
appears in only the Addresses topic, AMPS does not create a projection
for the message. If a CustomerID value appears in only the Orders topic,
AMPS projects the message with NULL values for the Addresses topic. In
database terms, this is equivalent to a LEFT OUTER JOIN.
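The LEFT OUTER JOIN behavior described above can be sketched in a few lines. This is an illustrative model in plain Python, not AMPS code: the sample records, the OrderID field, and the function name `left_outer_join` are assumptions made for the example, and None stands in for NULL.

```python
# Illustrative model of the join semantics: every Orders record is
# projected; Addresses records without a matching order are not.
def left_outer_join(orders, addresses, key="CustomerID"):
    """Join each order to its address by string-equal key; None models NULL."""
    by_key = {str(a[key]): a for a in addresses}
    projected = []
    for order in orders:
        address = by_key.get(str(order[key]))
        projected.append({
            "CustomerID": order[key],
            "OrderID": order["OrderID"],
            # None when there is no matching Addresses record
            "ShippingAddress": address["ShippingAddress"] if address else None,
        })
    return projected


orders = [
    {"OrderID": "o1", "CustomerID": "c1"},
    {"OrderID": "o2", "CustomerID": "c2"},  # no address on file
]
addresses = [
    {"CustomerID": "c1", "ShippingAddress": "1 Main St"},
    {"CustomerID": "c3", "ShippingAddress": "9 Elm St"},  # no orders
]

rows = left_outer_join(orders, addresses)
```

Here `rows` holds one record per order: the record for c2 carries None for the address, and nothing is projected for c3.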
You can use any number of Join definitions in an underlying topic:
<Join>[nvfix].[Orders].[/CustomerID]=[json].[Addresses].[/CustomerID]</Join>
<Join>[nvfix].[Orders].[/ItemID]=[nvfix].[Catalog].[/ItemID]</Join>
In this case, AMPS creates a projection that combines messages from the
Orders, Addresses, and Catalog topics for any published message where
matching messages are present in all three topics. Where there are no
matching messages in the Catalog and Addresses topics, AMPS projects
those values as NULL.
A Join element can also contain only one topic. In this
case, all messages from that topic are included in the view.
Setting the Message Type¶
The MessageType element of the definition sets the type of the
outgoing messages. The message type of the aggregation does not need to
be the same as the message type of the topics used to create the
aggregation. However, if the MessageType differs from the type of
the topics used to produce the aggregation, you must explicitly specify
the message type of the underlying topics.
For example, to produce JSON messages regardless of the types of the topics in the aggregation, you would use the following element:
<MessageType>json</MessageType>
Defining Projections¶
AMPS makes available all fields from matching messages in the join specification. You specify the fields that you want AMPS to project and how to project them.
To tell AMPS how to project a message, you specify each field to include in the projection. The specification provides a name for the projected field and one or more source fields to use for the projected field. The data can be projected as-is, or aggregated using one of the AMPS aggregation functions, as described in Chapter 4 Aggregate Functions.
You refer to source fields using the XPath-like expression for the field. You name projected fields by creating an XPath-like expression for the new field. AMPS uses this expression to name the new field.
<Projection>
<Field>[Orders].[/CustomerID]</Field>
<Field>[Addresses].[/ShippingAddress] AS /DestinationAddress</Field>
<Field>SUM([Orders].[/TotalPrice]) AS /AccountTotal</Field>
</Projection>
The sample above uses the CustomerID from the Orders topic and the
shipping address for that customer from the Addresses topic. The
sample calculates the sum of all of the orders for that customer as the
AccountTotal. The sample also renames the ShippingAddress field
as DestinationAddress in the projected message.
For more information on constructing fields in a view, see Chapter 4 Constructing Fields.
Data Types and Projections¶
When projecting views, AMPS converts the original values into the AMPS internal type system and serializes those values into a new message. This approach allows AMPS to efficiently aggregate messages of different types and produce predictable results. The data type of the serialization is determined by the message type of the projected message: the message types provided by 60East in this release project the AMPS internal type.
This means that, for message types that rely on type markers to identify
the type (such as bson), the type of the field in the projected
message may reflect the AMPS internal type rather than the original
type. This conversion is typically a widening conversion for numeric
types (for example, input typed as a 32-bit integer will typically be
widened to a 64-bit integer). For other types, the most common
conversion is from a specific data type (such as regular expression) to
a string type.
For details on the AMPS data types, see the section called “AMPS Data Types”.
Grouping¶
Use grouping statements to tell AMPS how to aggregate data across messages and generate projected messages.
For example, an Orders topic that contains messages for incoming
orders could be used to calculate aggregates for each customer, or
aggregates for each symbol ordered. The grouping statement tells AMPS
which way to group messages for aggregation.
<Grouping>
<Field>[Orders].[/CustomerID]</Field>
</Grouping>
The sample above groups and aggregates the projected messages by
CustomerID. Because this statement tells AMPS to group by
CustomerID, AMPS projects a message for each distinct CustomerID
value. A message to the Orders topic will create an outgoing message
with data aggregated over the CustomerID.
Each field in the projection should either be an aggregate or be
specified in the Grouping element. Otherwise, AMPS returns the last
processed value for the field.
Unlike ANSI SQL, AMPS allows you to include fields in the
projection that are not included in the Grouping or used
within the aggregate functions. In this case, AMPS uses the last
value processed for the value of these fields. AMPS enforces a
consistent order of updates to ensure that the value of the
field is consistent across recovery and restart.
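A minimal sketch of the grouping rules above, written as plain Python rather than AMPS configuration. It assumes each sample message is a distinct record in the underlying topic; the Item and LastItem fields and the function name `project_by_customer` are hypothetical and exist only to show a field that is neither grouped nor aggregated.

```python
# Illustrative model of Grouping: one projected record per distinct
# CustomerID, with SUM as the aggregate.
def project_by_customer(messages):
    view = {}
    for msg in messages:  # processed in arrival order
        row = view.setdefault(
            msg["CustomerID"],
            {"CustomerID": msg["CustomerID"], "AccountTotal": 0.0},
        )
        row["AccountTotal"] += msg["TotalPrice"]  # aggregate field
        # Neither grouped nor aggregated: keeps the last processed value.
        row["LastItem"] = msg["Item"]
    return view


messages = [
    {"CustomerID": "c1", "TotalPrice": 10.0, "Item": "pen"},
    {"CustomerID": "c2", "TotalPrice": 5.0, "Item": "ink"},
    {"CustomerID": "c1", "TotalPrice": 2.5, "Item": "pad"},
]
view = project_by_customer(messages)
```

The model yields two projected records. For c1, the aggregate covers both orders, while LastItem reflects only the most recently processed message.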
Inline Update Conflation¶
AMPS has the ability to conflate updates to a view. Conflation is particularly useful when a view receives a high velocity of updates and subscribers to the view have no need to track every update, but instead want to see the current state of the view as quickly as possible. For applications that have a high update rate and relatively complicated view processing, inline conflation can significantly reduce the total number of updates processed for the view and increase overall throughput.
Inline conflation changes how AMPS manages pending updates for a view. Without inline conflation enabled for a view, AMPS processes all messages for a view strictly in the order in which those messages were published. Even if there are multiple updates to the same record pending, AMPS processes each of those messages in turn and updates the view for each message.
When inline conflation is enabled and a message arrives with the same
Grouping value as a message waiting to be processed, AMPS replaces
the pending message with the new message, and only processes the new
message. Inline conflation does not cause AMPS to slow down the rate
at which AMPS processes updates for a view. AMPS continues to process
updates for the view as fast as possible, and makes no guarantees as to
the number of updates to a view produced by a given set of updates to an
underlying topic.
As a simplified illustration of inline conflation, consider a view where
the underlying SOW uses the id field of the message as the Key. With
conflation set to none (the default for a view), each message is added
to the end of the messages waiting to be processed, whether or not an
update for that group is already waiting. If two updates for the same
group arrive, both updates are processed. By contrast, when conflation
is set to inline, if there is an existing update waiting, the new update
replaces the existing update, and only the new update is processed.
Because inline conflation replaces messages while processing is pending, the following considerations apply to views that enable inline conflation:
- Not every update to an underlying topic will produce an individual update to the view: when multiple updates occur to the same record in a short period of time, AMPS may only process the last update.
- Updates to the view may be produced in an order different than the order in which the messages were published to the underlying topic, since AMPS replaces messages waiting to be processed.
- The final state of the view will be exactly as if each update were processed, since it will be based on the latest values in the underlying topic (or topics).
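The considerations above can be seen in a small model of the two conflation modes. This is a sketch in plain Python under stated assumptions: the pending-update structure is invented for illustration and is not the AMPS implementation, and the sample burst assumes that all four updates arrive before any of them is processed.

```python
# Illustrative model of conflation "none" versus "inline".
def process(updates, conflation="none"):
    """Return (number of updates processed, final view state)."""
    if conflation == "inline":
        pending = {}
        for group, value in updates:
            pending[group] = value  # replaces any update still waiting
        to_process = list(pending.items())
    else:
        to_process = list(updates)  # every update is queued and processed

    view = {}
    for group, value in to_process:
        view[group] = value
    return len(to_process), view


# Burst of updates, keyed by grouping value (hypothetical data).
updates = [("a", 1), ("b", 7), ("a", 2), ("a", 3)]

count_none, view_none = process(updates, "none")
count_inline, view_inline = process(updates, "inline")
```

In this model the inline mode processes two updates instead of four, and both modes end with the same view contents.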
To enable inline conflation, add the Conflation element to the
configuration for the View, as shown below:
<SOW>
<View>
...
<Conflation>inline</Conflation>
...
</View>
</SOW>
Filtering Single Topic Aggregations¶
When a view aggregates a single topic, you can use a Filter element
in the view definition to limit the messages included in the view to
only those messages that match the filter. For example, to aggregate
only messages from an underlying topic where the /status is
complete, you could define your view as follows:
<SOW>
...
<Topic>
<Name>orders</Name>
<MessageType>json</MessageType>
<Key>/orderId</Key>
<FileName>./sow/%n.sow</FileName>
</Topic>
<View>
<Name>CompleteByRegion</Name>
<UnderlyingTopic>orders</UnderlyingTopic>
<MessageType>json</MessageType>
<Projection>
<Field>COUNT(/orderId) AS /completedOrders</Field>
<Field>/region AS /region</Field>
</Projection>
<Grouping>
<Field>/region</Field>
</Grouping>
<Filter>/status = 'complete'</Filter>
</View>
...
</SOW>
The Filter element is not supported for multiple topic aggregation.
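To make the order of operations concrete, the CompleteByRegion view above can be modeled as a filter followed by a grouped count. This is an illustrative Python sketch with hypothetical sample records, not AMPS code.

```python
# Illustrative model of the CompleteByRegion view: filter first, then
# group by region and count.
from collections import Counter

orders = [
    {"orderId": 1, "region": "east", "status": "complete"},
    {"orderId": 2, "region": "east", "status": "pending"},
    {"orderId": 3, "region": "west", "status": "complete"},
    {"orderId": 4, "region": "east", "status": "complete"},
]

# Models <Filter>/status = 'complete'</Filter>: applied to the
# underlying messages before any aggregation happens.
complete = [o for o in orders if o["status"] == "complete"]

# Models COUNT(/orderId) AS /completedOrders grouped by /region.
counts = Counter(o["region"] for o in complete)
view = [
    {"region": region, "completedOrders": n}
    for region, n in sorted(counts.items())
]
```

The pending order never reaches the aggregation, so the east region counts two completed orders rather than three.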
Constructing Fields¶
The AMPS expression language is used to construct fields in aggregates, as described in Chapter 4 Constructing Fields.
Examples¶
Simple Aggregate View Example¶
For a potential usage scenario, imagine the topic ORDERS, which
includes the following NVFIX message schema:
NVFIX Tag | Description |
---|---|
OrderID | unique order identifier |
Tick | symbol |
ClientId | unique client identifier |
Shares | currently executed shares for the chain of orders |
Price | average price for the chain of orders |
Table 13.1: ORDERS Table Identifiers
This topic includes information on the current state of executed orders,
but may not include all the information we want updated in real-time.
For example, we may want to monitor the total value of all orders
executed by a client at any moment. If ORDERS were a SQL table within
an RDBMS, the “view” we would want to create would be similar to:
CREATE VIEW TOTAL_VALUE AS
SELECT ClientId, SUM(Shares * Price) AS TotalCost
FROM ORDERS
GROUP BY ClientId
As defined above, the TOTAL_VALUE view would only have two fields:
- ClientId: the client identifier
- TotalCost: the summation of current order values by client
Views in AMPS are specified in the AMPS configuration file in View
sections, which are defined in the SOW section. The example above
would be defined as:
<SOW>
<Topic>
<Name>ORDERS</Name>
<MessageType>nvfix</MessageType>
<Key>/OrderID</Key>
<FileName>./sow/%n.sow</FileName>
</Topic>
<View>
<Name>TOTAL_VALUE</Name>
<UnderlyingTopic>ORDERS</UnderlyingTopic>
<MessageType>nvfix</MessageType>
<Projection>
<Field>/ClientId</Field>
<Field>SUM(/Shares * /Price) AS /TotalCost</Field>
</Projection>
<Grouping>
<Field>/ClientId</Field>
</Grouping>
</View>
</SOW>
Views require an underlying SOW topic. See State of the World (SOW) for more information on creating and configuring SOW topics.
The Name element is the name of the new topic that is being
defined. This Name value is the topic that clients use to subscribe
for future updates or to perform SOW queries.
The UnderlyingTopic is the SOW topic or topics that the view
operates on. That is, the UnderlyingTopic is where the view gets its
data from. All XPath references within the Projection fields are
references to values within this underlying SOW topic (unless they
appear on the right-hand side of the AS keyword).
The Projection section is a list of one or more Field elements that
define what the view will contain. Each expression can be a raw XPath
value, as in /ClientId above, which is a straight copy of the value
found in the underlying topic into the view topic using the same target
XPath, or a calculated expression, as in
SUM(/Shares * /Price) AS /TotalCost above. If we had wanted to translate
the ClientId tag into a different tag, such as CID, then we could have
used the AS keyword to do the translation, as in /ClientId AS /CID.
Unlike ANSI SQL, AMPS allows you to include fields in the
projection that are not included in the Grouping or used
within the aggregate functions. In this case, AMPS uses the last
value processed for the value of these fields. AMPS enforces a
consistent order of updates to ensure that the value of the
field is consistent across recovery and restart.
An unexpected 0 (zero) in an aggregate field within a view
usually means that the value is either zero or NaN. AMPS
defaults to using 0 instead of NaN. However, any numeric
aggregate function will result in a NaN if the aggregation
includes a field that is not a number.
Finally, the Grouping section is a list of one or more Field elements
that define how the records in the underlying topic will be grouped to
form the records in the view. In this example, we grouped by the tag
holding the client identifier. However, we could just as easily have
grouped by the “Symbol” tag, /Tick.
In the example below, we group by /ClientId because we want to
count the number of orders for each client that have a value greater
than 1,000,000:
<SOW>
...
<View>
<Name>NUMBER_OF_ORDERS_OVER_ONEMILL</Name>
<UnderlyingTopic>ORDERS</UnderlyingTopic>
<Projection>
<Field>/ClientId</Field>
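<Field><![CDATA[SUM(IF(/Shares * /Price > 1000000, /Shares * /Price, NULL)) AS /AggregateValue]]></Field>
<Field>SUM(IF(/Shares * /Price &gt; 1000000, /Shares * /Price, NULL)) AS /AggregateValue_2</Field>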
</Projection>
<Grouping>
<Field>/ClientId</Field>
</Grouping>
<FileName>./views/numOfOrdersOverOneMil.view</FileName>
<MessageType>nvfix</MessageType>
</View>
...
</SOW>
Notice that the /AggregateValue and /AggregateValue_2 fields will
contain the same value; however, /AggregateValue was defined using an
XML CDATA block, and /AggregateValue_2 was defined using the XML
&gt; entity reference.
Since the AMPS configuration is XML, special characters in projection expressions must either be escaped with XML entity references or wrapped in a CDATA section.
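The equivalence of the two escaping styles can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in; it says nothing about how AMPS itself parses its configuration, and giving both fields the same name is an assumption made only so the two strings can be compared.

```python
# Illustrative check that a CDATA section and an entity reference
# deliver the same expression text to an XML parser.
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

expression = (
    "SUM(IF(/Shares * /Price > 1000000, /Shares * /Price, NULL))"
    " AS /AggregateValue"
)

cdata_form = "<Field><![CDATA[" + expression + "]]></Field>"
entity_form = "<Field>" + escape(expression) + "</Field>"  # > becomes &gt;

cdata_text = ET.fromstring(cdata_form).text
entity_text = ET.fromstring(entity_form).text

# A raw < inside element content is not well-formed XML.
try:
    ET.fromstring("<Field>/Price < 10</Field>")
    raw_less_than_ok = True
except ET.ParseError:
    raw_less_than_ok = False
```

Both forms parse to the original expression text, while the unescaped less-than comparison is rejected by the parser.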
Updates to underlying topics can potentially cause many more updates to downstream views, which can create stress on downstream clients subscribed to the view. If any underlying topic has frequent updates to the same records and/or a real-time view is not required, as in a GUI, then a replica of the topic may be a good solution to reduce the frequency of the updates and conserve bandwidth. For more on topic replicas, please see Chapter 11 Conflated Topics.
Multiple Topic Aggregate Example¶
This example demonstrates how to create an aggregate view that uses more
than one topic as a data source. For a potential usage scenario, imagine
that another publisher provides a COMPANIES topic, which includes the
following NVFIX message schema:
NVFIX Tag | Description |
---|---|
CompanyId | unique identifier for the company |
Tick | symbol |
Name | company name |
Table 13.2: COMPANIES Table Identifiers
This topic includes the name of the company, and an identifier used for internal record keeping in the trading system. Using this information, we want to provide a running total of orders for that company, including the company name.
If ORDERS and COMPANIES were SQL tables within an RDBMS, the
“view” we would want to create would be similar to:
CREATE VIEW TOTAL_COMPANY_VOLUME AS
SELECT COMPANIES.CompanyId, COMPANIES.Tick, COMPANIES.Name, SUM(ORDERS.Shares) AS TotalVolume
FROM COMPANIES LEFT OUTER JOIN ORDERS
ON COMPANIES.Tick = ORDERS.Tick
GROUP BY ORDERS.Tick
As defined above, the TOTAL_COMPANY_VOLUME table would have four
columns:
- CompanyId: the identifier for the company
- Tick: The ticker symbol for the company
- Name: The name of the company
- TotalVolume: The total number of shares involved in orders
To create this view, use the following definition in the AMPS configuration file:
<SOW>
<Topic>
<Name>ORDERS</Name>
<MessageType>nvfix</MessageType>
<Key>/OrderID</Key>
<FileName>./sow/%n.sow</FileName>
</Topic>
<Topic>
<Name>COMPANIES</Name>
<MessageType>nvfix</MessageType>
<Key>/CompanyId</Key>
<FileName>./sow/%n.sow</FileName>
</Topic>
<View>
<Name>TOTAL_COMPANY_VOLUME</Name>
<UnderlyingTopic>
<Join>[ORDERS]./Tick = [COMPANIES]./Tick</Join>
</UnderlyingTopic>
<FileName>./views/totalVolume.view</FileName>
<MessageType>nvfix</MessageType>
<Projection>
<Field>[COMPANIES]./CompanyId</Field>
<Field>[COMPANIES]./Tick</Field>
<Field>[COMPANIES]./Name</Field>
<Field>SUM([ORDERS]./Shares) AS /TotalVolume</Field>
</Projection>
<Grouping>
<Field>[ORDERS]./Tick</Field>
</Grouping>
</View>
</SOW>
As with the single topic example, first specify the underlying topics
and ensure that they maintain a SOW database. Next, the view defines the
underlying topic that is the source of the data. In this case, the
underlying topic is a join between two topics in the instance. The
definition next declares the file name where the view will be saved, and
the message type of the projected messages. The message types that you
join can be different types, and the projected messages can be a
different type than the underlying message types. The projection uses
three fields from the COMPANIES topic and one field that is aggregated
from messages in the ORDERS topic. The projection groups results by the
Tick symbols that appear in messages in the ORDERS topic.
View Projected Into Different Message Type¶
This example shows how to project an underlying topic of one message type into a topic of a different message type.
There is very little difference between this example and the single topic view in Simple Aggregate View Example. The main difference is that, because the destination view has a different message type than the underlying topic, every reference to a field from the underlying topic must be fully-qualified with the message type.
As before, imagine the topic ORDERS, which includes the following
NVFIX message schema:
NVFIX Tag | Description |
---|---|
OrderID | unique order identifier |
Tick | symbol |
ClientId | unique client identifier |
Shares | currently executed shares for the chain of orders |
Price | average price for the chain of orders |
Table 13.3: ORDERS Table Identifiers
As before, we want to project the summation of current order values by client. The TOTAL_VALUE view will have two fields:
- ClientId: the client identifier
- TotalCost: the summation of current order values by client
However, in this case, we want to project the summary into a JSON document. To do this we simply specify that the final view will be in JSON format, and fully qualify all references to the underlying topic in the view definition.
The example above would be defined as:
<SOW>
<Topic>
<Name>ORDERS</Name>
<MessageType>nvfix</MessageType>
<Key>/OrderID</Key>
<FileName>./sow/%n.sow</FileName>
</Topic>
<View>
<Name>TOTAL_VALUE</Name>
<UnderlyingTopic>[nvfix].[ORDERS]</UnderlyingTopic>
<MessageType>json</MessageType>
<Projection>
<Field>[nvfix].[ORDERS]./ClientId AS /ClientId</Field>
<Field>SUM([nvfix].[ORDERS]./Shares * [nvfix].[ORDERS]./Price) AS /TotalCost</Field>
</Projection>
<Grouping>
<Field>[nvfix].[ORDERS]./ClientId</Field>
</Grouping>
</View>
</SOW>
This example uses an underlying topic in NVFIX format, computes an
aggregation by ClientId, and then produces output in JSON format.
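The data flow of this view can be approximated outside AMPS as follows. This is a sketch in plain Python, not AMPS code. It assumes NVFIX bodies are key=value pairs separated by the SOH character (\x01); the sample orders and the function names `parse_nvfix` and `total_value_json` are hypothetical.

```python
# Illustrative model of the TOTAL_VALUE view: NVFIX in, JSON out.
import json

FIELD_SEP = "\x01"  # assumed NVFIX field separator for this sketch


def parse_nvfix(body):
    """Split an NVFIX-style body into a dict of tag -> string value."""
    return dict(pair.split("=", 1) for pair in body.split(FIELD_SEP) if pair)


def total_value_json(nvfix_bodies):
    """Model SUM(/Shares * /Price) grouped by /ClientId, serialized as JSON."""
    totals = {}
    for body in nvfix_bodies:
        fields = parse_nvfix(body)
        cost = float(fields["Shares"]) * float(fields["Price"])
        totals[fields["ClientId"]] = totals.get(fields["ClientId"], 0.0) + cost
    return [
        json.dumps({"ClientId": client, "TotalCost": total})
        for client, total in sorted(totals.items())
    ]


orders = [
    FIELD_SEP.join(["OrderID=1", "Tick=IBM", "ClientId=c1", "Shares=100", "Price=2.5"]),
    FIELD_SEP.join(["OrderID=2", "Tick=MSFT", "ClientId=c1", "Shares=10", "Price=4.0"]),
    FIELD_SEP.join(["OrderID=3", "Tick=IBM", "ClientId=c2", "Shares=1", "Price=8.0"]),
]
docs = total_value_json(orders)
```

Each element of `docs` is a JSON document with the two projected fields, one per distinct ClientId.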
Aggregated Subscriptions¶
In addition to precomputed views and aggregates, AMPS provides the ability for the server to compute an aggregation for an individual subscription. When an application requests an aggregated subscription, rather than providing messages for the subscription verbatim, the AMPS server will calculate the requested aggregates and produce a message that contains the aggregated data.
Most of the time, AMPS applications use views to provide aggregation, as described in section Understanding Views. AMPS views are shared across subscriptions, and are calculated once, when a message updates a view, regardless of the number of subscribers that subscribe to the view. AMPS provides aggregated subscriptions as a way to do ad hoc aggregation in cases where a specific aggregate is only needed for a short period of time, will only be used by a single subscriber, or must be provided before the server can be restarted with a defined view. If the aggregation is frequently used, or if multiple subscribers will use the aggregation, consider using a view rather than an aggregated subscription.
To request an aggregated subscription, the subscriber provides a definition of the fields to project and the grouping to apply with each subscription. AMPS performs the aggregation and constructs the specified message before delivering the message.
For example, imagine a topic in the SOW that uses the /id field to
create the SOW key. The topic contains the following messages:
{ "id":1, "tickerId" : "IBM", "price" : 150.34 }
{ "id":2, "tickerId" : "IBM", "price" : 149.76 }
{ "id":3, "tickerId" : "IBM", "price" : 149.32 }
{ "id":4, "tickerId" : "IBM", "price" : 151.10 }
A subscriber enters a SOW query with the following options:
projection=[MAX(/price) AS /max,/tickerId as /ticker],grouping=[/tickerId]
AMPS aggregates the messages in the SOW, and delivers the following projected record:
{ "ticker" : "IBM", "max" : 151.10 }
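The aggregation in this example can be reproduced locally to see how the projected record is derived. This is an illustrative Python model of the projection and grouping options above, not AMPS client code; the function name `aggregate_max_price` is hypothetical.

```python
# Illustrative model of
#   projection=[MAX(/price) AS /max,/tickerId as /ticker],grouping=[/tickerId]
sow_records = [
    {"id": 1, "tickerId": "IBM", "price": 150.34},
    {"id": 2, "tickerId": "IBM", "price": 149.76},
    {"id": 3, "tickerId": "IBM", "price": 149.32},
    {"id": 4, "tickerId": "IBM", "price": 151.10},
]


def aggregate_max_price(records):
    """Project one record per tickerId, carrying the maximum price."""
    groups = {}
    for rec in records:
        row = groups.setdefault(
            rec["tickerId"], {"ticker": rec["tickerId"], "max": rec["price"]}
        )
        row["max"] = max(row["max"], rec["price"])
    return list(groups.values())


projected = aggregate_max_price(sow_records)
```

Because all four records share one tickerId, the model produces a single projected record, matching the result shown above.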
Aggregated subscriptions are supported for commands that use the
State of the World: sow, sow_and_subscribe, and sow_and_delta_subscribe.
However, there are limitations on some variants of the commands, as
described in the following sections.
The memory consumed to maintain an aggregated subscription is counted as part of the total
memory for the client that submitted the subscription when considering the
MessageMemoryLimit for that client.
When to Use Aggregated Subscriptions¶
Aggregated subscriptions require AMPS to compute the aggregate for each subscription individually, at the time that messages are processed for the subscription. In addition, for aggregated subscriptions, the current state of the aggregation is retained for each subscription.
In cases where more than one subscriber is using the same aggregation, a
View is more efficient: each record in the view is only computed
once, saving CPU cycles, and ongoing updates for the record are only
stored once, requiring less memory. Likewise, if the aggregation uses
more than one topic or aggregates messages of a different type than
the final result, you must configure a View on the server.
An aggregated subscription is most appropriate if one or more of the following is true:
- A subscription has unique and unpredictable aggregation needs. For example, if no other subscription is computing a given aggregation, and it is not possible to predict in advance the aggregates to compute, then per-subscription aggregation is a good solution.
- The application is under development and iterating quickly. It can be convenient to use aggregated subscriptions while developing aggregate definitions that will be eventually provided as view topics.
- The aggregation is expensive and seldom needed. For example, if an aggregation is memory-intensive and only needed once a week at a time when the instance is otherwise lightly-used, the overall memory usage of the AMPS instance may be reduced during the rest of the week by using an aggregated subscription.
The considerations above are general guidance to help you consider options between per-subscription aggregation and a persistent view. In general, if it is possible to use an AMPS view for a given aggregation task and that view will be frequently used, a view is often the best option. If a view cannot be used (because the aggregation is not known in advance) or the view would seldom be used, an aggregated subscription may be a better option.
Requesting an Aggregated Subscription¶
To request an aggregated subscription, set the following options on the subscription:
Option | Description |
---|---|
projection=[field specifications] | Specifies a comma-delimited set of fields to project, within brackets. Each entry has the format described in Constructing Fields. This option must contain an entry for every field in the aggregated message. If there is no entry for a field in this option, that field will not appear in the aggregated message, even if the field is in the underlying message. For example, to project the total value of orders for a specific item: projection=[SUM(/price * /quantity) AS /total, /description]. When a field appears in the projection option, but is not part of a grouping clause or used in an aggregation function, the message will have the value of that field in the last message processed by AMPS. There is no default for this option. When this option is provided, a |
grouping=[keys] | For an aggregated subscription, the format of this option is a comma-delimited list of XPath identifiers within brackets. For example, to aggregate entries based on their /description: grouping=[/description]. There is no default for this option. When this option is provided, a |
Table 13.4: Aggregated Subscription options
For example, to request a count, by customer, of the order records stored in a topic in the SOW, you could use the following options:
projection=[COUNT(/orderId) AS /orderCount, /customer AS /customer],grouping=[/customer]
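When an application assembles these options programmatically, a small formatting helper can keep the bracket and comma syntax consistent. The sketch below is plain Python string handling; the function `aggregation_options` is hypothetical and is not part of any AMPS client library. The resulting string would be supplied as the options of a sow, sow_and_subscribe, or sow_and_delta_subscribe command.

```python
# Hypothetical helper: build the options string for an aggregated
# subscription from lists of projection entries and grouping keys.
def aggregation_options(projection, grouping):
    """Format projection=[...] and grouping=[...] as one options string."""
    return "projection=[{}],grouping=[{}]".format(
        ",".join(projection), ",".join(grouping)
    )


options = aggregation_options(
    ["COUNT(/orderId) AS /orderCount", "/customer AS /customer"],
    ["/customer"],
)
```

The helper performs no validation of the expressions themselves; it only joins the entries in the format shown in the table above.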
Considerations for Aggregated Subscriptions¶
When planning to use an aggregated subscription, the following considerations apply:
- The topic for the subscription must be a topic in the State of the World. This includes views and the SOW view of a queue.
- The topic for the subscription must not be a regular expression.
- When subscribing to a queue, an aggregated subscription does not remove messages from the queue. Like a view definition with the queue as an underlying topic, an aggregated subscription browses the queue without taking messages from the queue.
- Filters for the subscription apply to the original messages, not the results of the projection. A filter for an aggregated subscription is equivalent to the Filter element in a View definition rather than a filter for a subscription that uses the view.
- A subscription that uses per-subscription aggregation does not support the replace option except for changing pagination options.
- An aggregated subscription cannot be a bookmark subscription. That is, replay from the transaction log does not support aggregated subscriptions.