16. Message Types¶
Message communication between the publisher and subscriber in AMPS is managed through the use of message types. Message types define the data contained within an AMPS message. Each topic has a specific message type. Transports used for publishers and subscribers can also define specific message types. For a given transport, AMPS only processes messages of the type or types that the transport accepts.
When AMPS needs to use the data within a message, AMPS uses the message type to parse the message into an internal representation. AMPS uses the same internal representation for all message types. Likewise, if AMPS needs to create a new message from a set of values (for example, for a view), AMPS uses the message type to serialize that set of values into the correct format. AMPS filters, commands, processing flow, and so forth are the same for every message type. Message types do not change how AMPS processes messages. A message type simply allows AMPS to work with data of a particular format.
In some cases, a given message type cannot support all of the capabilities in AMPS. For example, the unparsed binary message type allows arbitrary payloads. This can be extremely useful, but because there is no set format for that message type, none of the capabilities that rely on parsing data are supported by the binary message type.
Where a message type cannot provide a specific capability to AMPS, those limitations are described below.
Except where limitations are described in this section, all message types provided with the AMPS server support all AMPS features. The AMPS engine itself is message-type agnostic. Configuring a SOW that uses a composite type is no different from configuring a SOW that uses JSON, BFlat, or Google protocol buffers.
Message types in AMPS are implemented as plug-in modules. For more information on plug-in modules, contact 60East support for access to the AMPS Server SDK.
Default Message Types¶
AMPS automatically loads modules for the following message types:
Message Type | Description |
---|---|
bson | Binary JSON (BSON) messages. See http://www.bsonspec.org for information on this format. |
bflat | BFlat, a schemaless message format based on key-value pairs that includes support for binary representations of numeric data. See http://bflat.io for information on this format. |
fix | FIX messages using numeric tags. FIX is a standard format widely used in the financial industry. See http://www.fixtradingcommunity.org/pg/main/what-is-fix for more information on this format. |
json | JSON (JavaScript Object Notation) messages. See http://www.json.org for information on this format. |
msgpack | MessagePack messages. MessagePack is a schemaless serialization format designed to efficiently encode data. See http://msgpack.org/index.html for more information on MessagePack. |
nvfix | NVFIX (name/value FIX) messages. NVFIX uses the same basic format as FIX, but allows tags to contain any byte that is not = or the configured field separator character (by default, the ASCII SOH character). By contrast, FIX requires that tags are numeric. |
xml | XML messages (of any schema). |
binary | Uninterpreted binary payload. Because this module does not attempt to parse the payload, it does not support content filtering, views, or aggregates. Likewise, because there is no set format for the payload, this message type cannot support features that construct messages (such as delta messaging, /AMPS/.* topic subscriptions, and stats acks). |
protobuf | Google protocol buffer messages. To use this message type, you must configure a MessageType with the format of the messages (the .proto files). |
Table 16.1: AMPS Default Message Types
With these message types, AMPS automatically loads the module that provides the message type. AMPS declares message types for all of the above message types except for protobuf.
For efficiency, AMPS only parses the content of a message if required, and only to the extent required. For example, if AMPS only needs to find the id tag in an NVFIX message, AMPS will not fully parse the message, but will stop parsing the message after finding the id tag.
This provides significant performance improvements, and also means that AMPS does not verify the format or validity of messages unless it needs to parse the messages. When AMPS parses a message, it may only partially parse the message, and may not detect corruption or invalid format in a message if that corruption occurs after the point at which AMPS has all of the required information from the message.
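The effect of partial parsing can be sketched in a few lines of Python. This is an illustrative model only, not the AMPS parser: the function name `find_field` and the sample messages are hypothetical, and the sketch assumes the default SOH field separator.

```python
SOH = "\x01"  # default NVFIX field separator (ASCII SOH)


def find_field(message, wanted, separator=SOH):
    """Return the value of the first `wanted` tag, scanning left to right.

    Scanning stops as soon as the tag is found, so nothing after that
    point in the message is ever examined or validated.
    """
    position = 0
    while position < len(message):
        end = message.find(separator, position)
        if end == -1:
            end = len(message)
        field = message[position:end]
        tag, equals, value = field.partition("=")
        if not equals:
            # Only fields that are actually scanned can be rejected.
            raise ValueError(f"malformed field: {field!r}")
        if tag == wanted:
            return value
        position = end + 1
    return None


good = SOH.join(["id=17", "symbol=IBM", "qty=100"])
corrupt = SOH.join(["id=17", "this field has no equals sign"])

print(find_field(good, "id"))
print(find_field(corrupt, "id"))  # succeeds: the bad field is never reached
```

In this model, asking the corrupt message for `qty` raises an error, because the scan has to pass through the malformed field. Asking for `id` does not, which mirrors the behavior described above.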
The FIX and NVFIX message types support configuration of the field and message delimiters.
AMPS also allows you to create new message types by assembling existing message types into a composite message. Composite message types are described in Composite Messages, and require additional configuration:
Message Type Name | Description |
---|---|
composite-global | Composite message type that combines message parts for content filtering. This message type combines one or more existing message types into a message. This type is described in more detail in Composite Messages. |
composite-local | Composite message type, filterable by individual parts. This message type combines one or more existing message types into a message. This type is described in more detail in Composite Messages. |
Table 16.2: AMPS composite message types
BFlat Messages¶
The BFlat message format combines the simplicity and efficiency of simple, schema-less data formats such as FIX and NVFIX with the ability to manage binary data and preserve the full precision of numeric values. BFlat is especially useful for applications that deal with binary data or precise numeric values while demanding high levels of throughput.
A BFlat message is composed of any number of tag/value pairs, similar to FIX and NVFIX messages. Tags and values can contain any value, and can be of any length: unlike formats such as FIX, there are no reserved characters. In practical terms, the name of a tag must be a valid XPath identifier to filter the message in AMPS. However, this is a limitation of XPath, and not of the BFlat message format.
The BFlat message type supports all AMPS features, and there are no special considerations when using the BFlat message type.
Open-source libraries for producing and parsing BFlat messages are available from the BFlat project site.
BFlat Data Types¶
BFlat messages are strongly typed. BFlat supports a string type for string data, and a binary type for arbitrary binary data. For numeric values, BFlat can preserve the precise value of the following numeric types:
Type | Description |
---|---|
int8 | 8-bit integer |
int16 | 16-bit integer |
int32 | 32-bit integer |
int64 | 64-bit integer |
double | 64-bit IEEE 754 floating point number |
datetime | UTC datetime containing milliseconds since Unix epoch (64-bit representation) |
leb128 | Signed LEB128 integer (variable length) |
Table 16.3: BFlat Numeric Types
BFlat also supports arrays of values.
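For readers unfamiliar with the leb128 type listed above, the sketch below shows the general signed LEB128 encoding in Python. It illustrates the variable-length integer encoding itself, not the BFlat wire format: the function names are hypothetical, and how BFlat frames the value within a message is not shown.

```python
def encode_sleb128(value):
    """Encode a signed integer as signed LEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F   # low seven bits of the remaining value
        value >>= 7           # arithmetic shift keeps the sign
        sign_bit_set = bool(byte & 0x40)
        # Finished when what remains is pure sign extension that agrees
        # with the sign bit of the byte about to be written.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # continuation bit: more bytes follow


def decode_sleb128(data):
    """Decode signed LEB128 bytes back to an integer."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:          # last byte of the value
            if byte & 0x40:          # sign bit set: value is negative
                result -= 1 << shift
            return result
    raise ValueError("truncated LEB128 value")


print(encode_sleb128(-123456).hex())
```

Small values occupy a single byte and larger values grow seven bits at a time, which is why the type has no fixed width.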
MessagePack Messages¶
AMPS fully supports MessagePack messages, with the following implementation decisions to represent MessagePack messages in the AMPS type system. See AMPS Data Types for information on the AMPS data types. Notice, in particular, that the AMPS expression language supports automatic type conversion, so while this table shows the default AMPS representation for a given MessagePack type, AMPS will convert a value as needed once the message has been parsed.
MessagePack type | AMPS representation |
---|---|
nil | NULL |
bool | Boolean |
int (all widths) | Integer |
float (all widths) | Float |
str (all widths) | String |
bin (all widths) | String |
array (all widths) | array of AMPS values |
map (all widths) | nested AMPS values |
ext (all widths) | String |
Table 16.4: MessagePack Types and AMPS Types
Notice that AMPS does not attempt to interpret extension types, and instead represents them as arrays of bytes (a String in the AMPS type system).
Composite Messages¶
Sometimes, applications only need to filter on a small subset of the fields in a message. Sometimes applications need to send and receive messages that cannot be meaningfully parsed by AMPS, such as images or audio files. For these cases, AMPS provides a composite message type that lets you create a new message type by combining existing message types.
For example, you might create a message type that includes three parts: the metadata for an image as a json document, a small JPG thumbnail as a binary message part, and a full size PNG image as another binary message part.
Composite messages can also be useful when the message itself is large or resource-intensive to parse. In this case, you can create a message type that includes the information needed to filter messages in a JSON or NVFIX part, and include the full message in the unparsed payload of the composite message, as described below.
AMPS provides two different types of composite messages. Messages created using the composite-local module preserve information about the individual parts for filtering, aggregation, and projection. Messages created using the composite-global module treat the individual parts as elements of a single document.
Configuring Composite Message Types¶
To use a composite message type, you must first configure the type by declaring it in the MessageTypes section of the AMPS configuration file. The declaration contains the name of the new composite message type, specifies that the new type is composite, and lists the parts of the composite message type.
For example, the MessageType element below declares a new composite message type named images. The new type contains a json document at the beginning of the message, followed by two uninterpreted binary message parts. AMPS will combine the XPath identifiers for all message parts into a single set of identifiers. Notice that, because only one part of the message type is parsable, using composite-global simplifies the identifiers for the message.
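<MessageTypes>
  ...
  <MessageType>
    <!-- Name used to refer to the composite type elsewhere in the configuration -->
    <Name>images</Name>
    <Module>composite-global</Module>
    <!-- Parts of the composite message, in order -->
    <MessageType>json</MessageType>
    <MessageType>binary</MessageType>
    <MessageType>binary</MessageType>
  </MessageType>
  ...
</MessageTypes>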
The MessageType entries for the composite message can be any AMPS message type, including both the built-in types and any previously defined message type.
Once the new composite message type is created, you can use the new type in the configuration file.
Composite message types have the following restrictions:
- Delta subscribe and delta publish are not supported for message types that use composite-global.
- Views, joins, and aggregation cannot project message types that use composite-global. (However, composite message types that use composite-global can be an UnderlyingTopic or one of the topics in a Join.)
- Composite message types do not support features that automatically construct messages, such as subscriptions to the /AMPS/.* topics and stats acks, regardless of the module the type uses.
Unparsed Payload Section¶
All composite message types, regardless of how they are defined, provide an unparsed payload section. The unparsed payload section does not need to be declared in the MessageType declaration. As the name suggests, AMPS does not parse or interpret this section, so the unparsed payload can contain any content of any type. The AMPS clients provide access to set the unparsed payload on outgoing messages, and to retrieve the unparsed payload from incoming messages.
The unparsed payload is included to simplify the common technique where a message type contains a header that is used for filtering followed by an unparsed binary. If your composite message type contains a single binary part, consider using the unparsed payload section in your application rather than declaring a binary message part.
Content Filtering with Composite Message Types¶
Composite message types support filtering on the contents of the composite message. There are some simple conventions to remember when constructing expressions to filter on. For more details about content filtering, see Filtering Subscriptions by Content.
These conventions are consistent anywhere that AMPS needs to find a value within the composite message type. That includes content filters for client subscriptions, identifying SOW keys, creating views and aggregates, creating conflated topics, and so on.
composite-global¶
When using the composite-global message type, AMPS combines all parts of the message into a unified set of XPath identifiers. AMPS creates the set of identifiers for each part of the message. If different parts of the message contain the same identifier, AMPS treats that identifier as though the identifier contained an array of values: AMPS creates an array that contains all of the values in the different parts of the message. Message types that do not support content filtering do not provide XPath identifiers.
For example, consider the message below for a composite-global message type that includes two json parts and a binary part:
{"id":1,"data":"sample","message":"part one message"}
{"message":"another part","customer":"Awesome Amalgamated, Ltd."}
0xDEEA0934DF23A37780934...
AMPS constructs the following set of XPath identifiers and values:
Identifier | Value |
---|---|
/id | 1 |
/data | "sample" |
/message | ["part one message", "another part"] |
/customer | "Awesome Amalgamated, Ltd." |
Table 16.5: Composite-global message identifiers
In short, when using composite-global, AMPS combines the parsable parts of the message into a single global set of XPath values, and ignores any part of the message that cannot be parsed.
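The merging rule for composite-global can be modeled with a short Python sketch. This is a simplified illustration rather than the AMPS implementation: the function `global_identifiers` and the `(message_type, payload)` tuples are hypothetical, only json parts are treated as parsable, and only top-level fields are considered.

```python
import json


def global_identifiers(parts):
    """Merge the fields of every parsable part into one identifier set.

    `parts` is a list of (message_type, payload) tuples. An identifier
    that appears in more than one part collects its values in an array.
    """
    collected = {}
    for message_type, payload in parts:
        if message_type != "json":
            continue  # unparsable parts contribute no identifiers
        for key, value in json.loads(payload).items():
            collected.setdefault("/" + key, []).append(value)
    # A single occurrence stays a plain value; repeats become an array.
    return {
        identifier: values[0] if len(values) == 1 else values
        for identifier, values in collected.items()
    }


parts = [
    ("json", '{"id":1,"data":"sample","message":"part one message"}'),
    ("json", '{"message":"another part","customer":"Awesome Amalgamated, Ltd."}'),
    ("binary", b"\xde\xea\x09\x34"),
]

for identifier, value in global_identifiers(parts).items():
    print(identifier, "=", value)
```

Running the sketch on the sample message reproduces the identifier set in the table: `/message` holds both values, and the binary part contributes nothing.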
composite-local¶
When using the composite-local message type, AMPS creates a distinct set of XPath identifiers for each part of the message. AMPS adds an XPath step with the position of the message part at the beginning of the identifier. Message types that do not support content filtering do not provide XPath identifiers, and AMPS skips over them.
For example, consider the message below for a composite-local message type that includes two json parts and a binary part:
{"id":1,"data":"sample","message":"part one message"}
{"message":"another part","customer":"Awesome Amalgamated, Ltd."}
0xDEEA0934DF23A37780934...
AMPS constructs the following set of XPath identifiers and values:
Identifier | Value |
---|---|
/0/id | 1 |
/0/data | "sample" |
/0/message | "part one message" |
/1/message | "another part" |
/1/customer | "Awesome Amalgamated, Ltd." |
Table 16.6: Composite-local message identifiers
In short, when using composite-local, AMPS creates XPath identifiers for each part of the message, using the position of the message part within the composite as the first part of the identifier. AMPS skips over any part of the message that cannot be parsed, and simply produces no values for that part of the message.
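The composite-local naming rule can be modeled the same way. The Python sketch below is illustrative only: `local_identifiers` and the `(message_type, payload)` tuples are hypothetical, only json parts are treated as parsable, and only top-level fields are considered. It also assumes that an unparsable part still occupies a position in the composite, which the example here does not exercise because the binary part comes last.

```python
import json


def local_identifiers(parts):
    """Build per-part identifiers, prefixed with the part's position.

    `parts` is a list of (message_type, payload) tuples. Parts that
    cannot be parsed are skipped and produce no identifiers.
    """
    identifiers = {}
    for position, (message_type, payload) in enumerate(parts):
        if message_type != "json":
            continue
        for key, value in json.loads(payload).items():
            identifiers[f"/{position}/{key}"] = value
    return identifiers


parts = [
    ("json", '{"id":1,"data":"sample","message":"part one message"}'),
    ("json", '{"message":"another part","customer":"Awesome Amalgamated, Ltd."}'),
    ("binary", b"\xde\xea\x09\x34"),
]

for identifier, value in local_identifiers(parts).items():
    print(identifier, "=", value)
```

Because each part keeps its own prefix, the two `message` fields stay separate values instead of being merged into an array.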
Choosing A Composite Type¶
To choose which composite type best fits your application, consider the following factors:
- If you need to use delta messaging with this message type, use composite-local.
- If there may be redundant field names in the parts of the message, and it is important to be able to filter based on which part contains the field, use composite-local.
- If you need to be able to create views of this type, use composite-local.
Otherwise, composite-global may be easier and more straightforward for client filtering, since clients do not need to know the detailed structure of the message type to be able to filter on the message.
Protobuf Message Types¶
Protocol buffers, or protobufs for short, is an efficient, automated mechanism for serializing structured data. AMPS supports Google protobuf messages (version 2 and version 3) as a message format.
Because Google protocol buffers use a fixed format for messages, to use protobuf you must configure AMPS with the definition of the messages AMPS will process. You do this by defining a MessageType. AMPS cannot parse protobuf messages without that definition.
60East recommends that the .proto files used with AMPS explicitly declare the protocol buffer syntax version used. If there is no explicit declaration, AMPS assumes the file uses protocol buffer 2 syntax.
The AMPS engine is message-type agnostic. Except for the limitations described in this section, there is no difference to the AMPS engine between message types that use protocol buffers and other message types such as JSON or XML or FIX.
Configuring Protobuf Message Types¶
To use a protobuf message, you must first edit the configuration file to include a new MessageType. Then, specify the path to the protobuf file and the name of the protobuf file itself inside the MessageType. Below is a sample configuration of a protobuf message type:
...
<MessageType>
  <Name>my-protobuf-messages</Name>
  <Module>protobuf</Module>
  <!-- alias;full-path: "proto-archive" stands for /mnt/shared/protofiles -->
  <ProtoPath>proto-archive;/mnt/shared/protofiles</ProtoPath>
  <ProtoFile>proto-archive/person.proto</ProtoFile>
  <Type>MyNamespace.Message</Type>
</MessageType>
...
Each message type references a ProtoFile, and specifies a single top-level type from the file. The ProtoFile may include other files through the standard protocol buffer import mechanism. Likewise, the top-level type may be any valid protocol buffer definition, including definitions that contain other types.
Once the protocol buffer MessageType is created as described above, you must either create a Transport that specifies that message type exactly, or create a Transport that can accept any known message type and ensure that the client specifies the new message type (in the example case, my-protobuf-messages) in the connect string.
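As an illustration of naming the message type in the connect string, the following Python sketch assembles one. It assumes the URI form tcp://host:port/protocol/message-type for AMPS client connection strings. The host name and port are placeholders, and `connect_string` is a hypothetical helper rather than part of any AMPS client API, so check the client documentation for your version for the exact format.

```python
def connect_string(host, port, message_type, protocol="amps"):
    """Build a connection URI that names the message type explicitly.

    Assumed layout: tcp://host:port/protocol/message-type
    """
    return f"tcp://{host}:{port}/{protocol}/{message_type}"


# Placeholder host and port; the message type is the Name declared
# in the sample MessageType configuration.
uri = connect_string("amps-host.example.com", 9007, "my-protobuf-messages")
print(uri)  # tcp://amps-host.example.com:9007/amps/my-protobuf-messages
```

The message type in the URI must match the Name given in the MessageType declaration, not the name of the module.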
When creating a protobuf message type, you must provide the following parameters:
Parameter | Description |
---|---|
Name | The name of the new, customized message type. The rest of the configuration file will use this name to refer to the message type. |
Module | The module that contains the message type. Use protobuf for protocol buffer messages. |
ProtoPath | The path in which to search for .proto files, in the form alias;full-path. The alias provides a short identifier to use when searching for .proto files. The full path is the path that is substituted for that identifier. For example, in the sample above, the alias proto-archive stands for /mnt/shared/protofiles. A configuration may omit the alias, and simply provide the path. For example: ;/mnt/repository/protodefs. You may specify any number of ProtoPath declarations. |
ProtoFile | The name of the .proto file to use for this message type. To use an alias, prefix the name of the file with the alias, as shown in the example above. |
Type | The name of the type inside the .proto file to use for this message type. AMPS requires a single type. |
Table 16.7: protobuf Message Type Parameters
Filtering with Protobuf Messages¶
To filter protobuf messages, there are a couple of conventions you must remember. AMPS XPath identifiers begin at the outermost message, so you can simply use member names for that message. If you have nested messages, you use the name of the nested message and the member name when creating an XPath identifier.
For example, suppose you have the following definition in a .proto file:
message person {
  required string name = 1;
  required int32 personID = 2;
}
To access the personID data member, you simply use the name of the data member as the XPath identifier. An example filter that verifies that a personID is greater than 1000 would be:
/personID > 1000
If you have nested messages, you simply provide the path to the nested message you want to access.
Let’s assume that the person message from the above example was nested inside another message with the name of record. The example filter below shows how to access the nested person message, and then filter to the personID:
/person/personID > 1000
In this case, the first part of the identifier (/person) specifies the submessage. The second part of the identifier (/personID) specifies the field within that submessage. Notice that, as always, there is no need to specify the name of the message for the outermost message.
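The way an identifier walks from the outermost message into a nested message can be sketched in Python, with plain dictionaries standing in for decoded protobuf messages. This is a simplified model and not the AMPS expression engine: `resolve`, `person_id_over_1000`, and the sample values are hypothetical.

```python
def resolve(message, identifier):
    """Follow each step of an XPath-style identifier into nested fields.

    Returns None when any step of the path is absent.
    """
    current = message
    for step in identifier.strip("/").split("/"):
        if not isinstance(current, dict) or step not in current:
            return None
        current = current[step]
    return current


def person_id_over_1000(message, identifier):
    """Model of a filter such as `/person/personID > 1000`."""
    value = resolve(message, identifier)
    return value is not None and value > 1000


# A `person` message on its own: the outermost message is not named.
person = {"name": "Grace", "personID": 7}
# A `record` message with a nested `person` submessage.
record = {"person": {"name": "Ada", "personID": 1042}}

print(person_id_over_1000(person, "/personID"))
print(person_id_over_1000(record, "/person/personID"))
```

Notice that `/personID` finds nothing in the `record` message in this model: the field lives one level down, so the identifier has to name the submessage first.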
Working With Multiple Protocol Buffer Types¶
Some applications require messages of different types: for example, an inventory management system may work with customer records, inventory records, and shipping order records.
When using protocol buffers, each of these messages would use a different .proto file, and therefore would be a different message type. Unlike a self-describing format such as JSON or XML, the serialized form of a protocol buffer message type does not automatically contain any information about the type of message or the fields that the message contains. Therefore, each protocol buffer message type is best considered as a completely distinct type. For example, the parser created for an order record and the parser created for a customer record are different. Unlike self-describing formats, it is not possible to use a single parser for these types, or for a parser to correctly handle a previously-unknown message structure.
There are two approaches to working with multiple protocol buffer types in an AMPS application:
- Keep the message types distinct. Each message type requires a separate connection to AMPS. The advantage of this approach is that the .proto files can be maintained and updated separately. Each connection has a distinct type, and only needs to handle messages of that type. The disadvantage of this approach is that the application must make a connection to AMPS for each type of message received.
- Create a “container” type that can optionally contain any of the needed message types. The advantage of this approach is that this requires only a single connection to AMPS. Since there is a single “container” type, a topic can hold this “container” type and have heterogeneous actual contents. The disadvantage to this approach is that it requires a consumer to understand the “container” type, and changes to the contained types may need to be carefully managed across the consumers that use the container. A “container” type is typically a oneof of the contained types.
For example, you might define a container as follows:
message Container {
  // A oneof must be named; "contents" is an illustrative name.
  oneof contents {
    Order order_type = 1;
    Payment payment_type = 2;
  }
}
message Order {
  required string customer_id = 1;
  ...
}
message Payment {
  required string customer_id = 1;
  ...
}
In this case, the container type will include either an Order or a Payment.
Union Types¶
When using a protocol buffer message type that contains a union, you can navigate the union using the names defined in the top-level element. For example, given the union defined below:
message MyUnion {
  optional Order order_type = 1;
  optional Payment payment_type = 2;
}
message Order {
  required string customer_id = 1;
  ...
}
message Payment {
  required string customer_id = 1;
  ...
}
Providing a filter of /order_type IS NOT NULL will return all of the MyUnion messages that contain an Order, while providing a filter of /payment_type/customer_id = '42' will return only the MyUnion messages that contain a Payment message with a customer_id of 42.
Limitations of the protobuf message type¶
Because the protobuf message type requires a specific, fixed definition for messages, AMPS does not support operations that construct messages that may contain arbitrary values. In particular, protobuf does not support:
- Creating a View with a protobuf type as the MessageType. AMPS allows you to aggregate protobuf messages and project the results as another type, but the destination MessageType for a View cannot be a protobuf message type.
- Creating an aggregated subscription for a topic that contains messages of a protobuf message type.
- Subscriptions to AMPS internal topics. Protobuf message types do not support creating messages for AMPS internal topics, such as /AMPS/ClientStatus.
- Enriching or preprocessing protobuf message types. AMPS does not support enrichment or preprocessing of protobuf messages.
Protocol buffer version 3 messages provide fixed default values for omitted fields. This means that there is no reliable way for AMPS to determine if a missing field has been intentionally left out of the message, or simply contains the fixed default value. The result is an additional limitation for protocol buffer version 3 message types:
- Protocol buffer version 3 message types do not support delta publish or delta subscribe.
Protocol buffer version 2 message types can require that specific fields are provided in a message (that is, fields can be marked required). The result is an additional limitation for protocol buffer version 2 message types:
- Protocol buffer version 2 message types do not support providing a subset of fields in a message by specifying a select list.
There are no other limitations in working with protocol buffer message types.
Working with Optional Default Values¶
Google protocol buffers provide the ability for a message to have fields that are both optional, so they need not be provided in the serialized message, and defaulted, so that there is a specific value interpreted when there is no value provided.
When no value is provided in the serialized message for an optional default value, AMPS interprets the message differently depending on the context:
- For most uses, AMPS interprets the message as though the value is present and set to the default value. This means that you can filter on optional default values, use them as SOW keys, and aggregate optional default values regardless of whether a value is present in the serialized message.
- For delta messaging with protocol buffer version 2, AMPS treats an optional default value as though there is no value present. AMPS does not provide the default value. This means that a delta update must provide the default value explicitly in the serialized message to set the field to the default value. This also means that, if the value present in the message is not the default value, but was not changed on the current update, AMPS will not emit that value in messages to delta subscribers. (Since delta messaging is not supported with protocol buffer version 3, this issue does not arise with that version.)
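The two interpretations can be contrasted with a small Python model, using dictionaries in place of serialized protobuf messages. Everything here is a hypothetical illustration and not AMPS code: the `status` field, its default of "active", and the helper functions are assumptions made for the sketch.

```python
# Hypothetical schema: `status` is optional with a default of "active".
DEFAULTS = {"status": "active"}


def value_for_filtering(serialized, field):
    """Most uses: an absent optional field reads as its default value."""
    if field in serialized:
        return serialized[field]
    return DEFAULTS.get(field)


def apply_delta(stored, update):
    """Delta publish (protocol buffer version 2 model): only fields that
    are physically present in the update change the stored message.
    No default is supplied for an absent field."""
    merged = dict(stored)
    merged.update(update)
    return merged


# Filtering sees the default even though the field was never serialized.
print(value_for_filtering({"id": 1}, "status"))

stored = {"id": 1, "status": "suspended"}

# Omitting `status` from a delta update leaves the stored value alone.
print(apply_delta(stored, {"id": 1})["status"])

# To set the field back to its default, the update must carry it explicitly.
print(apply_delta(stored, {"id": 1, "status": "active"})["status"])
```

The practical consequence for publishers is the last case: a delta update that means "reset this field to its default" has to serialize the default value.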
Loading Additional Message Types¶
AMPS includes the ability to load custom message types in external modules. As with all AMPS modules, custom message types are compiled into shared object files. AMPS dynamically loads these message types on startup, using the information provided in the configuration file. Once you have loaded and declared those types, you can use the type just as you use the default message types.
For example, the configuration below creates a message type named custom-type that uses a module named libmy-type-module.so and specifies a transport for messages of that type:
<Modules>
  <Module>
    <!-- Specifies the name to use to refer to this module in the rest of the
         configuration file -->
    <Name>custom-type-module</Name>
    <!-- Path to the library to load for this module. In this example, the path is a
         relative path below the directory where AMPS is started. -->
    <Library>./custom-modules/libmy-type-module.so</Library>
  </Module>
</Modules>
<MessageTypes>
  <MessageType>
    <!-- The name to use for this message type in the rest of the configuration file. -->
    <Name>custom-type</Name>
    <!-- Reference to the module that implements this message type, using the Name
         defined in the Module configuration. -->
    <Module>custom-type-module</Module>
  </MessageType>
</MessageTypes>
<Transports>
  <Transport>
    <Name>custom-type-tcp</Name>
    <Type>tcp</Type>
    <InetAddr>9008</InetAddr>
    <!-- The message type that this transport uses, using the Name defined in the
         MessageType configuration. -->
    <MessageType>custom-type</MessageType>
    <Protocol>amps</Protocol>
  </Transport>
</Transports>
Once a message type has been declared, you can use it in exactly the same way you use the default message types.
Notice, however, that custom-developed message types may only provide support for a subset of the features of AMPS. For example, the binary message type provided with AMPS does not support features that require AMPS to parse or construct a message, as described above. The developer of the message type must provide information on what capabilities the message type provides.