16. Message Types

Message communication between the publisher and subscriber in AMPS is managed through the use of message types. Message types define the data contained within an AMPS message. Each topic has a specific message type. Transports used for publishers and subscribers can also define specific message types. For a given transport, AMPS only processes messages of the type or types that the transport accepts.

When AMPS needs to use the data within a message, AMPS uses the message type to parse the message into an internal representation. AMPS uses the same internal representation for all message types. Likewise, if AMPS needs to create a new message from a set of values (for example, for a view), AMPS uses the message type to serialize that set of values into the correct format. AMPS filters, commands, processing flow, and so forth are the same for every message type. Message types do not change how AMPS processes messages. A message type simply allows AMPS to work with data of a particular format.
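The idea of a single internal representation can be sketched in a few lines of Python. This is a conceptual illustration only, not the AMPS plug-in interface: the function names, the MESSAGE_TYPES registry, and the use of a dict as the internal representation are all assumptions made for the example. Each message type supplies a parser and a serializer, and the same filter logic then runs unchanged on either format.

```python
import json

SOH = "\x01"  # ASCII SOH, the default NVFIX field separator


def parse_json(payload):
    """Parse a JSON document into the shared representation (a dict)."""
    return dict(json.loads(payload))


def serialize_json(values):
    """Serialize the shared representation as a JSON document."""
    return json.dumps(values)


def parse_nvfix(payload, separator=SOH):
    """Parse name=value fields into the shared representation (a dict)."""
    fields = (f.partition("=") for f in payload.split(separator) if f)
    return {name: value for name, _, value in fields}


def serialize_nvfix(values, separator=SOH):
    """Serialize the shared representation as name=value fields."""
    return "".join(f"{name}={value}{separator}" for name, value in values.items())


# Hypothetical registry: one (parse, serialize) pair per message type.
MESSAGE_TYPES = {
    "json": (parse_json, serialize_json),
    "nvfix": (parse_nvfix, serialize_nvfix),
}


def matches(message_type, payload, field, expected):
    """Apply the same equality filter regardless of the wire format."""
    parse, _ = MESSAGE_TYPES[message_type]
    # Comparing as strings loosely imitates automatic type conversion.
    return str(parse(payload).get(field)) == str(expected)
```

In this sketch, matches("json", '{"id": 7}', "id", 7) and matches("nvfix", "id=7\x01", "id", 7) take the same code path after parsing, which is the property the paragraph above describes.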

In some cases, a given message type cannot support all of the capabilities in AMPS. For example, the unparsed binary message type allows arbitrary payloads. This can be extremely useful, but because there is no set format for that message type, none of the capabilities that rely on parsing data are supported by the binary message type. Where a message type cannot provide a specific capability to AMPS, those limitations are described below.

Except where limitations are described in this section, all message types provided with the AMPS server support all AMPS features. The AMPS engine itself is message-type agnostic. For example, configuring a SOW that uses a composite type is no different from configuring a SOW that uses JSON, BFlat, or Google protocol buffers.

Message types in AMPS are implemented as plug-in modules. For more information on plug-in modules, contact 60East support for access to the AMPS Server SDK.

Default Message Types

AMPS automatically loads modules for the following message types:

Message Type Description
bson Binary JSON (BSON) messages. See http://www.bsonspec.org for information on this format.
bflat BFlat, a schemaless message format based on key-value pairs that includes support for binary representations of numeric data. See http://bflat.io for information on this format.
fix FIX messages using numeric tags. FIX is a standard format widely used in the financial industry. See http://www.fixtradingcommunity.org/pg/main/what-is-fix for more information on this format.
json JSON (JavaScript Object Notation) messages. See http://www.json.org for information on this format.
msgpack MessagePack messages. MessagePack is a schemaless serialization format designed to efficiently encode data. See http://msgpack.org/index.html for more information on MessagePack.
nvfix NVFIX (name/value FIX) messages. NVFIX uses the same basic format as FIX, but allows tags to contain any byte that is not = or the configured field separator character (by default, the ASCII SOH character). By contrast, FIX requires that tags are numeric.
xml XML messages (of any schema)
binary Uninterpreted binary payload. Because this module does not attempt to parse the payload, it does not support content filtering, views and aggregates. Likewise, because there is no set format for the payload, this message type cannot support features that construct messages (such as delta messaging, /AMPS/.* topic subscriptions and stats acks).
protobuf Google protocol buffer messages. To use this message type, you must configure a MessageType with the format of the messages (the .proto files).

Table 16.1: AMPS Default Message Types

With these message types, AMPS automatically loads the module that provides the message type. AMPS declares message types for all of the above message types except for protobuf.

For efficiency, AMPS only parses the content of a message if required, and only to the extent required. For example, if AMPS only needs to find the id tag in an NVFIX message, AMPS stops parsing the message as soon as it finds the id tag rather than parsing the full message. This provides significant performance improvements, but it also means that AMPS does not verify the format or validity of a message unless it needs to parse that message. Even then, AMPS may parse only part of the message, and may not detect corruption or an invalid format that occurs after the point at which AMPS has all of the information it requires.
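The consequence of partial parsing can be illustrated with a short Python sketch. This is not the AMPS parser: find_tag and the sample messages are hypothetical, and the sketch only imitates the behavior described above by scanning NVFIX-style fields from left to right and returning as soon as the requested tag is found.

```python
SOH = "\x01"  # ASCII SOH, the default NVFIX field separator


def find_tag(message, wanted, separator=SOH):
    """Return the value of `wanted`, or None if the tag is absent.

    Fields are examined left to right, and scanning stops at the first
    match, so anything after the match is never looked at.
    """
    start = 0
    while start < len(message):
        end = message.find(separator, start)
        if end == -1:
            end = len(message)
        field = message[start:end]
        name, equals, value = field.partition("=")
        if not equals:
            raise ValueError(f"malformed field: {field!r}")
        if name == wanted:
            return value
        start = end + 1
    return None


# Sample data for illustration only.
good = "id=42\x01symbol=IBM\x01price=135.2\x01"
damaged = "id=42\x01symbol=IBM\x01###garbage###\x01"
```

With these samples, find_tag(damaged, "id") returns the same value as find_tag(good, "id"), because the damaged field lies after the point where scanning stops. The damage only surfaces when a lookup has to read past it, for example find_tag(damaged, "price").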

The FIX and NVFIX message types support configuration of the field and message delimiters.

AMPS also allows you to create new message types by assembling existing message types into a composite message. Composite message types are described in Composite Messages, and require additional configuration:

Message Type Name Description
composite-global Composite message type that combines message parts for content filtering. This message type combines one or more existing message types into a message. This type is described in more detail in Composite Messages.
composite-local Composite message type, filterable by individual parts. This message type combines one or more existing message types into a message. This type is described in more detail in Composite Messages.

Table 16.2: AMPS composite message types

BFlat Messages

The BFlat message format combines the simplicity and efficiency of simple, schema-less data formats such as FIX and NVFIX with the ability to manage binary data and preserve the full precision of numeric values. BFlat is especially useful for applications that deal with binary data or precise numeric values while demanding high levels of throughput.

A BFlat message is composed of any number of tag/value pairs, similar to FIX and NVFIX messages. Tags and values can contain any bytes and can be of any length: unlike formats such as FIX, there are no reserved characters. In practical terms, the name of a tag must be a valid XPath identifier for AMPS to filter on that tag. However, this is a limitation of XPath, and not of the BFlat message format.

The BFlat message type supports all AMPS features, and there are no special considerations when using the BFlat message type.

Open-source libraries for producing and parsing BFlat messages are available from the BFlat project site.

BFlat Data Types

BFlat messages are strongly typed. BFlat supports a string type for string data, and a binary type for arbitrary binary data. For numeric values, BFlat can preserve the precise value of the following numeric types:

Type        Description
int8        8-bit integer
int16       16-bit integer
int32       32-bit integer
int64       64-bit integer
double      64-bit IEEE 754 floating point number
datetime    UTC datetime containing milliseconds since Unix epoch (64-bit representation)
leb128      Signed LEB128 integer (variable length)

Table 16.3: BFlat Numeric Types

BFlat also supports arrays of values.
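Of the numeric types above, signed LEB128 is the only variable-length encoding, so a small sketch may help. The Python below implements the general signed LEB128 algorithm (7 bits of payload per byte, with the high bit marking continuation). It does not describe how BFlat frames tags, types, or values on the wire, and the function names are chosen for this example.

```python
def encode_sleb128(value):
    """Encode a Python int as signed LEB128 bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7  # Python's shift is arithmetic, so the sign is preserved
        sign_bit_set = bool(byte & 0x40)
        # Stop once the remaining value is pure sign extension of this byte.
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            out.append(byte)
            return bytes(out)
        out.append(byte | 0x80)  # high bit set: more bytes follow


def decode_sleb128(data):
    """Decode signed LEB128 bytes back into a Python int."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:  # last byte of the value
            if byte & 0x40:  # sign bit of the final group is set
                result -= 1 << shift
            return result
    raise ValueError("truncated LEB128 value")
```

Small values take a single byte, while larger magnitudes grow by one byte per additional 7 bits, which is why the type is listed as variable length.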

MessagePack Messages

AMPS fully supports MessagePack messages, with the following implementation decisions to represent MessagePack messages in the AMPS type system. See AMPS Data Types for information on the AMPS data types. Notice, in particular, that the AMPS expression language supports automatic type conversion, so while this table shows the default AMPS representation for a given MessagePack type, AMPS will convert a value as needed once the message has been parsed.

MessagePack type      AMPS representation
nil                   NULL
bool                  Boolean
int (all widths)      Integer
float (all widths)    Float
str (all widths)      String
bin (all widths)      String
array (all widths)    array of AMPS values
map (all widths)      nested AMPS values
ext (all widths)      String

Table 16.4: MessagePack Types and AMPS Types

Notice that AMPS does not attempt to interpret extension types, and instead represents them as arrays of bytes (a String in the AMPS type system).

Composite Messages

Sometimes, applications only need to filter on a small subset of the fields in a message. Sometimes applications need to send and receive messages that cannot be meaningfully parsed by AMPS, such as images or audio files. For these cases, AMPS provides a composite message type that lets you create a new message type by combining existing message types.

For example, you might create a message type that includes three parts: the metadata for an image as a json document, a small JPG thumbnail as a binary message part, and a full size PNG image as another binary message part.

Composite messages can also be useful when the message itself is large or resource-intensive to parse. In this case, you can create a message type that includes the information needed to filter messages in a JSON or NVFIX part, and include the full message in the unparsed payload of the composite message, as described below.

AMPS provides two different types of composite messages. Messages created using the composite-local module preserve information about the individual parts for filtering, aggregation, and projection. Messages created using the composite-global module treat the individual parts as elements of a single document.

Configuring Composite Message Types

To use a composite message type, you must first configure the type by declaring it in the MessageTypes section of the AMPS configuration file. The declaration contains the name of the new composite message type, specifies that the new type is composite, and lists the parts of the composite message type.

For example, the MessageType element below declares a new composite message type named images. The new type contains a json document at the beginning of the message, followed by two uninterpreted binary message parts. AMPS will combine the XPath identifiers for all message parts into a single set of identifiers. Notice that, because only one part of the message type is parsable, using composite-global simplifies the identifiers for the message.

<MessageTypes>
    ...

    <MessageType>
        <Name>images</Name>
        <Module>composite-global</Module>
        <MessageType>json</MessageType>
        <MessageType>binary</MessageType>
        <MessageType>binary</MessageType>
    </MessageType>

    ...

</MessageTypes>

The MessageType entries for the composite message can be any AMPS message type, including both the built-in types and any previously defined message type.

Once the new composite message type is created, you can use the new type in the configuration file.

Composite message types have the following restrictions:

  • Delta subscribe and delta publish are not supported for message types that use composite-global.
  • Views, joins, and aggregation cannot project message types that use composite-global. (However, composite message types that use composite-global can be an UnderlyingTopic or one of the topics in a Join.)
  • Composite message types do not support features that automatically construct messages, such as subscriptions to the /AMPS/.* topics and stats acks, regardless of the module the type uses.

Unparsed Payload Section

All composite message types, regardless of how they are defined, provide an unparsed payload section. The unparsed payload section does not need to be declared in the MessageType declaration. As the name suggests, AMPS does not parse or interpret this section, so the unparsed payload can contain any content of any type. The AMPS clients provide access to set the unparsed payload on outgoing messages, and to retrieve the unparsed payload from incoming messages.

The unparsed payload is included to simplify the common technique where a message type contains a header that is used for filtering followed by an unparsed binary. If your composite message type contains a single binary part, consider using the unparsed payload section in your application rather than declaring a binary message part.

Content Filtering with Composite Message Types

Composite message types support filtering on the contents of the composite message. There are some simple conventions to remember when constructing expressions to filter on. For more details about content filtering, see Filtering Subscriptions by Content.

These conventions are consistent anywhere that AMPS needs to find a value within the composite message type. That includes content filters for client subscriptions, identifying SOW keys, creating views and aggregates, creating conflated topics, and so on.

composite-global

When using the composite-global message type, AMPS combines all parts of the message into a unified set of XPath identifiers. AMPS creates the set of identifiers for each part of the message. If different parts of the message contain the same identifier, AMPS treats that identifier as though the identifier contained an array of values: AMPS creates an array that contains all of the values in the different parts of the message. Message types that do not support content filtering do not provide XPath identifiers.

For example, consider the message below for a composite-global message type that includes two json parts and a binary part:

{"id":1,"data":"sample","message":"part one message"}
{"message":"another part","customer":"Awesome Amalgamated, Ltd."}
0xDEEA0934DF23A37780934...

AMPS constructs the following set of XPath identifiers and values:

Identifier    Value
/id           1
/data         "sample"
/message      ["part one message", "another part"]
/customer     "Awesome Amalgamated, Ltd."

Table 16.5: Composite-global message identifiers

In short, when using composite-global, AMPS combines the parsable parts of the message into a single global set of XPath values, and ignores any part of the message that cannot be parsed.
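The merging behavior described for composite-global can be sketched in Python. This is an illustration of the rules stated in this section, not the AMPS implementation: the function name is hypothetical, JSON parts are passed as str and unparsed binary parts as bytes, and only top-level fields are handled.

```python
import json


def global_identifiers(parts):
    """Merge the parsable parts of a composite message into one identifier set.

    str parts are treated as JSON documents; bytes parts stand in for
    unparsed binary parts and contribute no identifiers.
    """
    collected = {}
    for part in parts:
        if isinstance(part, bytes):
            continue  # binary part: nothing to parse
        for name, value in json.loads(part).items():
            collected.setdefault("/" + name, []).append(value)
    # An identifier that appears in several parts becomes an array of values.
    return {
        identifier: values[0] if len(values) == 1 else values
        for identifier, values in collected.items()
    }


# The example message from this section, with placeholder binary content.
message_parts = [
    '{"id":1,"data":"sample","message":"part one message"}',
    '{"message":"another part","customer":"Awesome Amalgamated, Ltd."}',
    b"\xde\xea\x09\x34",
]
identifiers = global_identifiers(message_parts)
```

For the example message, identifiers["/message"] holds both message values as a list, while /id, /data, and /customer each hold a single value.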

composite-local

When using the composite-local message type, AMPS creates a distinct set of XPath identifiers for each part of the message. AMPS adds an XPath step with the position of the message part at the beginning of the identifier. Message types that do not support content filtering do not provide XPath identifiers, and AMPS skips over them.

For example, consider the message below for a composite-local message type that includes two json parts and a binary part:

{"id":1,"data":"sample","message":"part one message"}
{"message":"another part","customer":"Awesome Amalgamated, Ltd."}
0xDEEA0934DF23A37780934...

AMPS constructs the following set of XPath identifiers and values:

Identifier     Value
/0/id          1
/0/data        "sample"
/0/message     "part one message"
/1/message     "another part"
/1/customer    "Awesome Amalgamated, Ltd."

Table 16.6: Composite-local message identifiers

In short, when using composite-local, AMPS creates XPath identifiers for each part of the message, using the position of the message part within the composite as the first part of the identifier. AMPS skips over any part of the message that cannot be parsed, and simply produces no values for that part of the message.
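The same message can be run through a sketch of the composite-local rules. As before, this is an illustration rather than the AMPS implementation: the function name is hypothetical, JSON parts are passed as str and binary parts as bytes, and only top-level fields are handled. The sketch assumes that the position counts every declared part, including parts that cannot be parsed; the example in this section does not distinguish that case, because the binary part comes last.

```python
import json


def local_identifiers(parts):
    """Build identifiers prefixed with the zero-based position of each part.

    str parts are treated as JSON documents; bytes parts stand in for
    unparsed binary parts and produce no identifiers.
    """
    identifiers = {}
    for position, part in enumerate(parts):
        if isinstance(part, bytes):
            continue  # binary part: skipped, produces no values
        for name, value in json.loads(part).items():
            identifiers[f"/{position}/{name}"] = value
    return identifiers


# The example message from this section, with placeholder binary content.
message_parts = [
    '{"id":1,"data":"sample","message":"part one message"}',
    '{"message":"another part","customer":"Awesome Amalgamated, Ltd."}',
    b"\xde\xea\x09\x34",
]
identifiers = local_identifiers(message_parts)
```

Because each identifier carries the part position, /0/message and /1/message remain separate values instead of being combined into an array.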

Choosing A Composite Type

To choose which composite type best fits your application, consider the following factors:

  • If you need to use delta messaging with this message type, use composite-local.
  • If there may be redundant field names in the parts of the message, and it is important to be able to filter based on which part contains the field, use composite-local.
  • If you need to be able to create views of this type, use composite-local.

Otherwise, composite-global may be easier and more straightforward for client filtering, since clients do not need to know the detailed structure of the message type to be able to filter on the message.

Protobuf Message Types

Protocol buffers, or protobufs for short, are an efficient, automated mechanism for serializing structured data. AMPS supports Google protobuf messages (version 2 and version 3) as a message format.

Because Google protocol buffers use a fixed format for messages, to use protobuf, you must configure AMPS with the definition of the messages AMPS will process. To do this, you define a MessageType. AMPS cannot parse protobuf messages without that MessageType definition.

60East recommends that the .proto files used with AMPS explicitly declare the protocol buffer syntax version used. If there is no explicit declaration, AMPS assumes the file uses protocol buffer 2 syntax.

The AMPS engine is message-type agnostic. Except for the limitations described in this section, there is no difference to the AMPS engine between message types that use protocol buffers and other message types such as JSON or XML or FIX.

Configuring Protobuf Message Types

To use a protobuf message, you must first edit the configuration file to include a new MessageType. Then, specify the path to the protobuf file and the name of the protobuf file itself inside the MessageType. Below is a sample configuration of a protobuf message type:

...

<MessageType>
    <Name>my-protobuf-messages</Name>
    <Module>protobuf</Module>
    <ProtoPath>proto-archive;/mnt/shared/protofiles</ProtoPath>
    <ProtoFile>proto-archive/person.proto</ProtoFile>
    <Type>MyNamespace.Message</Type>
</MessageType>

...

Each message type references a ProtoFile, and specifies a single top-level type from the file. The ProtoFile may include other files through the standard protocol buffer import mechanism. Likewise, the top-level type may be any valid protocol buffer definition, including definitions that contain other types.

Once the protocol buffer MessageType is created as described above, you must either create a Transport that specifies that message type exactly, or you must create a Transport that can accept any known message type and ensure that the client specifies the new message type (in the example case, my-protobuf-messages) in the connect string.

When creating a protobuf message type, you must provide the following parameters:

Parameter Description
Name
The name of the new, customized message type. The rest of the configuration file will use this name to refer to the message type.
Module
The module that contains the message type. Use protobuf for protocol buffer messages.
ProtoPath

The path in which to search for .proto files. The content of this element has the following syntax:

alias ; full-path

The alias provides a short identifier to use when searching for .proto files. The full path is the path that is substituted for that identifier.

For example, in the sample above, proto-archive is an alias for /mnt/shared/protofiles.

A configuration may omit the alias, and simply provide the path. For example:

;/mnt/repository/protodefs

You may specify any number of ProtoPath declarations.

ProtoFile
The name of the .proto file to use for this message type. To use an alias, prefix the name of the file with the alias, as shown in the example above.
Type
The name of the type inside the .proto file to use for this message type. AMPS requires a single type.

Table 16.7: protobuf Message Type Parameters
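The alias substitution described for ProtoPath and ProtoFile can be sketched as follows. This Python is an interpretation of the description above, not AMPS code: the function name is hypothetical, and the treatment of an entry with an empty alias (the file name is looked up directly under that path) is an assumption.

```python
import posixpath


def resolve_proto_file(proto_file, proto_paths):
    """Return the candidate locations for proto_file, in declaration order.

    Each entry in proto_paths uses the "alias;full-path" syntax.
    """
    candidates = []
    for entry in proto_paths:
        alias, separator, path = entry.partition(";")
        if not separator:
            raise ValueError(f"expected 'alias;full-path', got {entry!r}")
        alias, path = alias.strip(), path.strip()
        if alias:
            # Substitute the full path for the alias prefix of the file name.
            prefix = alias + "/"
            if proto_file.startswith(prefix):
                remainder = proto_file[len(prefix):]
                candidates.append(posixpath.join(path, remainder))
        else:
            # Assumption: with no alias, the file is searched for under the path.
            candidates.append(posixpath.join(path, proto_file))
    return candidates
```

With the sample configuration above, resolving proto-archive/person.proto against proto-archive;/mnt/shared/protofiles yields /mnt/shared/protofiles/person.proto.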

Filtering with Protobuf Messages

To filter protobuf messages, there are a couple of conventions you must remember. AMPS XPath identifiers begin at the outermost message, so you can simply use member names for that message. If you have nested messages, you use the name of the nested message and the member name when creating an XPath identifier.

For example, suppose you have the following definition in a .proto file:

message person {
    required string name = 1;
    required int32 personID = 2;
}

To access the personID data member, you simply use the name of the data member as the XPath identifier. An example filter that verifies that a personID is greater than 1000 would be:

/personID > 1000

If you have nested messages, you simply provide the path to the nested message you want to access.

Let’s assume that the person message from the above example was nested inside another message with the name of record. The example filter below shows how to access the nested person message, and then filter to the personID:

/person/personID > 1000

In this case, the first part of the identifier (/person) specifies the submessage. The second part of the identifier (/personID) specifies the field within that submessage. Notice that, as always, there is no need to specify the name of the message for the outermost message.
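The way an identifier walks from the outermost message into a submessage can be sketched in Python. The sketch represents a decoded message as nested dicts, which is an assumption made for illustration; the lookup function and the sample values are hypothetical and do not reflect how AMPS evaluates XPath identifiers internally.

```python
def lookup(message, identifier):
    """Follow each step of an identifier such as /person/personID.

    Returns None when any step is missing, which mirrors a field that
    does not exist at that location in the message.
    """
    current = message
    for step in identifier.strip("/").split("/"):
        if not isinstance(current, dict) or step not in current:
            return None
        current = current[step]
    return current


# Sample data: a person message on its own, and one nested in a record.
person = {"name": "Ada", "personID": 1024}
record = {"person": person}

# Equivalent of the filter /personID > 1000 on the outermost person message.
outer_match = lookup(person, "/personID") > 1000

# Equivalent of the filter /person/personID > 1000 on the record message.
nested_match = lookup(record, "/person/personID") > 1000
```

Notice that /personID finds nothing when applied to the record message in this sketch, because personID is not a member of the outermost message there.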

Working With Multiple Protocol Buffer Types

Some applications require messages of different types: for example, an inventory management system may work with customer records, inventory records, and shipping order records.

When using protocol buffers, each of these messages would use a different .proto file, and therefore would be a different message type. Unlike a self-describing format such as JSON or XML, the serialized form of a protocol buffer message type does not automatically contain any information about the type of message or the fields that the message contains. Therefore, each protocol buffer message type is best considered as a completely distinct type. For example, the parser created for an order record and the parser created for a customer record are different. Unlike self-describing formats, it is not possible to use a single parser for these types, or for a parser to correctly handle a previously-unknown message structure.

There are two approaches to working with multiple protocol buffer types in an AMPS application:

  • Keep the message types distinct. Each message type requires a separate connection to AMPS. The advantage of this approach is that the .proto files can be maintained and updated separately. Each connection has a distinct type, and only needs to handle messages of that type. The disadvantage of this approach is that the application must make a connection to AMPS for each type of message received.
  • Create a “container” type that can optionally contain any of the needed message types. The advantage of this approach is that this requires only a single connection to AMPS. Since there is a single “container” type, a topic can hold this “container” type and have heterogeneous actual contents. The disadvantage to this approach is that it requires a consumer to understand the “container” type, and changes to the contained types may need to be carefully managed across the consumers that use the container. A “container” type is typically a oneof of the contained types.

For example, you might define a container as follows:

message Container {

    // A oneof must be named; "contents" is a placeholder name.
    oneof contents {
      Order      order_type = 1;
      Payment    payment_type = 2;
    }
}

message Order {
    required string customer_id = 1;
    ...
}

message Payment {
    required string customer_id = 1;
    ...
}

In this case, the container type will include either an Order or a Payment.

Union Types

When using a protocol buffer message type that contains a union, you can navigate the union using the names defined in the top-level element. For example, given the union defined below:

message MyUnion {
    optional Order      order_type = 1;
    optional Payment    payment_type = 2;
}

message Order {
    required string customer_id = 1;
    ...
}

message Payment {
    required string customer_id = 1;
    ...
}

Providing a filter of /order_type IS NOT NULL will return all of the MyUnion messages that contain an Order, while providing a filter of /payment_type/customer_id = '42' will return only the MyUnion messages that contain a Payment message with a customer_id of 42.

Limitations of the protobuf message type

Because the protobuf message type requires a specific, fixed definition for messages, AMPS does not support operations that construct messages that may contain arbitrary values. In particular, protobuf does not support:

  • Creating a View with a protobuf type as the MessageType. AMPS allows you to aggregate protobuf messages and project the results as another type, but the destination MessageType for a View cannot be a protobuf message type.
  • Creating an aggregated subscription for a topic that contains messages of a protobuf message type.
  • Subscriptions to AMPS internal topics. Protobuf message types do not support creating messages for AMPS internal topics, such as /AMPS/ClientStatus.
  • Enriching or preprocessing protobuf message types. AMPS does not support enrichment or preprocessing of protobuf messages.

Protocol buffer version 3 messages provide fixed default values for omitted fields. This means that there is no reliable way for AMPS to determine if a missing field has been intentionally left out of the message, or simply contains the fixed default value. The result is an additional limitation for protocol buffer version 3 message types:

  • Protocol buffer version 3 message types do not support delta publish or delta subscribe.

Protocol buffer version 2 message types can require that specific fields are provided in a message (that is, fields can be marked required). The result is an additional limitation for protocol buffer version 2 message types:

  • Protocol buffer version 2 message types do not support providing a subset of fields in a message by specifying a select list.

There are no other limitations in working with protocol buffer message types.

Working with Optional Default Values

Google protocol buffers provide the ability for a message to have fields that are both optional, so they need not be provided in the serialized message, and defaulted, so that there is a specific value interpreted when there is no value provided.

When no value is provided in the serialized message for an optional default value, AMPS interprets the message differently depending on the context:

  • For most uses, AMPS interprets the message as though the value is present and set to the default value. This means that you can filter on optional default values, use them as SOW keys, and aggregate optional default values regardless of whether a value is present in the serialized message.
  • For delta messaging with protocol buffer version 2, AMPS treats an optional default value as though there is no value present. AMPS does not provide the default value. This means that a delta update must provide the default value explicitly in the serialized message to set the field to the default value. This also means that, if the value present in the message is not the default value, but was not changed on the current update, AMPS will not emit that value in messages to delta subscribers. (Since delta messaging is not supported with protocol buffer version 3, this issue does not arise with that version.)
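The two interpretations above can be contrasted in a short Python sketch. This is a simplified model, not AMPS behavior in detail: messages are represented as dicts of the fields actually present in the serialized message, the status field and its default are hypothetical, and the key_fields parameter assumes that a delta message always carries the key of the record.

```python
# Hypothetical optional field with a declared default value.
DEFAULTS = {"status": "active"}

_MISSING = object()


def value_for_filter(message, field):
    """Filters, SOW keys, aggregates: an absent field reads as its default."""
    if field in message:
        return message[field]
    return DEFAULTS.get(field)


def apply_delta_publish(stored, delta):
    """Delta publish: a field absent from the delta leaves the stored value alone.

    The default is never substituted for a missing field.
    """
    merged = dict(stored)
    merged.update(delta)
    return merged


def delta_for_subscribers(before, after, key_fields=("id",)):
    """Delta subscribe: emit key fields plus fields whose value changed."""
    return {
        name: value
        for name, value in after.items()
        if name in key_fields or before.get(name, _MISSING) != value
    }


stored = {"id": 1, "status": "closed"}
updated = apply_delta_publish(stored, {"id": 1, "qty": 5})
reset = apply_delta_publish(stored, {"id": 1, "status": "active"})
outgoing = delta_for_subscribers(stored, updated)
```

In this model, the update that omits status leaves the stored value of closed in place, and status does not appear in the outgoing delta. Setting the field back to its default requires the publisher to include status explicitly, as the reset update does.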

Loading Additional Message Types

AMPS includes the ability to load custom message types in external modules. As with all AMPS modules, custom message types are compiled into shared object files. AMPS dynamically loads these message types on startup, using the information provided in the configuration file. Once you have loaded and declared those types, you can use them just as you use the default message types.

For example, the configuration below creates a message type named custom-type that uses a module named libmy-type-module.so and specifies a transport for messages of that type:

<Modules>
    <Module>
        <!-- Specifies the name to use to refer to this module in the rest of the
            configuration file -->
        <Name>custom-type-module</Name>

        <!-- Path to the library to load for this module. In this example, the path is a
            relative path below the directory where AMPS is started. -->
        <Library>./custom-modules/libmy-type-module.so</Library>
    </Module>
</Modules>

<MessageTypes>
    <MessageType>
        <!-- The name to use for this message type in the rest of the configuration file. -->
        <Name>custom-type</Name>

        <!-- Reference to the module that implements this message type, using the Name
            defined in the Module configuration. -->
        <Module>custom-type-module</Module>
    </MessageType>
</MessageTypes>

<Transports>
    <Transport>
        <Name>custom-type-tcp</Name>
        <Type>tcp</Type>
        <InetAddr>9008</InetAddr>

        <!-- The message type that this transport uses, using the Name defined in the
            MessageType configuration. -->
        <MessageType>custom-type</MessageType>
        <Protocol>amps</Protocol>
    </Transport>
</Transports>

Once a message type has been declared, you can use it in exactly the same way you use the default message types.

Notice, however, that custom-developed message types may only provide support for a subset of the features of AMPS. For example, the binary message type provided with AMPS does not support features that require AMPS to parse or construct a message, as described above. The developer of the message type must provide information on what capabilities the message type provides.