4. User-Defined Functions
What’s a UDF?
User-defined functions (UDFs) let you extend the capability of AMPS expressions. Once you create and register a function, you can use it in filters, enrichment, and view projections: any context where you can use a built-in AMPS non-aggregate function.
Functions receive values from a single message and return a single value. When used in a filter, the UDF is called once for each message evaluated against the filter. When used to project a field for a view or a JOIN, the UDF is called each time that AMPS needs to construct that field, that is, once for each output message. (In a JOIN, the UDF may receive fields from any number of underlying topics; however, AMPS constructs the joined set of fields for the output message before calling the UDF.)
AMPS itself provides many of its built-in functions through this same interface.
When Should I Create a UDF?
Create a user-defined function when:
- Your application requires functionality (filtering, transformation of values) that cannot be implemented using the AMPS built-in functions.
- You can perform that functionality efficiently in server-side code.
- You have control over your AMPS deployments and can ensure that the module that implements the UDF will be present in every instance where an application may require the UDF.
A user-defined function cannot be used to aggregate multiple messages into a single value. A UDF is called for a single message and produces a single result.
In current releases of AMPS, user-defined functions cannot produce arrays or compound values. You can return simple, scalar values from a user-defined function, or use the provided functions to return a string value.
Implementing a UDF
Unlike other AMPS module types, UDFs do not use a context. You implement a UDF as a function with C linkage and the following signature:
static void function_name(amps_expression_value result,
unsigned long argcount,
amps_expression_value_array args);
Your function processes the arguments provided in the args parameter and returns its value in the result parameter.
AMPS sets no particular requirements on the name of the function. The function name can be any valid C function name.
Working with Arguments
There are two steps to working with an argument provided to a UDF. First, you extract the value from the provided argument array. Second, you convert the value to the type you need to use in your function.
The AMPS external API provides the amps_expression_value_get function to retrieve a specific value from an amps_expression_value_array. The function takes the array to extract the value from and the position of the value to extract. For example, the following code returns the first argument from an amps_expression_value_array named args.
amps_expression_value myVal = amps_expression_value_get(args,0);
Once the value is extracted, the AMPS external API offers a set of functions for retrieving an underlying C value for the amps_expression_value. For example, to retrieve a string value, you provide a pointer and a variable to hold the length, then call amps_expression_value_as_string to retrieve the string:
char* value = NULL;
size_t valueLen = 0;
// value will point to the beginning of the string,
// valueLen will specify the length of the string.
amps_expression_value_as_string(myVal, &value, &valueLen);
The AMPS external API provides similar functions for all of the types recognized by the AMPS expression engine. See Working with Expression Values for more information on expression values.
Setting the Return Value
The return value from your UDF is the amps_expression_value provided as the first argument when AMPS calls your function. To return a value from your UDF, you use the AMPS external API to set the type and value of the amps_expression_value.
For example, the following line of code sets the return value to a boolean FALSE:
amps_expression_value_set_bool(result, 0);
The following line of code sets the return value to a double with the value of the variable calculation:
amps_expression_value_set_double(result, calculation);
The value and type of the amps_expression_value provided to your function at the point of return is the value and type of the return from your UDF.
The AMPS external API provides a convenience function for setting an output value to one of the input values. The amps_expression_value_set_value function simply sets the value of the first argument to the value of the second argument. The function handles any type provided as an input value. For example, the following line of code sets the result to the first value of the input arguments to the UDF:
amps_expression_value_set_value(result, amps_expression_value_get(args,0));
The named methods are provided for simple, scalar values. For strings, AMPS must manage the memory allocated for the string, as described in the next section.
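The two steps above (extracting arguments, then setting the result) can be combined into a complete function. The following is a minimal sketch rather than a definitive implementation: string_length is a hypothetical UDF, and the demo_value type and the function bodies in the stand-in section are assumptions that exist only so the block compiles without the AMPS SDK. In a real module, you would include the AMPS header and delete the stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins for the AMPS external API (assumptions, for illustration). ---
 * A real module includes the AMPS SDK header instead of this section. */
typedef struct demo_value {
    const char* str;     /* string payload */
    size_t      len;     /* length of the string payload */
    double      number;  /* numeric payload */
} demo_value;
typedef demo_value*  amps_expression_value;
typedef demo_value** amps_expression_value_array;

static amps_expression_value
amps_expression_value_get(amps_expression_value_array array, size_t pos)
{
    return array[pos];
}

static void
amps_expression_value_as_string(amps_expression_value v, char** out, size_t* outLen)
{
    assert(v != NULL && out != NULL && outLen != NULL);
    *out = (char*)v->str;
    *outLen = v->len;
}

static void
amps_expression_value_set_double(amps_expression_value v, double d)
{
    v->number = d;
}
/* --- End of stand-ins. --- */

/* Hypothetical UDF: returns the length of its first argument as a double. */
static void string_length(amps_expression_value result,
                          unsigned long argcount,
                          amps_expression_value_array args)
{
    char*  value = NULL;
    size_t valueLen = 0;

    if (argcount < 1) {
        /* Sketch-only choice: report 0 when no argument was supplied. */
        amps_expression_value_set_double(result, 0.0);
        return;
    }
    /* Step 1: extract the argument. Step 2: convert it to a C string. */
    amps_expression_value_as_string(amps_expression_value_get(args, 0),
                                    &value, &valueLen);
    /* The type and value set here are what AMPS sees as the return value. */
    amps_expression_value_set_double(result, (double)valueLen);
}
```

The function checks argcount before touching args so that a call with no arguments does not read past the end of the array.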
Constructing Strings
AMPS provides a special set of functions for constructing strings within a UDF. These functions enable AMPS to correctly manage the lifetime of the memory allocated for the string. Because the lifetime of the string is based on the lifetime of the results returned from the function, you must not use the allocator provided in the amps_module_init function to allocate memory for strings returned from a UDF.
There are two ways to create a string that can be returned from a UDF. Which method to use depends on the lifetime of the string you are returning.
- String lifetime is guaranteed to exceed the evaluation of the full expression. If the string value that you are returning is guaranteed to be valid while AMPS uses the result of the UDF and is guaranteed to be properly freed afterwards if necessary, use amps_expression_value_set_cstr. For example, if your function translates a set of numeric codes to a fixed set of strings, you could use this function to return a pointer to a static string in your module. Likewise, if your function will simply return the value of one of the input parameters, you can use this function to set the output value to point to the string in the input parameter (since AMPS is already managing the lifetime of the input parameter).
  In this case, the return value structure does not take ownership of the string, and will not free the string when AMPS is done with the return value. For a static string, there is no need to free the string. For the contents of another parameter to the function, that parameter has ownership of the string and will free it when AMPS is finished with the function results.
  Strings allocated on the stack as local variables are no longer valid once the function returns, which means that the memory the return value references will be in an indeterminate state when AMPS evaluates the results of the expression. Use amps_expression_value_allocate_cstr to return strings that were built on the stack.
- Return value manages the string lifetime. For other cases, use amps_expression_value_allocate_cstr to allocate memory that will be owned by the return value, and will be freed when AMPS has no more need of the return value. This function returns a pointer to the beginning of the allocated memory, and you use that pointer to copy the string into the newly allocated memory.
The following table lists the string construction functions.
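Function | Description
---|---
amps_expression_value_set_cstr | Sets the type of the provided amps_expression_value to a string that refers to memory you supply. With this function, the amps_expression_value does not take ownership of the string and does not free it.
amps_expression_value_allocate_cstr | Sets the type of the amps_expression_value to a string and allocates memory for it. With this function, the amps_expression_value owns the allocated memory, which is freed when AMPS no longer needs the return value.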
Table 4.1: String Construction Functions
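The two ownership cases described in this section can be sketched as follows. This is an illustration under stated assumptions, not the definitive AMPS API: the parameter lists shown for amps_expression_value_set_cstr and amps_expression_value_allocate_cstr are assumed (check the AMPS SDK header for the actual signatures), the demo_value type and the stand-in bodies exist only so the block compiles on its own, and side_name and upper_case are hypothetical UDFs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* --- Stand-ins for the AMPS external API (assumptions, for illustration). ---
 * A real module includes the AMPS SDK header instead of this section. */
typedef struct demo_value {
    const char* str;    /* current string payload */
    size_t      len;    /* length of the payload */
    char*       owned;  /* memory owned by this value, or NULL */
} demo_value;
typedef demo_value*  amps_expression_value;
typedef demo_value** amps_expression_value_array;

static amps_expression_value
amps_expression_value_get(amps_expression_value_array array, size_t pos)
{
    return array[pos];
}

static void
amps_expression_value_as_string(amps_expression_value v, char** out, size_t* outLen)
{
    *out = (char*)v->str;
    *outLen = v->len;
}

/* ASSUMED signature: points the value at caller-managed memory. */
static void
amps_expression_value_set_cstr(amps_expression_value v, const char* s, size_t len)
{
    v->str = s;
    v->len = len;
    v->owned = NULL;
}

/* ASSUMED signature: allocates len bytes owned by the value. */
static char*
amps_expression_value_allocate_cstr(amps_expression_value v, size_t len)
{
    v->owned = malloc(len ? len : 1);
    assert(v->owned != NULL);
    v->str = v->owned;
    v->len = len;
    return v->owned;
}
/* --- End of stand-ins. --- */

/* Case 1: static strings outlive the expression, so set_cstr is enough. */
static void side_name(amps_expression_value result,
                      unsigned long argcount,
                      amps_expression_value_array args)
{
    static const char BUY[] = "BUY";
    static const char SELL[] = "SELL";
    static const char UNKNOWN[] = "UNKNOWN";
    char*  code = NULL;
    size_t codeLen = 0;

    if (argcount >= 1)
        amps_expression_value_as_string(amps_expression_value_get(args, 0),
                                        &code, &codeLen);
    if (codeLen == 1 && code[0] == 'B')
        amps_expression_value_set_cstr(result, BUY, sizeof(BUY) - 1);
    else if (codeLen == 1 && code[0] == 'S')
        amps_expression_value_set_cstr(result, SELL, sizeof(SELL) - 1);
    else
        amps_expression_value_set_cstr(result, UNKNOWN, sizeof(UNKNOWN) - 1);
}

/* Case 2: the upper-cased copy is created during the call, so the
 * return value must own the memory that holds it. */
static void upper_case(amps_expression_value result,
                       unsigned long argcount,
                       amps_expression_value_array args)
{
    char*  in = NULL;
    size_t inLen = 0;
    size_t i;
    char*  out;

    if (argcount >= 1)
        amps_expression_value_as_string(amps_expression_value_get(args, 0),
                                        &in, &inLen);
    out = amps_expression_value_allocate_cstr(result, inLen);
    for (i = 0; i < inLen; ++i)
        out[i] = (char)toupper((unsigned char)in[i]);
}
```

In side_name, nothing is ever freed because the strings are static. In upper_case, the function writes directly into the memory returned by the allocation call instead of building the string in a local buffer first, which avoids returning a pointer to stack memory.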
Making UDF Functions Array-Aware
AMPS includes functions for working with array values passed into a UDF. By default, when a value provided to a UDF is an array, AMPS provides the first value in the array to the UDF. This is similar to the way that most existing functions in AMPS work.
AMPS also provides functions that you can use to make your function array-aware. A UDF must still return a single value; however, this capability can be used to perform operations on array values. For example, you could implement an ARRAY_LEN() function to allow clients to write filters like ARRAY_LEN(/items) < 10.
The following table lists the functions provided for working with arrays:
Function | Description
---|---
int amps_expression_value_is_array(amps_expression_value v); | Returns TRUE (non-zero) if the provided amps_expression_value contains an array, 0 otherwise.
amps_expression_value_array amps_expression_value_as_array(amps_expression_value v, size_t* outCount); | Retrieves the array from the provided amps_expression_value and sets outCount to the number of items in the array.
amps_expression_value amps_expression_value_get(amps_expression_value_array array, size_t pos); | Retrieves the value at position pos. The first item in the array is at position 0.
Table 4.2: Array Functions
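As a sketch of the ARRAY_LEN() idea mentioned above, the following hypothetical array_len function uses the array functions from Table 4.2. The demo_value type and the stand-in bodies are assumptions that exist only so the block compiles without the AMPS SDK, and treating a scalar argument as an array of length 1 is a design choice made for this sketch, not documented AMPS behavior. The count is returned as a double because that is the numeric setter shown in this chapter.

```c
#include <assert.h>
#include <stddef.h>

/* --- Stand-ins for the AMPS external API (assumptions, for illustration). ---
 * A real module includes the AMPS SDK header instead of this section. */
typedef struct demo_value demo_value;
typedef demo_value*            amps_expression_value;
typedef amps_expression_value* amps_expression_value_array;
struct demo_value {
    int                         is_array;  /* non-zero when items is valid */
    amps_expression_value_array items;     /* array elements */
    size_t                      count;     /* number of elements */
    double                      number;    /* numeric payload */
};

static int amps_expression_value_is_array(amps_expression_value v)
{
    return v->is_array;
}

static amps_expression_value_array
amps_expression_value_as_array(amps_expression_value v, size_t* outCount)
{
    assert(outCount != NULL);
    *outCount = v->count;
    return v->items;
}

static amps_expression_value
amps_expression_value_get(amps_expression_value_array array, size_t pos)
{
    return array[pos];
}

static void amps_expression_value_set_double(amps_expression_value v, double d)
{
    v->number = d;
}
/* --- End of stand-ins. --- */

/* Hypothetical UDF: returns the number of items in its first argument. */
static void array_len(amps_expression_value result,
                      unsigned long argcount,
                      amps_expression_value_array args)
{
    amps_expression_value first;
    size_t count = 0;

    if (argcount < 1) {
        /* Sketch-only choice: no argument means length 0. */
        amps_expression_value_set_double(result, 0.0);
        return;
    }
    first = amps_expression_value_get(args, 0);
    if (amps_expression_value_is_array(first)) {
        /* Only the count is needed here, so the array itself is ignored. */
        (void)amps_expression_value_as_array(first, &count);
    } else {
        /* Sketch-only choice: a scalar counts as a single item. */
        count = 1;
    }
    amps_expression_value_set_double(result, (double)count);
}
```

A function that needs the elements themselves would keep the array returned by amps_expression_value_as_array and read each item with amps_expression_value_get, using positions 0 through count - 1.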
Registering the UDF with AMPS
For AMPS to call your UDF, you must register the function with AMPS during module initialization. The AMPS utility API includes the following function for registering a UDF with AMPS:
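Function | Description
---|---
amps_register_udf(amps_udf_t udf, const char *name, size_t paramcount) | Registers a UDF with AMPS. The udf parameter is the function that implements the UDF, name is the name registered for it, and paramcount is the number of parameters it accepts. For example, to set up a UDF that is implemented in a function named do_stuff and registered as DO_STUFF with one parameter: amps_register_udf(do_stuff, "DO_STUFF", 1);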
Table 4.3: Registering a UDF function