- Configuration data schema
- Records auto-generation
- Application-wide data management
- Endpoints data synchronization
- Group-specific configuration
- Schema versioning
The Kaa configuration subsystem supplies endpoints with a structured data set of arbitrary complexity that is managed through the Kaa server. The fact that Kaa operates with structured data deserves a special emphasis. Knowledge of the application's data layout enables Kaa to deliver some useful features, such as:
- incremental data updates for the endpoints;
- endpoint-specific data view that is based on the endpoint groups membership;
- automatically generated data object model in the endpoint SDK;
- automatic generation of the default configuration;
- enforcement of the data integrity and validity on the server's northbound integration interface.
The structure of the data is determined by the schema that is configurable. It is the responsibility of the Kaa developer to construct the configuration data schema and make the client application interpret the data supplied by the endpoint library. Kaa administrator, in turn, can provision the schema into the Kaa server and configure the values accordingly.
Once the configuration data schema is loaded into the Kaa application, the Control server automatically assigns it a version number, generates derivative schemas (delta schema, base data schema, and override data schema), and populates the configuration in group "all" with the default values.
Configuration data schema
Configuration data schema is the original, framework user-defined specification of the data model that Kaa configuration subsystem would operate with. The format of the configuration data schema is similar to the profile schema and based on the Apache Avro schema. The Kaa configuration management subsystem supports all of the Avro primitive types: null, boolean, int, long, float, double, bytes, string, and most of the complex types: record, enum, array, union, and fixed. The Avro map type (a set of <key, value> pairs) is not currently supported. Also, it is possible to define an array of unions, but Kaa expects all of the entities to be of the same type.
See below examples that illustrate basic constructs, add-ons on top of Avro, and their use.
The root object type in Kaa is always a record.
Name and namespace attributes are both mandatory for the "record" types. They are used for record type referencing in derivative data schemas in Kaa.
The optional field attribute (boolean, false by default) determines whether or not the field in the record is optional. Internally, Kaa translates optional fields into union fields with null type at the top of the list (which automatically makes them default to null - see the Records auto-generation section for more details).
In case of an optional union, Kaa internally automatically puts null at the top of the types list in the union definition.
The by_default parameter attribute (value interpreted according to the field type, does not have a default) determines a default value for the field that will be used when generating the default record. by_default must be present for all mandatory primitive record fields, except for null.
The following table specifies the by_default attribute format for every supported primitive type.
|Type|Format|Example|
|---|---|---|
|boolean|true/false|"by_default": true|
|int|numeric value from (-2^31+1) to (2^31-1)|"by_default": 55|
|long|numeric value from (-2^63+1) to (2^63-1)|"by_default": 2147483648|
|float|floating point value|"by_default": 1.432|
|double|floating point value|"by_default": 1.432|
|bytes|JSON array of byte values|"by_default": [1, 2, 55, 254, 4]|
|string|simple string format|"by_default": "abcdef"|
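The formats in the table can be checked mechanically. The following is a minimal sketch in Python, not Kaa's own validator: the function name `is_valid_by_default` is hypothetical, and the 0..255 range for byte values is an assumption (the table only says "byte values").

```python
# Hypothetical checker for the by_default formats listed in the table.
# The numeric ranges are copied from the table; the 0..255 byte range is assumed.
INT_RANGE = (-2**31 + 1, 2**31 - 1)
LONG_RANGE = (-2**63 + 1, 2**63 - 1)


def _is_integer(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_by_default(avro_type, value):
    """Return True if value matches the by_default format for avro_type."""
    if avro_type == "boolean":
        return isinstance(value, bool)
    if avro_type in ("int", "long"):
        low, high = INT_RANGE if avro_type == "int" else LONG_RANGE
        return _is_integer(value) and low <= value <= high
    if avro_type in ("float", "double"):
        return _is_integer(value) or isinstance(value, float)
    if avro_type == "bytes":
        return isinstance(value, list) and all(
            _is_integer(b) and 0 <= b <= 255 for b in value
        )
    if avro_type == "string":
        return isinstance(value, str)
    return False  # null and complex types carry no by_default in this sketch
```

For instance, `is_valid_by_default("int", 2147483648)` is rejected, while the same value is accepted for `"long"`.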
The addressable record type attribute (boolean, true by default) determines whether the record supports partial updates (deltas). If true, Kaa automatically adds a UUID field (__uuid) to the record for addressing purpose when producing derivative schemas. For that reason, __uuid is a reserved field name in Kaa. The root record in the configuration schema ignores addressable = false.
The overrideStrategy array type parameter attribute (string, "replace" by default) determines how to merge arrays in the configuration across the endpoint groups. Possible values are "replace" and "append".
A field in Kaa configuration is addressable if the containing record is addressable. The field address is formed by appending the field name to the containing record address and using "/" as a separator. Root record's address is always "/".
The following fields are addressable in this schema:
However, the fields within the instances of org.kaaproject.sample.nestedRecordT contained by /arrayOfRecords, are not addressable, since it is not possible to address records within arrays in the current Kaa version.
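The addressing rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Kaa code: the function `addressable_fields` and the field names `simpleField`, `nestedRecord`, and `testField` are hypothetical, records referenced by name or wrapped in unions are not followed, and the sketch does not descend into a record marked addressable = false.

```python
def addressable_fields(record, address="/", is_root=True):
    """Yield the addresses of the addressable fields of a record schema (a dict)."""
    # The root record ignores addressable = false; other records default to true.
    if not is_root and not record.get("addressable", True):
        return
    for field in record["fields"]:
        # Append the field name to the record address, using "/" as a separator.
        field_address = address.rstrip("/") + "/" + field["name"]
        yield field_address
        field_type = field["type"]
        if isinstance(field_type, dict) and field_type.get("type") == "record":
            yield from addressable_fields(field_type, field_address, is_root=False)
        # Arrays are deliberately not descended into: records within arrays
        # cannot be addressed.


schema = {
    "type": "record", "name": "rootT", "namespace": "org.kaaproject.sample",
    "fields": [
        {"name": "simpleField", "type": "string", "by_default": "abc"},
        {"name": "nestedRecord", "type": {
            "type": "record", "name": "nestedRecordT",
            "namespace": "org.kaaproject.sample",
            "fields": [{"name": "testField", "type": "int", "by_default": 1}],
        }},
        {"name": "arrayOfRecords", "type": {
            "type": "array", "items": "org.kaaproject.sample.nestedRecordT"}},
    ],
}

print(list(addressable_fields(schema)))
# ['/simpleField', '/nestedRecord', '/nestedRecord/testField', '/arrayOfRecords']
```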
Records auto-generation
Configuration data schema supplies sufficient metadata to enable Kaa to construct records populated with the default values. In order to do so, the record (sub-)schema is analyzed in the depth-first, top-to-bottom traversal order.
- Union type fields assume the first type listed in the union definition. The default value is generated according to the rules specific to the type encountered. NB: any optional fields (those having attribute "optional": true in the schema) are in fact unions with the first type null. Therefore, optional fields default to empty value.
- For a field of any primitive type (except for null), Kaa expects the by_default attribute to be present and supply the default field value. A schema missing such attribute for a mandatory primitive record field generates an exception and is rejected by Kaa.
- Non-optional record type fields are generated by applying the same record generation algorithm.
- Enum fields assume the first value listed in the type definition.
- Arrays are generated empty by default.
- Fixed type fields are generated filled in with zeros.
- __uuid fields of the type org.kaaproject.configuration.uuidT are generated with a valid UUID assigned to the value sub-field.
For example, the following record schema:
would default to the following record:
Kaa server caches the results of auto-generating the default records and re-uses them in the further operation.
Application-wide data management
Every endpoint in an application belongs to the group "all" with weight 0. This group contains the base data set for every configuration data schema version in a Kaa application. When a new schema version is loaded into the Kaa application, the server automatically transforms the data schema into the base data schema and generates the default configuration for the group "all" by applying the default records generation algorithm to the root record in the base data schema. This configuration may be later changed either via the Web UI, or through the integration interface. Kaa takes care of delivering the up-to-date configuration to all endpoints that support the corresponding data schema version.
Base data schema
The base data schema is obtained by transforming the configuration data schema. As a result of such transformation:
- optional fields are transformed into unions with null type at the top of the type list;
- __uuid field of union type ["org.kaaproject.configuration.uuidT", "null"] is added to every record having addressable = true. In the example below, the org.kaaproject.configuration.uuidT type is declared in the nestedRecordT record type, and reused in rootT:
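As a minimal sketch of these two transformations, assuming the schema is held as a Python dict: the function names `to_base_record` and `to_base_type` are hypothetical, the declaration of the uuidT type itself is omitted (it is only referenced by its full name), and placing __uuid last in the field list is an assumption.

```python
UUID_UNION = ["org.kaaproject.configuration.uuidT", "null"]


def to_base_type(field_type):
    """Recursively transform records nested in a field type."""
    if isinstance(field_type, dict) and field_type.get("type") == "record":
        return to_base_record(field_type, is_root=False)
    if isinstance(field_type, dict) and field_type.get("type") == "array":
        return dict(field_type, items=to_base_type(field_type["items"]))
    if isinstance(field_type, list):
        return [to_base_type(branch) for branch in field_type]
    return field_type


def to_base_record(record, is_root=True):
    """Return a transformed copy of a record schema; the input is left untouched."""
    fields = []
    for field in record["fields"]:
        new_field = dict(field)
        field_type = to_base_type(field["type"])
        if field.get("optional", False):
            # Optional field: union with null at the top of the type list.
            branches = field_type if isinstance(field_type, list) else [field_type]
            field_type = ["null"] + [b for b in branches if b != "null"]
        new_field["type"] = field_type
        fields.append(new_field)
    # The root record ignores addressable = false; other records default to true.
    if is_root or record.get("addressable", True):
        fields.append({"name": "__uuid", "type": list(UUID_UNION)})
    return dict(record, fields=fields)
```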
Updating the configuration
The Kaa Control server exposes an API for loading configuration data into the group "all". The data set must be supplied in Avro binary or Avro JSON format and must validate against the base data schema of the version specified in the API call. The loaded data completely replaces the existing data (for that schema version) in the group "all".
When loading the new data set, Kaa persists the existing UUIDs for all addressable records and ignores the values supplied with the new configuration.
In the case of unions, this assumes that the record types in the old configuration and in the new configuration match. If they don't, Kaa generates a new UUID value for the replacement record.
In case of non-addressable records (those directly or indirectly contained in the arrays), Kaa performs record instances matching by comparing the UUID values in the old and the new configuration.
UUID validation is designed to prevent corruption of configuration data. It is performed when the API user or Kaa administrator loads configuration data to the server after updating the configuration data downloaded from the server. Kaa inspects the UUIDs of the new version of the configuration and matches them with the UUIDs of the previous version of the configuration. UUID validation is performed in two cases: when an existing object is updated and when a new object is created.
Updating an existing object (there is a UUID in the previous version of the configuration):
The following situations are possible:
- If UUID remains the same, the new configuration is stored with the existing UUID.
- If UUID is null or unknown and the object is a record, Kaa performs a search through the data schema comparing the field hierarchy to detect the correct object and sets the previous UUID.
- If UUID is null or unknown and the object is an array, Kaa generates a new UUID.
Creating a new object (no UUID in previous version of the configuration):
- If UUID is null, a new UUID is generated for the object and inserted into the database.
- If UUID is equal to a UUID that is already in use, or to an unknown value, a new UUID is generated.
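The validation rules above amount to a small decision table, sketched below in Python. The function `resolve_uuid` and its parameters are hypothetical. The search through the field hierarchy is assumed to have already identified the previous object, and a supplied UUID that differs from the stored one is treated the same way as an unknown UUID.

```python
import uuid


def resolve_uuid(previous, supplied, kind):
    """Pick the UUID to store for one object of the loaded configuration.

    previous: UUID stored for the matching object in the previous version,
              or None when the object is new.
    supplied: UUID found in the loaded data, or None.
    kind:     "record" or "array", as in the rules above.
    """
    if previous is not None:
        # Updating an existing object.
        if supplied == previous:
            return previous          # same UUID: keep it
        if kind == "record":
            return previous          # null or unknown UUID on a record: restore
        return uuid.uuid4()          # null or unknown UUID on an array: new UUID
    # Creating a new object: null, already used, and unknown values
    # all lead to a freshly generated UUID.
    return uuid.uuid4()
```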
Endpoints data synchronization
Once the Operations server has calculated the up-to-date configuration for the endpoint, it constructs an incremental delta update based on the knowledge of the prior configuration applied to that endpoint. In certain cases the delta update can contain a complete endpoint configuration. Having received the delta update, the endpoint merges it with the existing configuration data, notifies the client application about the update, and persists the resultant configuration. The data storage location is abstracted in the endpoint and is defined in the client implementation.
The Kaa server monitors the configuration updates and optimizes the data volume that needs to be delivered to the endpoints by calculating configuration deltas. Data consistency is ensured by the hash comparison between the endpoint and the server.
The protocol schema determines the structure of the data updates that the Kaa server sends to the endpoints. It is automatically generated by the Control server from the data schema in the process of loading. The protocol schema also determines the set of operations that can be applied to the data already known to the endpoint in order to bring it up to date. An update may carry either a full set of data (full sync) or a partial update (configuration delta). A full sync is delivered using a part of the protocol schema that is very similar to the base data schema.
The schema generator performs the following transformations to convert the data schema to a protocol schema:
- __uuid field of org.kaaproject.configuration.uuidT type is added to every addressable record. Having a UUID ensures that the record becomes addressable and a Kaa server can send partial updates against such records.
- Optional fields are transformed to unions with the original field type, null, and org.kaaproject.configuration.unchangedT types. The latter is a simple enum that is introduced to indicate no changes in the parameter value from the one already known to the endpoint. It has the following structure:
- Mandatory fields are transformed into unions with the original field type and org.kaaproject.configuration.unchangedT.
- Array fields are transformed into unions of:
  - an array of:
    - either the original array items' type, to add items to the array;
    - or org.kaaproject.configuration.uuidT, if the original item type is a UUID-addressable record;
  - org.kaaproject.configuration.unchangedT, to indicate no changes to the array contents;
  - org.kaaproject.configuration.resetT, to purge the array contents. Similarly to org.kaaproject.configuration.unchangedT, org.kaaproject.configuration.resetT is a simple enum:
The root record schema is wrapped into an array of org.kaaproject.configuration.deltaT records. The only field of the org.kaaproject.configuration.deltaT, delta is a union of the transformed data root record schema, and every UUID-addressable record type encountered in the root record schema. Such structure allows encoding delta updates to every UUID-addressable record in the schema, including the root record.
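The field-level part of these transformations can be sketched as follows. This is an illustration only: the function `protocol_field_type` and its `item_is_uuid_addressable` flag are hypothetical, the types are referenced by name without their declarations, the order of union branches follows the wording above and is otherwise an assumption, and neither the added __uuid field nor the deltaT wrapping is covered.

```python
UNCHANGED_T = "org.kaaproject.configuration.unchangedT"
RESET_T = "org.kaaproject.configuration.resetT"
UUID_T = "org.kaaproject.configuration.uuidT"


def protocol_field_type(field, item_is_uuid_addressable=False):
    """Return the protocol schema type for one data schema field."""
    field_type = field["type"]
    if isinstance(field_type, dict) and field_type.get("type") == "array":
        items = field_type["items"]
        if item_is_uuid_addressable:
            # Items may be new records (to add) or UUIDs (to remove).
            branches = items if isinstance(items, list) else [items]
            items = list(branches) + [UUID_T]
        return [{"type": "array", "items": items}, UNCHANGED_T, RESET_T]
    # Flatten an existing union so that unions are never nested.
    branches = list(field_type) if isinstance(field_type, list) else [field_type]
    if field.get("optional", False):
        return [b for b in branches if b != "null"] + ["null", UNCHANGED_T]
    return branches + [UNCHANGED_T]


print(protocol_field_type({"name": "period", "type": "int"}))
# ['int', 'org.kaaproject.configuration.unchangedT']
```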
For example, observe transformation of the following record:
|Data schema||Protocol schema (transformed)|
- how __uuid was added to the org.kaaproject.sample.addressableRecordT, but not org.kaaproject.sample.primitiveRecordT;
- reference to org.kaaproject.sample.addressableRecordT in the union definition for the delta field of the org.kaaproject.configuration.deltaT type;
- org.kaaproject.configuration.uuidT is used in a union to indicate removal of UUID-addressable records in /arrayOfRecords, but not in /arrayOfPrimitives.
Delta calculation algorithm
The delta calculation algorithm is executed by the Operations server to find the difference between the configuration last known to the endpoint and the up-to-date one, and to represent that difference using the protocol schema. Both configurations conform to the same data schema version.
In order to know the configuration currently provisioned into the endpoint, the Kaa server maintains a hash lookup table that maps the SHA-1 hash of each configuration to the corresponding configuration. Whenever the Operations server calculates a configuration with a unique hash, it adds the configuration to the table, which is persisted in the database and shared across the Operations servers. The endpoint submits the hash of its last known configuration to the Operations server for the table lookup.
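A minimal in-memory sketch of such a lookup table is shown below. The class `ConfigurationCache` is hypothetical, and hashing a canonical JSON serialization is an assumption made to keep the example self-contained; the sketch neither persists the table nor shares it between servers.

```python
import hashlib
import json


class ConfigurationCache:
    """Maps the SHA-1 hash of a configuration to the configuration itself."""

    def __init__(self):
        self._by_hash = {}

    @staticmethod
    def hash_of(configuration):
        # Assumption: a canonical JSON serialization stands in for whatever
        # byte representation the server actually hashes.
        payload = json.dumps(configuration, sort_keys=True).encode("utf-8")
        return hashlib.sha1(payload).digest()

    def put(self, configuration):
        """Store a configuration (if not yet known) and return its hash."""
        digest = self.hash_of(configuration)
        self._by_hash.setdefault(digest, configuration)
        return digest

    def lookup(self, digest):
        """Return the configuration for a hash reported by an endpoint, or None."""
        return self._by_hash.get(digest)
```

In this sketch, a `lookup` that returns `None` would correspond to the server not knowing the endpoint's configuration.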
Having found the configuration currently known to the endpoint via the hash lookup, the Operations server starts with the root record and recursively, field by field, looks for discrepancies between the two configuration sets. Once it finds a difference, it looks up the nearest UUID-addressable record in the addressing hierarchy and creates a delta record for it. Having done so, the algorithm proceeds with comparing the corresponding records in the two data sets and filling in the delta record. When the fields are identical, it uses the unchanged value of the delta schema field to indicate so. Otherwise, the value from the newer data set is copied into the delta record.
Such traversal works for all addressable fields in the configuration. However, items in the arrays are not addressable and Kaa employs special handling logic for calculating deltas for the array fields.
- If the array items are UUID-addressable, the calculator looks for items with matching UUIDs in the two configuration sets and creates a separate delta for such objects. If the new configuration does not have some of the records, their UUIDs are added to the "remove" list for this field in the protocol schema. If all of the old records are absent in the new configuration, instead of listing all of UUIDs in the "remove" list, the calculator uses the "reset" value instead. The records that did not exist in the configuration known to the endpoint are added to the delta message for appending to the array on the client side.
- If array items are not UUID-addressable and the calculator detects any difference between the old and new array contents, the old contents are reset, and the new contents are sent together with the configuration delta.
- If array items have different types in the old and new configurations, the server resets the array first and then populates another delta message with the most recent items.
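The handling of arrays with UUID-addressable items can be sketched as follows. The function `array_delta` and the shape of its result are hypothetical. Small integers stand in for 16-byte UUIDs, and a changed item is returned whole rather than as a field-level delta.

```python
RESET = "reset"


def array_delta(old_items, new_items):
    """Compare two arrays of UUID-addressable records (dicts with "__uuid")."""
    old_by_id = {item["__uuid"]: item for item in old_items}
    new_by_id = {item["__uuid"]: item for item in new_items}

    # Records absent from the new configuration go to the "remove" list.
    removed = [u for u in old_by_id if u not in new_by_id]
    # Records unknown to the endpoint are appended on the client side.
    added = [item for item in new_items if item["__uuid"] not in old_by_id]
    # Records present in both sets get a separate delta when they differ.
    updated = [new_by_id[u] for u in old_by_id
               if u in new_by_id and new_by_id[u] != old_by_id[u]]

    # If every old record is gone, use "reset" instead of listing all UUIDs.
    if old_by_id and len(removed) == len(old_by_id):
        removed = RESET
    return {"remove": removed, "add": added, "update": updated}


old = [{"__uuid": 1, "testField4": 1}, {"__uuid": 3, "testField4": 3}]
new = [{"__uuid": 3, "testField4": 36}, {"__uuid": 4, "testField4": 4}]
print(array_delta(old, new))
# {'remove': [1], 'add': [{'__uuid': 4, 'testField4': 4}],
#  'update': [{'__uuid': 3, 'testField4': 36}]}
```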
For example consider the following schema:
Operations server calculates the delta between the two configuration sets:
|The last known to the endpoint (old)||The latest on the server (new)|
Compared to the endpoint configuration, the server has:
- "testField5" reset to the default value;
- the element with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3] has a new value for the "testField4" field;
- the element with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] was removed;
- a new element with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4] was added.
The calculated delta update is an array with three entries:
The first delta item is a partial update for the object with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3], where the "testField4" field was changed from "3" to "36". The second one is an update for the "testField3" field that removes the item with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] (note: "testField1" and "testField5" were set to "unchanged"). The last one changes "testField5" to null and adds a new array item with UUID [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4].
Group-specific configuration
Kaa supports configuration data customization that is based on the endpoint groups membership. The data schema for the customized data remains the same. In order to adjust the data for specific endpoints, Kaa allows assigning configuration overrides to the endpoint groups. The overrides only apply to the endpoints that belong to the group in question. The Kaa Operations server applies the overrides to the data in group "all" before serving the resulting data to the endpoints, and does so transparently to the endpoints.
The override data structure is governed by a dedicated derivative schema called the override data schema.
Override data schema
The override data schema determines the structure of the data that allows changing arbitrary fields in the data that conforms to the base schema. The override data schema is derived from the configuration data schema and is very similar to the base data schema. The difference is that, when constructing an override schema, Kaa adds org.kaaproject.configuration.unchangedT to every field type definition (converting the field types into union types, if necessary). Thus, all mandatory fields become union types with their original type and org.kaaproject.configuration.unchangedT. When org.kaaproject.configuration.unchangedT appears in the data, it indicates that the parameter value is not overridden compared to the base data.
For example, a mandatory field
|In the data schema||In the override schema (transformed)|
A mandatory field:
Loading the override data into the endpoint groups is similar to loading base data into the group "all". Among the other parameters, the group ID must be passed to the API call to indicate the group the data is intended for. The loading algorithm processes the record UUID values identically to how it is done for group "all", persisting the UUID values that already existed in the previous version of the data in the processed group. Loading is done for the group in question independently of any other groups and without any cross-group data lookups.
In order to construct the configuration view for the endpoint, the Operations server evaluates group membership according to the endpoint profile. Then it merges all the configurations assigned to the groups the endpoint belongs to, starting with the one that has the lowest weight (which is always group "all" with 0 weight). Any overlapping field values are replaced with the ones from the group with the higher weight.
In case of the arrays, the overrideStrategy field in the configuration schema defines the way in which the arrays are merged.
Record UUID fields never change from the values in the lowest weight group they were first encountered in.
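The merge described above can be sketched as follows. The names `endpoint_view`, `merge_records`, and the `UNCHANGED` sentinel are hypothetical. Configurations are plain Python dicts, and overrideStrategy values are looked up by field name in a flat `strategies` mapping, which is a simplification of reading them from the schema.

```python
UNCHANGED = object()  # stands in for org.kaaproject.configuration.unchangedT


def merge_records(base, override, strategies):
    """Apply one override record on top of a base record; returns a new dict."""
    result = dict(base)
    for name, value in override.items():
        if value is UNCHANGED:
            continue  # no override for this field
        if name == "__uuid" and "__uuid" in result:
            continue  # UUIDs keep the value from the lowest weight group
        current = result.get(name)
        if isinstance(value, dict) and isinstance(current, dict):
            result[name] = merge_records(current, value, strategies)
        elif (isinstance(value, list) and isinstance(current, list)
              and strategies.get(name, "replace") == "append"):
            result[name] = current + value
        else:
            result[name] = value  # the higher weight group wins
    return result


def endpoint_view(group_configs, strategies):
    """group_configs: non-empty list of (weight, configuration) pairs."""
    ordered = sorted(group_configs, key=lambda pair: pair[0])
    view = ordered[0][1]  # the lowest weight, i.e. group "all" with weight 0
    for _weight, override in ordered[1:]:
        view = merge_records(view, override, strategies)
    return view


group_all = {"__uuid": "u-all", "period": 10, "servers": ["a"]}
group_hi = {"__uuid": "u-hi", "period": UNCHANGED, "servers": ["b"]}
print(endpoint_view([(5, group_hi), (0, group_all)], {"servers": "append"}))
# {'__uuid': 'u-all', 'period': 10, 'servers': ['a', 'b']}
```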
Both profile schema and configuration schema versions the endpoint supports are taken into account during the merge.
Schema versioning
In the current version of Kaa, updates to the configuration schema require updates of the client application.
In order to enable server backwards compatibility with the older clients as the data schema evolves, Kaa servers may maintain more than one configuration schema version. A new, sequential version number is automatically assigned to every data schema loaded into a Kaa server. Configuration data is managed independently for every schema version. The endpoints report their supported configuration data schema version in the service profile. Kaa server, in response, serves them with the data updates that correspond to the reported version.
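The versioning behavior can be illustrated with a small in-memory registry. The class `SchemaRegistry` and its methods are hypothetical, and starting the numbering at 1 is an assumption; the text above only states that version numbers are sequential.

```python
class SchemaRegistry:
    """Keeps configuration data separately for every loaded schema version."""

    def __init__(self):
        self._schemas = {}
        self._data = {}

    def load_schema(self, schema):
        """Register a schema and return its automatically assigned version."""
        version = len(self._schemas) + 1  # sequential; first value is assumed
        self._schemas[version] = schema
        self._data[version] = None
        return version

    def set_configuration(self, version, configuration):
        """Replace the configuration data managed for one schema version."""
        if version not in self._schemas:
            raise KeyError("unknown schema version: %r" % version)
        self._data[version] = configuration

    def configuration_for(self, reported_version):
        """Return the data matching the version an endpoint reports."""
        return self._data[reported_version]
```

An older client that reports version 1 keeps receiving the data stored for version 1, even after a second schema version has been loaded and configured.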