Alpha state disclaimer
The specification defined below is in an early development cycle and is subject to (potentially breaking) change.
Any software (or collection of software) that fulfils the attributes and behaviours of the specification defined in this document is considered a data store.
The aims of this specification are:
Attributes are observable properties we expect the data store to expose with no side effects. They can be fixed at cluster/node creation or they can change over time.
The expectation is for attributes to be cheap to look up and to not require connections outside of the target node.
Where noted, attributes are optional and don’t have to be available. This is particularly important when data stores could provide this information but only by violating the expectations above, in which case optional attributes should be omitted.
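As an illustration of these expectations, the sketch below models attribute lookups as reads of state the agent already holds, with optional attributes simply omitted when they cannot be provided cheaply. It is a minimal sketch: the type and field names are assumptions for the example, not part of the specification.

```rust
/// Illustrative only: attributes are served from state the agent already
/// holds, so a lookup is cheap, local to the node and free of side effects.
pub struct AttributesSnapshot {
    /// Fixed at cluster/node creation.
    pub cluster_id: String,
    pub node_id: String,
    /// Changes over time as the node operates.
    pub shards_on_node: Vec<String>,
    /// Optional attribute (hypothetical example): reported as None (omitted)
    /// when the data store could only provide it by violating the
    /// expectations above.
    pub store_uptime_seconds: Option<i64>,
}

impl AttributesSnapshot {
    /// Looking up an attribute never requires connections outside the node.
    pub fn node_id(&self) -> &str {
        &self.node_id
    }
}
```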
Attribute | Description |
---|---|
Cluster ID | ID of the cluster; MUST be the same for all nodes in the same cluster. |
Node ID | Unique ID of the node within the cluster. |
Agent Version | Version of the agent process, in SemVer format. |
Data Store Node Version | Version of the data store process for the node, in SemVer format. |
Node's Shard List | List of all the shards (data units, see the sharding behaviour) currently on the node. |
Node Status | Each node in the cluster has a status attribute: |
Shard ID | For each shard managed by the node, a cluster-unique ID is provided for the shard. All nodes in the cluster refer to the same shard with the same cluster-unique ID. |
Shard replication state | Each independent shard managed by the node has its own replication state attribute. Each shard can be in one of the following states: |
Shard commit offset |
As changes to the data are applied to a shard, a marker for the shard should be updated. This marker is the commit offset and represents the last persisted write operation on the shard. A commit offset can be any monotonically increasing signed integer. The only time a commit offset can decrease is in the presence of a lost write, such as:
In any case a lost write indicates that a write reported as successful to the client has been lost. This excludes cases such as transaction rollbacks due to constraint violations, as these writes are not reported as successful to the clients. Examples of commit offsets are:
When shard commit offsets are reported, a unit (seconds, commit number, …) should be reported alongside. It is expected that the unit of a shard commit offset does not change. This attribute is optional. |
Shard replication lag |
For non-primary shards on the node the replication lag attribute SHOULD be provided if it can be looked up or computed efficiently and without requiring access to other nodes. Replication lag is a signed 64-bit integer that represents the gap between the latest change applied to the shard and the latest change applied on the shard’s primary. For example this could be:
The unit of measurement MUST be reported alongside the replication lag itself. For example this could be:
This attribute is optional. |
Agent Action Invocation Records |
Agent Actions, described below in the behaviours, run on the nodes. An invocation record is created every time an action runs on an agent to track its progress. Invocation records have the following properties specified on creation:
Invocation records have the following properties added by agents: |
Node Attributes Maps |
Nodes MUST expose custom attributes maps: one for information known only when the store process is running and another for information available even without the store process. These maps allow agent implementations to expose arbitrary named attributes. These attributes can be used in Replicante Core to implement logic, match nodes and more. The attribute names are mapped to values of selected types:
Attributes should be scoped to ensure different implementations don’t clash.
Examples: |
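To tie the shard and custom-attribute rows above together, here is a minimal sketch of the data shapes they describe. The struct and field names are illustrative assumptions, value types in the custom maps are simplified to strings, and none of this is mandated by the specification.

```rust
use std::collections::HashMap;

/// Illustrative per-shard attributes as described in the table above.
pub struct ShardInfo {
    /// Cluster-unique shard ID: all nodes refer to this shard by this ID.
    pub id: String,
    /// Commit offset: a monotonically increasing signed integer reported
    /// together with its unit (for example seconds or commit number). Optional.
    pub commit_offset: Option<(i64, String)>,
    /// Replication lag for non-primary shards: a signed 64-bit integer plus
    /// its unit of measurement. Optional.
    pub replication_lag: Option<(i64, String)>,
}

/// The two custom attribute maps: one only available while the data store
/// process is running and one available even without it. Names should be
/// scoped (for example with a domain-style prefix) so implementations
/// don't clash; the example key below is hypothetical.
pub struct CustomAttributes {
    pub with_store: HashMap<String, String>,
    pub without_store: HashMap<String, String>,
}

fn custom_attributes_example() -> CustomAttributes {
    let mut without_store = HashMap::new();
    without_store.insert(
        "example.agent/data.path".to_string(),
        "/var/lib/store".to_string(),
    );
    CustomAttributes {
        with_store: HashMap::new(),
        without_store,
    }
}
```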
Behaviours are things we expect the data store to perform during operation to fulfil its duties. These cover both overall architectural expectations (such as replication and sharding) as well as more specific actions (such as cluster initialisation and node management).
Behaviour | Description |
---|---|
Clustering |
Data stores MUST support clustering:
It is worth noting there is no requirement for nodes to match each other. This means heterogeneous clusters (nodes with different configuration or software) are supported. |
Dynamic Nodes Membership |
Data stores MUST support dynamically adding and removing nodes from a cluster without interrupting operations on the nodes already in the cluster. The implementation of dynamic node membership is data store specific but it MUST respect the properties below.
New node provisioning: when a new node is provisioned it will need to join a cluster before it is fully initialised. A node can join a cluster in one of the following ways:
Existing node deprovisioning: when a current member of the cluster is deprovisioned it must be forgotten by the cluster it was a member of. Nodes may be terminated unexpectedly as a result of errors, or deprovisioned while network partitions prevent them from communicating with the rest of the cluster. For these reasons all clusters MUST provide an agent action to remove and forget a node. The action runs on an existing node in the cluster and MUST be idempotent: a cluster may be asked to remove a node that is not part of it, in which case the action does nothing. |
Replication |
The data store MUST support data replication across nodes. Data units, as defined by the sharding behaviour, are replicated independently of each other. Each data unit MUST have at least one primary node during regular operations. Additional replication states for data units are defined in the attributes section. |
Sharding of Data |
The data store MUST organise data into one or more independent units called shards. Shard independence from other shards means:
It is valid to treat data stores that do not implement sharding as data stores with one shard only. |
Shards Automated Failover |
A failover is an operation where the node currently holding a shard’s PRIMARY stops acting as such and the nodes holding SECONDARY copies of the same shard select a new PRIMARY to replace it. Data stores MUST automatically detect issues with nodes holding PRIMARY shards and perform a failover operation to a SECONDARY for each affected shard. Data stores SHOULD provide an administrative command to perform a voluntary failover. |
Agent Actions |
Agent actions are the execution unit on which any automation is built. The agent/data store node is responsible for tracking and executing these actions. Agents allow clients to schedule as many actions as they like and can start rejecting actions if too many actions are scheduled and have not been processed yet. Agents MUST execute only one action at a time. Actions MUST be executed in the order they were successfully scheduled with the agent (a minimal sketch of these scheduling semantics follows this table). Agents can provide implementations for any actions they choose on top of the actions required by this specification document. Agents SHOULD document the actions they provide, their arguments and outputs. |
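As a minimal sketch of the scheduling semantics just described, the snippet below models an agent's backlog as a bounded FIFO queue: scheduling can be rejected once too many actions are pending, and execution picks exactly one action at a time in scheduling order. The names and the backlog limit are illustrative assumptions, not part of the specification.

```rust
use std::collections::VecDeque;

/// Illustrative action backlog: actions are accepted in FIFO order and
/// executed strictly one at a time, in the order they were scheduled.
pub struct ActionQueue {
    pending: VecDeque<String>, // scheduled action IDs, oldest first
    max_pending: usize,        // hypothetical backlog limit
}

impl ActionQueue {
    pub fn new(max_pending: usize) -> Self {
        ActionQueue { pending: VecDeque::new(), max_pending }
    }

    /// Scheduling may be rejected if too many actions are waiting.
    pub fn schedule(&mut self, action_id: String) -> Result<(), String> {
        if self.pending.len() >= self.max_pending {
            return Err(format!("too many pending actions, rejecting {}", action_id));
        }
        self.pending.push_back(action_id);
        Ok(())
    }

    /// Only one action runs at a time: the next one is picked in FIFO order.
    pub fn next_to_run(&mut self) -> Option<String> {
        self.pending.pop_front()
    }
}
```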
Some behaviours are expected through specific agent actions. These enable building automation:
Action | Description |
---|---|
Action: Cluster Initialisation
agent.replicante.io/cluster.init |
Initialise an uninitialised cluster. What this means exactly is dependent on the data store. Action arguments: refer to each data store/agent documentation. Action final state: refer to each data store/agent documentation. Maybe Required: this action is required for data stores that require an explicit cluster initialisation step. |
Action: Add Node
agent.replicante.io/cluster.add |
Add a new node into an existing cluster. This action is run on a node that is already part of the cluster to add another node to it. Action arguments: refer to each data store/agent documentation. Action final state: refer to each data store/agent documentation. Maybe Required: this action is required for data stores where new nodes are added to clusters from existing nodes. |
Action: Join Cluster
agent.replicante.io/cluster.join |
Have a new node join an existing cluster. This action is run on a new node that is not part of the cluster and will add itself to it. Action arguments: refer to each data store/agent documentation. Action final state: refer to each data store/agent documentation. Maybe Required: this action is required for data stores where new nodes add themselves to existing clusters. |
Action: Remove Node
agent.replicante.io/cluster.remove |
Remove a node from an existing cluster. This action is run on a node still in the cluster to remove and forget another node. The node to remove may have already been terminated and/or removed from the cluster when the action is called; the action therefore MUST be idempotent. Action arguments: refer to each data store/agent documentation. Action final state: refer to each data store/agent documentation. Maybe Required: this action is required for data stores where terminated nodes are explicitly removed from the cluster. |
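To illustrate how the required action kinds above might be wired up, here is a minimal dispatch sketch. The handler signature, the membership list and the no-op bodies are assumptions for the example; real agents define their own arguments and final states.

```rust
/// Illustrative dispatch over the action kinds listed in the table above.
fn handle_action(kind: &str, target_node: &str, members: &mut Vec<String>) -> Result<(), String> {
    match kind {
        "agent.replicante.io/cluster.init" => {
            // Data store specific cluster initialisation would go here.
            Ok(())
        }
        "agent.replicante.io/cluster.add" | "agent.replicante.io/cluster.join" => {
            // Add/join logic is data store specific; sketched as a no-op here.
            Ok(())
        }
        "agent.replicante.io/cluster.remove" => {
            // Idempotent removal: if the node is not (or no longer) a member,
            // the action succeeds without doing anything.
            members.retain(|node| node != target_node);
            Ok(())
        }
        other => Err(format!("unsupported action kind: {}", other)),
    }
}
```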