An action describes a task that must be performed by a component of the larger system.
Each action progresses from start to finish through a series of states.
More details about the states an action goes through are documented in the
Developers Notebook.
Actions are scheduled by applying a YAML object using the replictl tool
(yes, this intentionally mirrors the kubernetes.io approach):
$ replictl apply -f path/to/action.yaml
Object applied successfully
The YAML object depends on what kind of action you are scheduling.
Node actions (also known as Agent actions) are defined in the specification and are executed on specific nodes, generally by Replicante Agents.
The YAML object for agent actions has the following specification:
apiVersion: replicante.io/v0
kind: NodeAction
metadata:
  # Can override the namespace with --namespace=test-namespace
  namespace: default
  # Can override the cluster with --cluster=test-cluster
  cluster: target-cluster-id
  # Can override the node with --node=test-node
  node: target-node-id
spec:
  # Trigger a debug action that executes two dummy steps and then successfully completes.
  action: agent.replicante.io/debug.progress
  # Pass additional arguments as structured data.
  args:
    options: 'available options change based on the action'
    format: 'any structured YAML object is fine'
Node actions require approval for scheduling by default when they are applied.
This means that actions will NOT be scheduled until they are approved using replictl or the API.
Requiring approval for an action means that an action can be created without
executing it and someone else can approve it after review.
The action executed on approval is exactly the one applied with no change allowed.
To approve an action with replictl:
# Approve a node action for execution.
# It will be scheduled the next orchestration cycle for the cluster.
$ replictl action approve-node-action UUID
Action approved for scheduling
# Node actions can also be disapproved and thus cancelled.
$ replictl action disapprove-node-action UUID
Action disapproved and will not be scheduled
To skip the approval step and schedule an action as soon as possible after it is applied,
you can set the approval metadata attribute to granted.
metadata:
  # Don't require explicit approval before the action is scheduled.
  approval: granted
Sometimes actions operate on or impact multiple nodes or the full cluster. These actions are generally about orchestrating changes to the cluster or day-to-day operations. They do not have to be about orchestrating work, but that is the most common case.
To support these use cases Replicante Core provides Orchestrator Actions. These are actions that are executed outside of the datastore they target and at the control plane level (either as part of Replicante Core or as a stateless service invoked by Core).
apiVersion: replicante.io/v0
kind: OrchestratorAction
metadata:
  # Can override the namespace with --namespace=test-namespace
  namespace: default
  # Can override the cluster with --cluster=test-cluster
  cluster: target-cluster-id
spec:
  # Trigger a debug action that executes two dummy steps and then successfully completes.
  action: core.replicante.io/debug.ping
  # Pass additional arguments as structured data.
  args:
    options: 'available options change based on the action'
    format: 'any structured YAML object is fine'
Scheduling approval for orchestrator actions works the same way as it does for node actions.
The only difference is the replictl command (and API endpoint) used to approve actions:
# List orchestrator actions to see which ones still need approval.
$ replictl action list-orchestrator-actions
CLUSTER ID ACTION ID KIND STATE CREATED FINISHED
dev-agent-zookeeper f3bab556-d25f-4d06-90e9-63a5793dd083 core.replicante.io/debug.ping PENDING_APPROVE 2022-06-19 11:34:45.256 UTC
# Approve an orchestrator action for execution.
# It will be scheduled the next orchestration cycle for the cluster.
$ replictl action approve-orchestrator-action UUID
Orchestrator action approved for scheduling
# Orchestrator actions can also be disapproved and thus cancelled.
$ replictl action disapprove-orchestrator-action UUID
Orchestrator action disapproved and will not be scheduled
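When many actions are listed, it can help to extract the IDs that still await approval before passing them to the approve command. The sketch below is an illustration only: it assumes the listing is plain text shaped like the sample output above (one action per line, a UUID action ID, and the PENDING_APPROVE state), and the function name `pending_approval_ids` is hypothetical.

```python
import re

# Matches a lowercase, hyphenated UUID such as the ACTION ID column above.
UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"
)


def pending_approval_ids(listing):
    """Return the action IDs of lines whose state is PENDING_APPROVE."""
    ids = []
    for line in listing.splitlines():
        if "PENDING_APPROVE" not in line:
            continue  # Skip the header and actions in any other state.
        match = UUID_RE.search(line)
        if match:
            ids.append(match.group(0))
    return ids


# Sample text copied from the listing shown above.
sample = (
    "CLUSTER ID ACTION ID KIND STATE CREATED FINISHED\n"
    "dev-agent-zookeeper f3bab556-d25f-4d06-90e9-63a5793dd083 "
    "core.replicante.io/debug.ping PENDING_APPROVE 2022-06-19 11:34:45.256 UTC\n"
)
print(pending_approval_ids(sample))
# prints ['f3bab556-d25f-4d06-90e9-63a5793dd083']
```

Each returned ID can then be reviewed and passed as the UUID argument to the approve or disapprove command.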
When Replicante Core schedules actions it follows defined rules around which actions can be scheduled, when and where.
The aim of actions is to change the state of the system. Running multiple actions at the same time is therefore risky, because concurrent changes may pull the system in conflicting directions. On the other hand, many activities can be safely performed while other changes are happening.
Replicante Core defines a strict set of rules around action scheduling to ensure things behave as expected:
Rule 1 exists mainly for safety and simplicity:
As for orchestrator action scheduling modes: no, you cannot choose the mode. Scheduling modes are a property of actions and not of action invocations. If a task is not safe to perform in parallel with others it is never safe to do so, not just sometimes.
The exception to this would be running actions in more restrictive modes, which may be supported in the future.
To enforce the above rules Replicante Core will schedule actions only when no higher priority action is waiting to be scheduled. Additionally, running actions are taken into account to decide if scheduling is allowed.
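The two checks described above (no higher priority action waiting, and no conflicting action running) can be sketched as a single decision function. This is an illustration, not Replicante Core's implementation: the `Action` type, the numeric `priority` field, and the `blocked_by_running` mapping are all hypothetical. The example mapping encodes only one relationship, reading the table on this page as saying that a running Exclusive orchestrator action blocks every kind of action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    mode: str      # "Node", "Exclusive" or "ClusterExclusiveNodeParallel"
    priority: int  # Hypothetical: a larger value means higher priority.


def can_schedule(candidate, running, pending, blocked_by_running):
    """Decide whether `candidate` may be scheduled right now.

    `blocked_by_running` maps the mode of a running action to the set of
    modes it prevents from being scheduled (an assumed representation).
    """
    # Check 1: wait if any higher priority action is still pending.
    if any(other.priority > candidate.priority for other in pending):
        return False
    # Check 2: wait if any running action blocks the candidate's mode.
    for active in running:
        if candidate.mode in blocked_by_running.get(active.mode, set()):
            return False
    return True


# Only the relationship assumed in the lead-in is encoded here.
rules = {"Exclusive": {"Node", "Exclusive", "ClusterExclusiveNodeParallel"}}

node = Action(mode="Node", priority=1)
exclusive = Action(mode="Exclusive", priority=1)

print(can_schedule(node, running=[], pending=[], blocked_by_running=rules))
print(can_schedule(node, running=[exclusive], pending=[], blocked_by_running=rules))
```

The first call allows scheduling because nothing is running or pending; the second refuses because an Exclusive orchestrator action is running.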
In the table below:
Node | Orchestrator (Exclusive) | Orchestrator (ClusterExclusiveNodeParallel) | |
---|---|---|---|
[Running] Node | X | ||
[Running] Orchestrator (Exclusive) | X | X | X |
[Running] Orchestrator (ClusterExclusiveNodeParallel) | X | X | |
[Pending] Node | X | ||
[Pending] Orchestrator (Exclusive) | X | ||
[Pending] Orchestrator (ClusterExclusiveNodeParallel) |
The ClusterExclusiveNodeParallel orchestrator action mode is planned but not currently in use.