Edge Core AsyncAPI — v0.2.0¶
Event schema reference for all lifecycle events published by Edge Admin.
Interactive viewer: /asyncdoc on a running admin. Raw spec: GET /api/asyncapi.
Overview¶
Edge Admin publishes lifecycle events to a configured message broker (NATS, Kafka/Redpanda, AMQP 0-9-1 (RabbitMQ-compatible), Redis, MQTT, AWS SNS, or Google Cloud Pub/Sub). All events follow the CloudEvents 1.0 spec. Edge Admin publishes and forgets — it has no knowledge of consumers.
Event Envelope¶
Every event is wrapped in a CloudEvents envelope:
```json
{
  "specversion": "1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "source": "https://github.com/wenet-ec/edge-core",
  "type": "edge.node.registered",
  "time": "2026-04-14T10:00:00Z",
  "datacontenttype": "application/json",
  "corename": "prod-us",
  "data": { ... }
}
```
| Field | Description |
|---|---|
| `specversion` | Always `"1.0"` |
| `id` | UUID v4 — unique per publish. Useful for exactly-once delivery dedup (broker retries). Not useful for semantic dedup of `node.status_changed` duplicates — those carry different ids. Use `(node_id, previous_status, status, time)` for that. |
| `source` | Always `"https://github.com/wenet-ec/edge-core"` |
| `type` | Event type — doubles as NATS subject, RabbitMQ routing key, Redis channel, and MQTT topic (with `.` rewritten to `/` for MQTT — see tables below) |
| `time` | When the state change happened in admin (ISO 8601) |
| `datacontenttype` | Always `"application/json"` |
| `corename` | CloudEvents extension. Identifies the publishing core instance. Set via `CORE_NAME` env var (default: `"default"`) |
| `data` | Full object snapshot at the moment of the event (see schemas below) |
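The two dedup strategies described for `id` can be sketched as key extractors, assuming a consumer that holds each parsed envelope as a dict:

```python
def retry_dedup_key(envelope: dict) -> str:
    # Broker redeliveries of the same publish carry the same id.
    return envelope["id"]

def status_change_dedup_key(envelope: dict) -> tuple:
    # Duplicate node.status_changed events from independent admin instances
    # carry different ids; dedup them on the transition itself.
    data = envelope["data"]
    return (data["node_id"], data["previous_status"], data["status"], envelope["time"])
```

A consumer would keep a bounded set of seen `retry_dedup_key` values for redeliveries, and a separate set of `status_change_dedup_key` tuples for status events.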
Subjects / Topics¶
NATS¶
The type value is also the NATS subject. Subscription examples:
```
edge.node.>                        ← all node events
edge.node.status_changed           ← only status transitions (server-side filter)
edge.command_execution.completed   ← only completed executions
edge.>                             ← everything
```
JetStream mode (EVENT_BROKER_NATS_JETSTREAM=true): durable streams (one per domain) are auto-created on startup:
- Stream `EDGE_NODES_EVENTS` — captures `edge.node.>` + `edge.enrollment_key.>`
- Stream `EDGE_COMMANDS_EVENTS` — captures `edge.command_execution.>`
- Stream `EDGE_SELF_UPDATES_EVENTS` — captures `edge.self_update_request.>`
- Stream `EDGE_SSH_EVENTS` — captures `edge.ssh_username.>`
Retention is configured on the NATS server, not by Edge Core.
Pub/sub mode (default): messages are delivered to active subscribers only — no persistence. Missed messages are gone.
Server version floor: pub/sub mode works against any NATS server (core PUB stable since 1.x); JetStream mode requires NATS 2.2+ (July 2021). The adapter uses only baseline stream-create fields (name, subjects, storage), no version-gated extensions.
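The subject wildcard rules used in the examples above can be sketched as a matcher. This is a minimal illustration of NATS semantics (`*` matches exactly one token, `>` matches one or more trailing tokens), not the NATS client's implementation:

```python
def nats_subject_matches(pattern: str, subject: str) -> bool:
    # '*' matches exactly one dot-separated token; '>' matches one or more
    # trailing tokens and is only valid as the last token of the pattern.
    p_tokens, s_tokens = pattern.split("."), subject.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            return len(s_tokens) > i  # '>' must match at least one token
        if i >= len(s_tokens):
            return False
        if token != "*" and token != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)
```

Note that `edge.node.>` does not match the bare subject `edge.node` — `>` requires at least one further token.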
Kafka / Redpanda¶
Four topics, one per domain:
| Topic | Partition key |
|---|---|
| `edge-nodes-events` | `node_id` (or `enrollment_key_id` for enrollment events) |
| `edge-commands-events` | `command_execution_id` |
| `edge-self-updates-events` | `self_update_request_id` |
| `edge-ssh-events` | `node_id` (verifications partition by the node attempting auth) |
Partition key ensures ordering per entity, parallel across entities. Filter by event type using the type field in the envelope.
Server version floor: Kafka 0.10+ (released Feb 2017) and any wire-compatible broker (Redpanda, Confluent Cloud, Aiven, AWS MSK, Azure Event Hubs, Upstash, etc.). The brod client auto-negotiates the per-API protocol version on connect.
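Under this keying scheme, a consumer can predict which entity an event is ordered against. A sketch of the key selection (our own illustration derived from the table above, not the adapter's code):

```python
def kafka_partition_key(envelope: dict) -> str:
    # Mirrors the documented per-topic partition keys.
    event_type, data = envelope["type"], envelope["data"]
    if event_type.startswith("edge.enrollment_key."):
        return data["enrollment_key_id"]
    if event_type.startswith(("edge.node.", "edge.ssh_username.")):
        return data["node_id"]
    if event_type.startswith("edge.command_execution."):
        return data["command_execution_id"]
    if event_type.startswith("edge.self_update_request."):
        return data["self_update_request_id"]
    raise ValueError(f"unknown event type: {event_type}")
```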
AMQP 0-9-1 (RabbitMQ-compatible)¶
Adapter id: amqp091 (alias: rabbitmq). Works against any AMQP 0-9-1 broker — RabbitMQ, LavinMQ, AmazonMQ for RabbitMQ, CloudAMQP.
All events are published to a single durable topic exchange: edge.events. The routing key is the event type (e.g. edge.node.registered).
Binding examples:
```
edge.node.*                ← all node events
edge.node.status_changed   ← only status transitions
edge.command_execution.#   ← all command execution events
edge.#                     ← everything
```
Consumer queue durability is the consumer's choice — bind a durable queue to persist messages across restarts, or a transient queue for live-only consumption. Edge Core publishes with persistent: true (messages written to disk before broker ACKs).
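The binding examples follow AMQP 0-9-1 topic-exchange matching, where `*` matches exactly one dot-separated word and `#` matches zero or more. An illustrative matcher (not the broker's implementation):

```python
def amqp_binding_matches(binding_key: str, routing_key: str) -> bool:
    # AMQP 0-9-1 topic-exchange semantics: '*' = exactly one word,
    # '#' = zero or more words, words separated by '.'.
    def match(b, k):
        if not b:
            return not k
        if b[0] == "#":
            # '#' may consume zero words (skip it) or one word (recurse on k)
            return match(b[1:], k) or (bool(k) and match(b, k[1:]))
        if not k:
            return False
        return (b[0] == "*" or b[0] == k[0]) and match(b[1:], k[1:])
    return match(binding_key.split("."), routing_key.split("."))
```

Note the difference from NATS: `edge.#` matches the bare key `edge` (zero extra words), whereas NATS `edge.>` would not.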
Redis¶
Channel = event type (e.g. edge.node.registered). Subscribe using SUBSCRIBE for exact channels or PSUBSCRIBE for wildcard patterns:
```
SUBSCRIBE edge.node.registered   ← exact channel
PSUBSCRIBE edge.node.*           ← all node events
PSUBSCRIBE edge.*                ← everything
```
No persistence or replay — messages are delivered to currently connected subscribers only. If no subscriber is connected when Core publishes, the message is gone. Pick Redis only when consumers are always-on and loss is acceptable.
Server version floor: Redis 2.0+ (Aug 2010, when Pub/Sub was introduced) and any wire-compatible server (Valkey, KeyDB, Dragonfly). The adapter uses only PING and PUBLISH over RESP2 — no version-gated commands. ACL-style URL usernames (redis://user:pass@host) and native TLS require Redis 6.0+ (Apr 2020); password-only URLs work against any version.
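The `PSUBSCRIBE` patterns above are glob-style; Python's `fnmatch` is a close stand-in for checking what a pattern would catch (an approximation of Redis glob rules, sufficient for `*` patterns):

```python
from fnmatch import fnmatchcase

def redis_pattern_matches(pattern: str, channel: str) -> bool:
    # Redis PSUBSCRIBE patterns are glob-style. Note '*' is not
    # segment-aware: unlike MQTT/AMQP wildcards, 'edge.*' matches
    # every event at any depth, since '.' has no special meaning.
    return fnmatchcase(channel, pattern)
```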
MQTT¶
Topic = event type with . rewritten to / so MQTT segment wildcards (+, #) work as expected:
| Event type | MQTT topic |
|---|---|
| `edge.node.registered` | `edge/node/registered` |
| `edge.node.status_changed` | `edge/node/status_changed` |
| `edge.command_execution.completed` | `edge/command_execution/completed` |
| ... | ... |
Subscription examples:
```
edge/node/+                ← all node events
edge/node/status_changed   ← only status transitions
edge/command_execution/#   ← all command execution events
edge/#                     ← everything
```
Default publish QoS is 1 (at-least-once with broker ACK). Configurable globally via EVENT_BROKER_MQTT_QOS=0|1|2 — there is no per-event QoS. Consumers should dedup QoS-1 redeliveries on the envelope id; multi-admin setups additionally produce duplicate node.status_changed events with different ids, which need the semantic key (node_id, previous_status, status, time) instead.
The adapter publishes as MQTT 3.1.1 (proto_ver: :v4) — the lowest common denominator every modern broker accepts. v5 brokers (EMQX, HiveMQ, Mosquitto 2.x, etc.) accept a 3.1.1 publisher session transparently, while operators keep running v5 features (session expiry, shared subscriptions, retained-message TTL) on their subscribers and broker side as usual. Edge Core uses no v5-only publish features (user properties, content-type, response topics), so publishing as 3.1.1 costs nothing. Brokers capped at 3.1.1 (older AWS IoT Core accounts, LTS Mosquitto, embedded brokers) work without extra configuration.
Durability is the broker's and consumer's concern. MQTT QoS controls only the publisher↔broker↔subscriber delivery handshake — it does not make messages durable. Subscribers wanting offline queueing connect with Session Expiry Interval > 0 on their own connection.
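The topic rewrite and wildcard behavior can be sketched as follows. This is a minimal illustration of MQTT matching (`+` matches one level, `#` the remainder), not the broker's implementation:

```python
def mqtt_topic(event_type: str) -> str:
    # Documented rewrite: '.' becomes '/' so MQTT wildcards work per segment.
    return event_type.replace(".", "/")

def mqtt_filter_matches(topic_filter: str, topic: str) -> bool:
    # '+' matches exactly one level; '#' matches all remaining levels
    # (including none) and is valid only as the last level.
    f_levels, t_levels = topic_filter.split("/"), topic.split("/")
    for i, level in enumerate(f_levels):
        if level == "#":
            return True
        if i >= len(t_levels):
            return False
        if level != "+" and level != t_levels[i]:
            return False
    return len(f_levels) == len(t_levels)
```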
AWS SNS¶
Four SNS topics by domain — must be pre-provisioned in your AWS account, ARNs derived from EVENT_BROKER_AWS_SNS_TOPIC_ARN_PREFIX:
| Domain | Topic name suffix |
|---|---|
| Node + enrollment key events | edge-nodes-events |
| Command execution events | edge-commands-events |
| Self-update events | edge-self-updates-events |
| SSH events | edge-ssh-events |
SNS has no topic-name wildcards. Subscribers filter via subscription filter policies matched against message attributes. The adapter promotes two attributes on every publish: type and corename.
The body remains the full CloudEvents envelope JSON regardless — body and attributes carry the same routing fields, so consumers reading the body don't need to know about attributes.
Filter policy examples:
```
{"type": [{"prefix": "edge.node."}]}                            // all node events
{"type": ["edge.command_execution.completed"]}                  // only completed executions
{"corename": ["prod-us"]}                                       // only this core
{"type": [{"anything-but": "edge.command_execution.expired"}]}  // exclude expired
```
Durability is the subscriber's concern. SNS itself doesn't store messages. Subscribers buy durability by being SQS queues (the standard SNS+SQS fan-out pattern), Lambda functions, or HTTPS endpoints with their own retention.
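A consumer-side sketch of how the example policies evaluate against the promoted attributes. This covers only the operators shown above (exact string, prefix, anything-but), not the full SNS filter-policy grammar:

```python
def sns_policy_matches(policy: dict, attributes: dict) -> bool:
    # Every attribute named in the policy must match at least one of its
    # conditions; a missing attribute never matches.
    for attr, conditions in policy.items():
        value = attributes.get(attr)
        if value is None:
            return False
        matched = False
        for cond in conditions:
            if isinstance(cond, str) and value == cond:
                matched = True
            elif isinstance(cond, dict) and "prefix" in cond and value.startswith(cond["prefix"]):
                matched = True
            elif isinstance(cond, dict) and "anything-but" in cond and value != cond["anything-but"]:
                matched = True
        if not matched:
            return False
    return True
```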
Google Cloud Pub/Sub¶
Four Pub/Sub topics by domain — must be pre-provisioned in your GCP project. The adapter constructs full resource names from EVENT_BROKER_GOOGLE_PUBSUB_PROJECT (+ optional EVENT_BROKER_GOOGLE_PUBSUB_TOPIC_ID_PREFIX):
| Domain | Topic ID |
|---|---|
| Node + enrollment key events | edge-nodes-events |
| Command execution events | edge-commands-events |
| Self-update events | edge-self-updates-events |
| SSH events | edge-ssh-events |
Pub/Sub has no topic-name wildcards. Subscribers filter via subscription filter expressions matched against message attributes. The adapter promotes two attributes on every publish: type and corename.
The body is the CloudEvents envelope JSON, base64-encoded inside the Pub/Sub data field (the wire format requires base64; client libraries auto-decode for subscribers).
Filter expression examples (Pub/Sub filtering syntax):
```
hasPrefix(attributes.type, "edge.node.")                    # all node events
attributes.type = "edge.command_execution.completed"        # only completed executions
attributes.corename = "prod-us"                             # only this core
NOT (attributes.type = "edge.command_execution.expired")    # exclude expired
attributes.corename = "prod-us" AND hasPrefix(attributes.type, "edge.node.")
```
Filters are set on subscription creation and cannot be changed later — to change a filter, recreate the subscription.
Durability is built into the subscription. Pub/Sub buffers messages per subscription (default 7-day retention, max 31) until the subscriber ACKs them. This is closer to SNS+SQS combined than pure SNS — durability is on by default once a subscription exists. If no subscription exists when Edge Core publishes, the message is dropped (same as SNS without subscribers).
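A sketch of the base64 step, showing that the decoded `data` field is the plain CloudEvents envelope (client libraries normally perform this decode before your callback runs):

```python
import base64
import json

def decode_pubsub_message(message: dict) -> dict:
    # The Pub/Sub wire format carries the payload base64-encoded in `data`;
    # decode it back into the CloudEvents envelope dict.
    return json.loads(base64.b64decode(message["data"]))

# Round-trip demo with an illustrative envelope.
envelope = {"type": "edge.node.registered", "corename": "prod-us", "data": {"node_id": "abc-123"}}
wire = {
    "data": base64.b64encode(json.dumps(envelope).encode()).decode(),
    "attributes": {"type": envelope["type"], "corename": envelope["corename"]},
}
```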
Event Types¶
Node Events¶
All node events share the same data shape unless noted.
| Type | NATS subject / RabbitMQ routing key | Description |
|---|---|---|
| `edge.node.registered` | `edge.node.registered` | First-time enrollment — new `node_id` seen for the first time |
| `edge.node.reregistered` | `edge.node.reregistered` | Re-enrollment — existing node came back (reboot, redeploy, etc.) |
| `edge.node.version_changed` | `edge.node.version_changed` | Fires alongside `reregistered` when the reported version differs from the stored one |
| `edge.node.status_changed` | `edge.node.status_changed` | Health transition: healthy ↔ unhealthy ↔ unreachable |
| `edge.node.cluster_changed` | `edge.node.cluster_changed` | Node moved to a different cluster |
| `edge.node.update_triggered` | `edge.node.update_triggered` | Self-update signal successfully sent to this node's Watchtower |
Node data schema:
```json
{
  "node_id": "abc-123",
  "cluster_name": "prod",
  "status": "healthy",
  "version": "1.2.0",
  "id_type": "hostname",
  "http_port": 44000,
  "ssh_port": 40022,
  "host_metrics_port": 9100,
  "wireguard_metrics_port": 9101,
  "http_proxy_port": 44001,
  "socks5_proxy_port": 44002,
  "self_update_enabled": true,
  "last_seen_at": "2026-04-14T10:00:00Z",
  "inserted_at": "2026-04-14T09:00:00Z",
  "updated_at": "2026-04-14T10:00:00Z"
}
```
Extra fields by event type:
| Event | Extra fields |
|---|---|
| `edge.node.status_changed` | `"previous_status": "healthy"` |
| `edge.node.version_changed` | `"previous_version": "1.1.0"` |
| `edge.node.cluster_changed` | `"previous_cluster_name": "staging"` |
| `edge.node.update_triggered` | `"self_update_request_id": "<uuid>"` |
Multi-admin note: edge.node.status_changed may fire from multiple admin instances for the same transition (the health check runs on every admin independently). Each instance publishes its own envelope with its own id, so dedup these on (node_id, previous_status, status, time) rather than on id.
Command Execution Events¶
All command execution events share the same data shape. output is always excluded — fetch via GET /api/v1/command_executions/:id if needed.
| Type | NATS subject | Description |
|---|---|---|
| `edge.command_execution.created` | `edge.command_execution.created` | Execution record created and queued (status: `pending`) |
| `edge.command_execution.sent` | `edge.command_execution.sent` | Admin delivered execution to agent, agent ACKed (status: `sent`) |
| `edge.command_execution.completed` | `edge.command_execution.completed` | Agent reported result — `exit_code` populated, consumer decides pass/fail |
| `edge.command_execution.cancelled` | `edge.command_execution.cancelled` | Explicit cancel or agent received SIGTERM (`exit_code: 143`) |
| `edge.command_execution.expired` | `edge.command_execution.expired` | Swept as stale before running (status: `expired`) |
| `edge.command_execution.pruned` | `edge.command_execution.pruned` | Reaped by background pruning worker — only async deletion path |
Command execution data schema:
```json
{
  "command_execution_id": "cmdexec-abc123",
  "command_id": "cmd-xyz789",
  "node_id": "node-abc123",
  "cluster_name": "prod",
  "command_text": "systemctl restart app",
  "timeout": 30000,
  "status": "completed",
  "exit_code": 0,
  "target_all": false,
  "expired_at": null,
  "sent_at": "2026-04-14T10:00:01Z",
  "completed_at": "2026-04-14T10:00:03Z",
  "cancelled_at": null,
  "inserted_at": "2026-04-14T10:00:00Z",
  "updated_at": "2026-04-14T10:00:03Z"
}
```
Notes:
- `timeout` is in milliseconds (e.g. `30000` = 30s)
- `exit_code` is `null` until `completed` or `cancelled`; `143` on SIGTERM cancel
- Null fields are included explicitly as `null`, not omitted
- `output` is never included — unbounded size
State per event:
| Event | `status` | `exit_code` | `sent_at` | `completed_at` | `cancelled_at` |
|---|---|---|---|---|---|
| `created` | `pending` | `null` | `null` | `null` | `null` |
| `sent` | `sent` | `null` | populated | `null` | `null` |
| `completed` | `completed` | integer | populated | populated | `null` |
| `cancelled` | `cancelled` | `143` or int | populated or `null` | `null` | populated |
| `expired` | `expired` | `null` | `null` | `null` | `null` |
| `pruned` | terminal at deletion | as recorded | as recorded | as recorded | as recorded |
pruned semantics: fired by the background pruning worker when a finalised execution is deleted from the DB after the retention window. The snapshot reflects the row's terminal state at deletion time (whatever completed/cancelled/expired left it in). pruned is the only async deletion path — cascade-from-command-delete is sync and does not fire events.
Self-Update Request Events¶
| Type | NATS subject / RabbitMQ routing key | Description |
|---|---|---|
| `edge.self_update_request.completed` | `edge.self_update_request.completed` | Batch finished — summary populated |
Self-update request data schema:
```json
{
  "self_update_request_id": "selfupd-abc123",
  "status": "completed",
  "targeting": {
    "type": "clusters",
    "cluster_filters": {},
    "node_filters": { "version": "1.1.*" }
  },
  "summary": {
    "total": 10,
    "triggered": 9,
    "failed": 1
  },
  "inserted_at": "2026-04-14T10:00:00Z",
  "updated_at": "2026-04-14T10:00:05Z"
}
```
Notes:
- `summary` is populated when the batch completes
- `targeting.type` is one of `"all"`, `"nodes"`, `"clusters"`
Enrollment Key Events¶
| Type | NATS subject / RabbitMQ routing key | Description |
|---|---|---|
| `edge.enrollment_key.verified` | `edge.enrollment_key.verified` | Agent attempted to enroll using a key (success or failure) |
Enrollment key verified data schema:
```json
{
  "enrollment_key_id": "enrkey-abc123",
  "cluster_name": "prod",
  "name": "prod rollout",
  "uses_remaining": 4,
  "result": "verified",
  "verified_at": "2026-04-14T10:00:00Z"
}
```
Notes:
- `result` is one of `"verified"`, `"invalid_key"`, `"key_expired"`, `"key_spent"`, `"node_limit_reached"`
- On `"invalid_key"` the agent presented a blob that does not match any DB row — `enrollment_key_id`, `cluster_name`, `name`, and `uses_remaining` are all `null`. The event still fires (failed enrollment attempts are real security signal — port scanners, credential stuffing, and expired-key rotation all surface here)
- `name` is an optional human-readable label for the key (display only). `null` when the key was created without one
- The actual key blob is never included in the event — it is a credential
- `uses_remaining` reflects state after this attempt; for unlimited keys it is `null`
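As an illustration of consuming these events for security monitoring, a hypothetical classifier over the `data` payload (the labels and routing here are our own, not part of the event schema):

```python
def classify_enrollment_event(data: dict):
    # Route an edge.enrollment_key.verified payload for SIEM-style alerting.
    if data["result"] == "verified":
        return None  # successful enrollment, nothing to alert on
    if data["result"] == "invalid_key":
        # All identifying fields are null: the key blob matched no DB row.
        return "alert: unknown-key enrollment attempt"
    return "warn: enrollment failed (%s) for key %s" % (data["result"], data["enrollment_key_id"])
```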
SSH Username Events¶
| Type | NATS subject / RabbitMQ routing key | Description |
|---|---|---|
| `edge.ssh_username.verified` | `edge.ssh_username.verified` | Agent verified an SSH credential against admin (success/failure) |
SSH username verified data schema:
```json
{
  "ssh_username_id": "sshuser-abc123",
  "node_id": "node-abc123",
  "cluster_name": "prod",
  "username": "deploy",
  "auth_method": "public_key",
  "result": "success",
  "verified_at": "2026-04-14T10:00:00Z"
}
```
Notes:
- `auth_method` is one of `"password"`, `"public_key"`, `"unknown"` (`unknown` when the username doesn't exist for the node — admin can't tell which method the agent attempted)
- `result` is one of `"success"`, `"failure"`
- `username` is what the agent attempted — populated even when the username doesn't exist for that node
- When no SSH username row matches the `(node_id, username)` pair, `ssh_username_id` and `cluster_name` are `null`. The event still fires — failed attempts against missing usernames are real SIEM signal (brute-force, credential stuffing patterns)
- Password hashes and public-key strings are never included in the event
Schema Principles¶
- Every event carries a full object snapshot in `data` — same fields regardless of event type. Consumers read what they need, ignore the rest.
- Transition events add `previous_*` fields alongside the snapshot — the previous value cannot be derived from the snapshot alone.
- Internal/secret fields never appear: `api_token`, `proxy_password`, `netmaker_host_id`.
- Null fields are always included explicitly as `null`, never omitted.
Semantics¶
Edge Core publishes accurately regardless of broker. Durability, replay, and retention are the broker's and consumer's responsibility.
- NATS JetStream / Kafka — durable append-only log. Consumers can replay from an offset (JetStream consumer position, Kafka consumer group offset). Multiple independent consumers at different positions.
- NATS pub/sub — fire-and-forget. Messages are delivered to active subscribers only; missed messages are gone.
- RabbitMQ — delivery semantics depend on consumer queue configuration. Durable queue = messages survive broker restart; transient queue = live-only. Core always publishes with `persistent: true`.
- Redis — pure pub/sub (`PUBLISH`/`SUBSCRIBE`). No queue, no persistence, no replay. Messages go only to currently connected subscribers. If no subscriber is connected, the message is gone.
- MQTT — pub/sub. QoS 0/1/2 governs only the delivery handshake, not durability. The broker itself doesn't retain history (no replay, no consumer offsets). Subscribers wanting offline queueing connect with persistent sessions; subscribers wanting last-message-on-topic semantics rely on broker retained messages (Edge Core does not publish with `retain=true`).
- AWS SNS — fan-out pub/sub. SNS itself stores nothing — once delivered to subscribers (or delivery is exhausted), the message is gone. Durability is the subscriber's responsibility: subscribe an SQS queue for replay (SQS retains up to 14 days), or accept fire-and-forget for Lambda/HTTPS subscribers. SNS retries delivery to its own subscribers on failure (with exponential backoff) but never holds messages for late subscribers.
- Google Cloud Pub/Sub — fan-out pub/sub with per-subscription buffering built in. Each subscription holds un-ACKed messages for its configured retention (default 7 days, max 31), redelivering until ACKed. Closer to SNS+SQS combined than pure SNS — subscribers get durability for free without standing up a separate queue. If no subscription exists at publish time, the message is dropped (same as SNS without subscribers).
In all cases: each publish is a full snapshot, not a diff. If events are missed, the next event is still self-contained.
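Because every event is a self-contained snapshot, a broker-agnostic consumer core can be as small as parse-and-dispatch. A sketch (the handler table and output format are hypothetical):

```python
import json

def handle_event(raw: bytes, handlers: dict):
    # Parse the CloudEvents envelope and dispatch on `type`. Unknown types
    # are safe to skip: snapshots mean there is no diff chain to break.
    envelope = json.loads(raw)
    handler = handlers.get(envelope["type"])
    if handler is None:
        return None
    return handler(envelope["data"])

# Hypothetical handler table for one event type.
handlers = {
    "edge.node.status_changed": lambda data: f"{data['node_id']}: {data['previous_status']} -> {data['status']}",
}
raw = json.dumps({
    "specversion": "1.0",
    "type": "edge.node.status_changed",
    "data": {"node_id": "abc-123", "previous_status": "healthy", "status": "unreachable"},
}).encode()
```

The same `handle_event` works whether `raw` came from a NATS message, a Kafka record value, an AMQP delivery body, or a decoded Pub/Sub payload.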
Spec Files¶
| File | Description |
|---|---|
| `docs/admin-asyncapi-v0.2.0.md` | This document |
| `docs/admin-asyncapi-v0.2.0.json` | AsyncAPI 3.1.0 JSON spec (download from `/api/asyncapi`) |