Edge Core AsyncAPI — v0.2.0

Event schema reference for all lifecycle events published by Edge Admin.

Interactive viewer: /asyncdoc on a running admin. Raw spec: GET /api/asyncapi.


Overview

Edge Admin publishes lifecycle events to a configured message broker (NATS, Kafka/Redpanda, AMQP 0-9-1 (RabbitMQ-compatible), Redis, MQTT, AWS SNS, or Google Cloud Pub/Sub). All events follow the CloudEvents 1.0 spec. Edge Admin publishes and forgets — it has no knowledge of consumers.

Event Envelope

Every event is wrapped in a CloudEvents envelope:

{
  "specversion": "1.0",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "source": "https://github.com/wenet-ec/edge-core",
  "type": "edge.node.registered",
  "time": "2026-04-14T10:00:00Z",
  "datacontenttype": "application/json",
  "corename": "prod-us",
  "data": { ... }
}

| Field | Description |
| --- | --- |
| specversion | Always "1.0" |
| id | UUID v4, unique per publish. Useful for deduplicating redeliveries of the same publish (broker retries). Not useful for semantic dedup of node.status_changed duplicates, which carry different ids; use (node_id, previous_status, status, time) for that. |
| source | Always "https://github.com/wenet-ec/edge-core" |
| type | Event type. Doubles as NATS subject, RabbitMQ routing key, Redis channel, and MQTT topic (with . rewritten to / for MQTT; see tables below) |
| time | When the state change happened in admin (ISO 8601) |
| datacontenttype | Always "application/json" |
| corename | CloudEvents extension. Identifies the publishing core instance. Set via CORE_NAME env var (default: "default") |
| data | Full object snapshot at moment of event (see schemas below) |
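
A consumer can sanity-check the envelope before touching data. The following is a minimal Python sketch, not part of Edge Core: the helper name parse_envelope is hypothetical, and the field list is simply the one documented above.

```python
import json

# Fields listed in the envelope table; corename is a CloudEvents extension.
ENVELOPE_FIELDS = ("specversion", "id", "source", "type", "time",
                   "datacontenttype", "corename", "data")


def parse_envelope(raw):
    """Decode one event (str or bytes) and check the documented fields."""
    envelope = json.loads(raw)
    missing = [name for name in ENVELOPE_FIELDS if name not in envelope]
    if missing:
        raise ValueError(f"envelope is missing fields: {missing}")
    if envelope["specversion"] != "1.0":
        raise ValueError(f"unexpected specversion: {envelope['specversion']!r}")
    return envelope


sample = json.dumps({
    "specversion": "1.0",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "source": "https://github.com/wenet-ec/edge-core",
    "type": "edge.node.registered",
    "time": "2026-04-14T10:00:00Z",
    "datacontenttype": "application/json",
    "corename": "prod-us",
    "data": {"node_id": "abc-123"},
})
envelope = parse_envelope(sample)
```

Rejecting malformed envelopes early keeps the rest of the consumer free of defensive checks.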

Subjects / Topics

NATS

The type value is also the NATS subject. Subscription examples:

edge.node.>                       ← all node events
edge.node.status_changed          ← only status transitions (server-side filter)
edge.command_execution.completed  ← only completed executions
edge.>                            ← everything
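
To check locally which events a subscription subject would receive, the wildcard rules can be reproduced in a few lines. This is an illustrative Python sketch of NATS subject matching (* matches exactly one token, > matches one or more trailing tokens); the function name is hypothetical and no NATS client is involved.

```python
def nats_subject_matches(pattern, subject):
    """Return True if a NATS subscription pattern would receive the subject."""
    pattern_tokens = pattern.split(".")
    subject_tokens = subject.split(".")
    for index, token in enumerate(pattern_tokens):
        if token == ">":
            # ">" is only valid as the last token and needs at least one
            # remaining subject token to match.
            return index == len(pattern_tokens) - 1 and len(subject_tokens) > index
        if index >= len(subject_tokens):
            return False
        if token != "*" and token != subject_tokens[index]:
            return False
    return len(pattern_tokens) == len(subject_tokens)


node_events = nats_subject_matches("edge.node.>", "edge.node.registered")
```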

JetStream mode (EVENT_BROKER_NATS_JETSTREAM=true): durable streams (one per domain) are auto-created on startup:

Stream: EDGE_NODES_EVENTS          captures: edge.node.> + edge.enrollment_key.>
Stream: EDGE_COMMANDS_EVENTS       captures: edge.command_execution.>
Stream: EDGE_SELF_UPDATES_EVENTS   captures: edge.self_update_request.>
Stream: EDGE_SSH_EVENTS            captures: edge.ssh_username.>

Retention is configured on the NATS server, not by Edge Core.

Pub/sub mode (default): messages are delivered to active subscribers only — no persistence. Missed messages are gone.

Server version floor: pub/sub mode works against any NATS server (core PUB stable since 1.x); JetStream mode requires NATS 2.2+ (March 2021). The adapter uses only baseline stream-create fields (name, subjects, storage), no version-gated extensions.

Kafka / Redpanda

Four topics, one per domain:

| Topic | Partition key |
| --- | --- |
| edge-nodes-events | node_id (or enrollment_key_id for enrollment events) |
| edge-commands-events | command_execution_id |
| edge-self-updates-events | self_update_request_id |
| edge-ssh-events | node_id (verifications partition by the node attempting auth) |

Partition key ensures ordering per entity, parallel across entities. Filter by event type using the type field in the envelope.
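
The topic and key layout above can be written down as a lookup, which is handy when a consumer wants to assert that a record arrived on the topic it expects. This is a hypothetical consumer-side helper in Python, not adapter code; only the topic names and key fields come from the table.

```python
# Event type prefix -> (topic, data field used as partition key).
ROUTES = {
    "edge.node.": ("edge-nodes-events", "node_id"),
    "edge.enrollment_key.": ("edge-nodes-events", "enrollment_key_id"),
    "edge.command_execution.": ("edge-commands-events", "command_execution_id"),
    "edge.self_update_request.": ("edge-self-updates-events", "self_update_request_id"),
    "edge.ssh_username.": ("edge-ssh-events", "node_id"),
}


def expected_route(envelope):
    """Return (topic, partition key value) for a decoded envelope."""
    for prefix, (topic, key_field) in ROUTES.items():
        if envelope["type"].startswith(prefix):
            return topic, envelope["data"].get(key_field)
    raise ValueError(f"unknown event type: {envelope['type']!r}")


def only_type(envelopes, event_type):
    """Client-side filter on the envelope type field."""
    return [e for e in envelopes if e["type"] == event_type]
```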

Server version floor: Kafka 0.10+ (released Feb 2017) and any wire-compatible broker (Redpanda, Confluent Cloud, Aiven, AWS MSK, Azure Event Hubs, Upstash, etc.). The brod client auto-negotiates the per-API protocol version on connect.

AMQP 0-9-1 (RabbitMQ-compatible)

Adapter id: amqp091 (alias: rabbitmq). Works against any AMQP 0-9-1 broker — RabbitMQ, LavinMQ, AmazonMQ for RabbitMQ, CloudAMQP.

All events are published to a single durable topic exchange: edge.events. The routing key is the event type (e.g. edge.node.registered).

Binding examples:

edge.node.*                ← all node events
edge.node.status_changed   ← only status transitions
edge.command_execution.#   ← all command execution events
edge.#                     ← everything
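
AMQP topic bindings differ from NATS subjects in one detail worth testing: * matches exactly one word, while # matches zero or more. The sketch below reproduces that rule in plain Python so binding keys can be checked without a broker; the function name is hypothetical.

```python
def amqp_binding_matches(binding_key, routing_key):
    """Return True if a topic-exchange binding key matches the routing key."""

    def match(binding_words, routing_words):
        if not binding_words:
            return not routing_words
        head, rest = binding_words[0], binding_words[1:]
        if head == "#":
            # "#" may absorb zero or more words; try every split point.
            return any(match(rest, routing_words[i:])
                       for i in range(len(routing_words) + 1))
        if not routing_words:
            return False
        return head in ("*", routing_words[0]) and match(rest, routing_words[1:])

    return match(binding_key.split("."), routing_key.split("."))


all_node_events = amqp_binding_matches("edge.node.*", "edge.node.registered")
```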

Consumer queue durability is the consumer's choice — bind a durable queue to persist messages across restarts, or a transient queue for live-only consumption. Edge Core publishes with persistent: true (messages written to disk before broker ACKs).

Redis

Channel = event type (e.g. edge.node.registered). Subscribe using SUBSCRIBE for exact channels or PSUBSCRIBE for wildcard patterns:

SUBSCRIBE edge.node.registered       ← exact channel
PSUBSCRIBE edge.node.*               ← all node events
PSUBSCRIBE edge.*                    ← everything

No persistence or replay — messages are delivered to currently connected subscribers only. If no subscriber is connected when Core publishes, the message is gone. Pick Redis only when consumers are always-on and loss is acceptable.

Server version floor: Redis 2.0+ (Aug 2010, when Pub/Sub was introduced) and any wire-compatible server (Valkey, KeyDB, Dragonfly). The adapter uses only PING and PUBLISH over RESP2 — no version-gated commands. ACL-style URL usernames (redis://user:pass@host) and native TLS require Redis 6.0+ (Apr 2020); password-only URLs work against any version.

MQTT

Topic = event type with . rewritten to / so MQTT segment wildcards (+, #) work as expected:

| Event type | MQTT topic |
| --- | --- |
| edge.node.registered | edge/node/registered |
| edge.node.status_changed | edge/node/status_changed |
| edge.command_execution.completed | edge/command_execution/completed |
| ... | ... |

Subscription examples:

edge/node/+                ← all node events
edge/node/status_changed   ← only status transitions
edge/command_execution/#   ← all command execution events
edge/#                     ← everything

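Default publish QoS is 1 (at-least-once with broker ACK). It is configurable globally via EVENT_BROKER_MQTT_QOS=0|1|2; there is no per-event QoS. Consumers should dedup on envelope id regardless, because at-least-once delivery can redeliver the same publish. Multi-admin setups additionally produce duplicate node.status_changed events from independent health checkers; those carry different ids, so dedup them on (node_id, previous_status, status, time) instead.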
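
Because QoS 1 is at-least-once, the same publish can arrive more than once with the same envelope id. One way to drop such redeliveries is a bounded set of recently seen ids. The class below is a minimal Python sketch; its name and the capacity default are assumptions, and a real consumer would size the window to its own redelivery horizon.

```python
from collections import OrderedDict


class SeenIds:
    """Remember the most recent envelope ids, evicting the oldest first."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self._ids = OrderedDict()

    def first_time(self, event_id):
        """Return True the first time an id is seen, False on a repeat."""
        if event_id in self._ids:
            self._ids.move_to_end(event_id)
            return False
        self._ids[event_id] = True
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # drop the oldest id
        return True


seen = SeenIds(capacity=2)
```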

The adapter publishes as MQTT 3.1.1 (proto_ver: :v4) — the lowest common denominator every modern broker accepts. v5 brokers (EMQX, HiveMQ, Mosquitto 2.x, etc.) downgrade our publisher session to 3.1.1 transparently while operators run v5 features (session expiry, shared subs, retained-message TTL) on their subscribers and broker-side as usual. We don't use any v5-only publish features (user properties, content-type, response topics), so 3.1.1 on the way out is free. Targets capped at 3.1.1 (older AWS IoT Core accounts, LTS Mosquitto, embedded brokers) work without configuration.

Durability is the broker's and consumer's concern. MQTT QoS controls only the publisher↔broker↔subscriber delivery handshake — it does not make messages durable. Subscribers wanting offline queueing connect with Session Expiry Interval > 0 on their own connection.

AWS SNS

Four SNS topics by domain — must be pre-provisioned in your AWS account, ARNs derived from EVENT_BROKER_AWS_SNS_TOPIC_ARN_PREFIX:

| Domain | Topic name suffix |
| --- | --- |
| Node + enrollment key events | edge-nodes-events |
| Command execution events | edge-commands-events |
| Self-update events | edge-self-updates-events |
| SSH events | edge-ssh-events |

SNS has no topic-name wildcards. Subscribers filter via subscription filter policies matched against message attributes. The adapter promotes two attributes on every publish:

type      = "edge.node.status_changed"
corename  = "prod-us"

The body remains the full CloudEvents envelope JSON regardless — body and attributes carry the same routing fields, so consumers reading the body don't need to know about attributes.

Filter policy examples:

{"type": [{"prefix": "edge.node."}]}                       // all node events
{"type": ["edge.command_execution.completed"]}             // only completed executions
{"corename": ["prod-us"]}                                  // only this core
{"type": [{"anything-but": "edge.command_execution.expired"}]}  // exclude expired

Durability is the subscriber's concern. SNS itself doesn't store messages. Subscribers buy durability by being SQS queues (the standard SNS+SQS fan-out pattern), Lambda functions, or HTTPS endpoints with their own retention.
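
The filter policy examples above can be dry-run against sample attributes before they are attached to a subscription. The following Python sketch approximates SNS matching for only the three forms used in this document (exact string, prefix, anything-but); it is not the SNS implementation, and the function names are hypothetical.

```python
def _condition_matches(condition, value):
    """Evaluate one policy condition against a string attribute value."""
    if isinstance(condition, str):
        return value == condition
    if "prefix" in condition:
        return value.startswith(condition["prefix"])
    if "anything-but" in condition:
        excluded = condition["anything-but"]
        excluded = [excluded] if isinstance(excluded, str) else excluded
        return value not in excluded
    raise ValueError(f"unsupported condition: {condition!r}")


def matches_filter_policy(policy, attributes):
    """Keys are AND-ed; the conditions listed under one key are OR-ed."""
    for name, conditions in policy.items():
        value = attributes.get(name)
        if value is None:
            return False
        if not any(_condition_matches(c, value) for c in conditions):
            return False
    return True


attributes = {"type": "edge.node.status_changed", "corename": "prod-us"}
```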

Google Cloud Pub/Sub

Four Pub/Sub topics by domain — must be pre-provisioned in your GCP project. The adapter constructs full resource names from EVENT_BROKER_GOOGLE_PUBSUB_PROJECT (+ optional EVENT_BROKER_GOOGLE_PUBSUB_TOPIC_ID_PREFIX):

| Domain | Topic ID |
| --- | --- |
| Node + enrollment key events | edge-nodes-events |
| Command execution events | edge-commands-events |
| Self-update events | edge-self-updates-events |
| SSH events | edge-ssh-events |

Pub/Sub has no topic-name wildcards. Subscribers filter via subscription filter expressions matched against message attributes. The adapter promotes two attributes on every publish:

type      = "edge.node.status_changed"
corename  = "prod-us"

The body is the CloudEvents envelope JSON, base64-encoded inside the Pub/Sub data field (the wire format requires base64; client libraries auto-decode for subscribers).
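
Client libraries decode the payload automatically, but a push endpoint or a raw REST pull sees the base64 text. A minimal Python sketch, assuming the message arrives as a dict with data and attributes keys (the shape of a Pub/Sub REST message); the function name is hypothetical.

```python
import base64
import json


def decode_pubsub_message(message):
    """Return the CloudEvents envelope carried in a raw Pub/Sub message."""
    body = base64.b64decode(message["data"])
    return json.loads(body)


# Build a message the way it would look on the wire, then decode it.
envelope_json = json.dumps({"type": "edge.node.registered",
                            "data": {"node_id": "abc-123"}})
message = {
    "data": base64.b64encode(envelope_json.encode("utf-8")).decode("ascii"),
    "attributes": {"type": "edge.node.registered", "corename": "prod-us"},
}
decoded = decode_pubsub_message(message)
```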

Filter expression examples (Pub/Sub filtering syntax):

hasPrefix(attributes.type, "edge.node.")                                      # all node events
attributes.type = "edge.command_execution.completed"                          # only completed executions
attributes.corename = "prod-us"                                               # only this core
NOT (attributes.type = "edge.command_execution.expired")                      # exclude expired
attributes.corename = "prod-us" AND hasPrefix(attributes.type, "edge.node.")

Filters are set on subscription creation and cannot be changed later — to change a filter, recreate the subscription.

Durability is built into the subscription. Pub/Sub buffers messages per subscription (default 7-day retention, max 31) until the subscriber ACKs them. This is closer to SNS+SQS combined than pure SNS — durability is on by default once a subscription exists. If no subscription exists when Edge Core publishes, the message is dropped (same as SNS without subscribers).


Event Types

Node Events

All node events share the same data shape unless noted.

| Type | NATS subject / RabbitMQ routing key | Description |
| --- | --- | --- |
| edge.node.registered | edge.node.registered | First-time enrollment: new node_id seen for the first time |
| edge.node.reregistered | edge.node.reregistered | Re-enrollment: existing node came back (reboot, redeploy, etc.) |
| edge.node.version_changed | edge.node.version_changed | Fires alongside reregistered when reported version differs from stored |
| edge.node.status_changed | edge.node.status_changed | Health transition between healthy, unhealthy, and unreachable |
| edge.node.cluster_changed | edge.node.cluster_changed | Node moved to a different cluster |
| edge.node.update_triggered | edge.node.update_triggered | Self-update signal successfully sent to this node's Watchtower |

Node data schema:

{
  "node_id": "abc-123",
  "cluster_name": "prod",
  "status": "healthy",
  "version": "1.2.0",
  "id_type": "hostname",
  "http_port": 44000,
  "ssh_port": 40022,
  "host_metrics_port": 9100,
  "wireguard_metrics_port": 9101,
  "http_proxy_port": 44001,
  "socks5_proxy_port": 44002,
  "self_update_enabled": true,
  "last_seen_at": "2026-04-14T10:00:00Z",
  "inserted_at": "2026-04-14T09:00:00Z",
  "updated_at": "2026-04-14T10:00:00Z"
}

Extra fields by event type:

| Event | Extra fields |
| --- | --- |
| edge.node.status_changed | "previous_status": "healthy" |
| edge.node.version_changed | "previous_version": "1.1.0" |
| edge.node.cluster_changed | "previous_cluster_name": "staging" |
| edge.node.update_triggered | "self_update_request_id": "<uuid>" |

Multi-admin note: edge.node.status_changed may fire from multiple admin instances for the same transition (the health check runs on every admin independently). Each publish gets its own envelope id, so dedup these on (node_id, previous_status, status, time) rather than on id.
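
The envelope field table recommends (node_id, previous_status, status, time) as the semantic key for these duplicates. A minimal Python sketch of that dedup, with hypothetical function names, operating on already-decoded envelopes:

```python
def status_change_key(envelope):
    """Semantic identity of a status transition, independent of envelope id."""
    data = envelope["data"]
    return (data["node_id"], data["previous_status"],
            data["status"], envelope["time"])


def drop_duplicate_transitions(envelopes):
    """Keep the first copy of each status transition; pass other events through."""
    seen = set()
    unique = []
    for envelope in envelopes:
        if envelope["type"] == "edge.node.status_changed":
            key = status_change_key(envelope)
            if key in seen:
                continue
            seen.add(key)
        unique.append(envelope)
    return unique
```

An in-memory set is enough for a sketch; a long-running consumer would bound it or keep the keys in a store with expiry.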


Command Execution Events

All command execution events share the same data shape. output is always excluded — fetch via GET /api/v1/command_executions/:id if needed.

| Type | NATS subject / RabbitMQ routing key | Description |
| --- | --- | --- |
| edge.command_execution.created | edge.command_execution.created | Execution record created and queued (status: pending) |
| edge.command_execution.sent | edge.command_execution.sent | Admin delivered execution to agent, agent ACKed (status: sent) |
| edge.command_execution.completed | edge.command_execution.completed | Agent reported result: exit_code populated, consumer decides pass/fail |
| edge.command_execution.cancelled | edge.command_execution.cancelled | Explicit cancel or agent received SIGTERM (exit_code: 143) |
| edge.command_execution.expired | edge.command_execution.expired | Swept as stale before running (status: expired) |
| edge.command_execution.pruned | edge.command_execution.pruned | Reaped by background pruning worker (the only async deletion path) |

Command execution data schema:

{
  "command_execution_id": "cmdexec-abc123",
  "command_id": "cmd-xyz789",
  "node_id": "node-abc123",
  "cluster_name": "prod",
  "command_text": "systemctl restart app",
  "timeout": 30000,
  "status": "completed",
  "exit_code": 0,
  "target_all": false,
  "expired_at": null,
  "sent_at": "2026-04-14T10:00:01Z",
  "completed_at": "2026-04-14T10:00:03Z",
  "cancelled_at": null,
  "inserted_at": "2026-04-14T10:00:00Z",
  "updated_at": "2026-04-14T10:00:03Z"
}

Notes:

  • timeout is in milliseconds (e.g. 30000 = 30s)
  • exit_code is null until completed or cancelled; 143 on SIGTERM cancel
  • Null fields are included explicitly as null, not omitted
  • output is never included — unbounded size

State per event:

| Event | status | exit_code | sent_at | completed_at | cancelled_at |
| --- | --- | --- | --- | --- | --- |
| created | pending | null | null | null | null |
| sent | sent | null | populated | null | null |
| completed | completed | integer | populated | populated | null |
| cancelled | cancelled | 143 or int | populated or null | null | populated |
| expired | expired | null | null | null | null |
| pruned | terminal at deletion | as recorded | as recorded | as recorded | as recorded |

pruned semantics: fired by the background pruning worker when a finalised execution is deleted from the DB after the retention window. The snapshot reflects the row's terminal state at deletion time (whatever completed/cancelled/expired left it in). pruned is the only async deletion path — cascade-from-command-delete is sync and does not fire events.
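
The state-per-event table lends itself to a consumer-side consistency check. The sketch below transcribes it into Python and reports fields whose null/populated state contradicts it; the names are hypothetical, and pruned is skipped because its snapshot is whatever the row held at deletion.

```python
# Per event: (status, exit_code, sent_at, completed_at, cancelled_at).
# True = populated, False = null, None = either ("populated or null").
EXPECTED_STATE = {
    "created":   ("pending",   False, False, False, False),
    "sent":      ("sent",      False, True,  False, False),
    "completed": ("completed", True,  True,  True,  False),
    "cancelled": ("cancelled", True,  None,  False, True),
    "expired":   ("expired",   False, False, False, False),
}
FIELDS = ("exit_code", "sent_at", "completed_at", "cancelled_at")


def state_problems(envelope):
    """Return the names of fields that contradict the table (empty if fine)."""
    event = envelope["type"].rsplit(".", 1)[-1]
    if event not in EXPECTED_STATE:
        return []  # pruned: snapshot is "as recorded"
    status, *populated = EXPECTED_STATE[event]
    data = envelope["data"]
    problems = [] if data["status"] == status else ["status"]
    for field, want in zip(FIELDS, populated):
        if want is not None and (data[field] is not None) != want:
            problems.append(field)
    return problems
```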


Self-Update Request Events

| Type | NATS subject / RabbitMQ routing key | Description |
| --- | --- | --- |
| edge.self_update_request.completed | edge.self_update_request.completed | Batch finished, summary populated |

Self-update request data schema:

{
  "self_update_request_id": "selfupd-abc123",
  "status": "completed",
  "targeting": {
    "type": "clusters",
    "cluster_filters": {},
    "node_filters": { "version": "1.1.*" }
  },
  "summary": {
    "total": 10,
    "triggered": 9,
    "failed": 1
  },
  "inserted_at": "2026-04-14T10:00:00Z",
  "updated_at": "2026-04-14T10:00:05Z"
}

Notes:

  • summary is populated when the batch completes
  • targeting.type is one of "all", "nodes", "clusters"

Enrollment Key Events

| Type | NATS subject / RabbitMQ routing key | Description |
| --- | --- | --- |
| edge.enrollment_key.verified | edge.enrollment_key.verified | Agent attempted to enroll using a key (success or failure) |

Enrollment key verified data schema:

{
  "enrollment_key_id": "enrkey-abc123",
  "cluster_name": "prod",
  "name": "prod rollout",
  "uses_remaining": 4,
  "result": "verified",
  "verified_at": "2026-04-14T10:00:00Z"
}

Notes:

  • result is one of "verified", "invalid_key", "key_expired", "key_spent", "node_limit_reached"
  • On "invalid_key" the agent presented a blob that does not match any DB row — enrollment_key_id, cluster_name, name, and uses_remaining are all null. The event still fires (failed enrollment attempts are real security signal — port scanners, credential stuffing, expired-key rotation surfaces here)
  • name is an optional human-readable label for the key (display only). null when the key was created without one
  • The actual key blob is never included in the event — it is a credential
  • uses_remaining reflects state after this attempt; for unlimited keys it is null
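
Since several fields are null in specific cases, a consumer has to branch on result before formatting or indexing the payload. A small Python sketch of that branching, with a hypothetical function name and message wording:

```python
def summarize_enrollment_attempt(data):
    """Turn an edge.enrollment_key.verified payload into a log line."""
    result = data["result"]
    if result == "verified":
        remaining = data["uses_remaining"]
        # null uses_remaining means the key is unlimited
        uses = "unlimited" if remaining is None else str(remaining)
        return f"key {data['enrollment_key_id']} verified, uses remaining: {uses}"
    if result == "invalid_key":
        # No DB row matched, so the id, cluster, name and counter are all null.
        return "unknown key presented (no matching row)"
    return f"key {data['enrollment_key_id']} rejected: {result}"
```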

SSH Username Events

| Type | NATS subject / RabbitMQ routing key | Description |
| --- | --- | --- |
| edge.ssh_username.verified | edge.ssh_username.verified | Agent verified an SSH credential against admin (success/failure) |

SSH username verified data schema:

{
  "ssh_username_id": "sshuser-abc123",
  "node_id": "node-abc123",
  "cluster_name": "prod",
  "username": "deploy",
  "auth_method": "public_key",
  "result": "success",
  "verified_at": "2026-04-14T10:00:00Z"
}

Notes:

  • auth_method is one of "password", "public_key", "unknown" (unknown when the username doesn't exist for the node — admin can't tell which method the agent attempted)
  • result is one of "success", "failure"
  • username is what the agent attempted — populated even when the username doesn't exist for that node
  • When no SSH username row matches the (node_id, username) pair, ssh_username_id and cluster_name are null. The event still fires — failed attempts against missing usernames are real SIEM signal (brute-force, credential stuffing patterns)
  • Password hashes and public-key strings are never included in the event
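
Because failed attempts still carry node_id and username, a consumer can count them per pair to surface brute-force patterns. The following Python sketch shows one way to do that; the function names and the threshold of 5 are illustrative assumptions, not values defined by Edge Core.

```python
from collections import Counter


def count_failed_attempts(payloads):
    """Count failed SSH verifications per (node_id, username) pair."""
    failures = Counter()
    for data in payloads:
        if data["result"] == "failure":
            failures[(data["node_id"], data["username"])] += 1
    return failures


def suspicious_pairs(payloads, threshold=5):
    """Return pairs whose failure count reaches the (assumed) threshold."""
    counts = count_failed_attempts(payloads)
    return [pair for pair, total in counts.items() if total >= threshold]
```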

Schema Principles

  • Every event carries a full object snapshot in data, with the same fields for every event type within a domain. Consumers read what they need and ignore the rest.
  • Transition events add previous_* fields alongside the snapshot — the previous value cannot be derived from the snapshot alone.
  • Internal/secret fields never appear: api_token, proxy_password, netmaker_host_id.
  • Null fields are always included explicitly as null, never omitted.

Semantics

Edge Core publishes the same envelope and payload regardless of broker. Durability, replay, and retention are the broker's and consumer's responsibility.

  • NATS JetStream / Kafka — durable append-only log. Consumers can replay from an offset (JetStream consumer position, Kafka consumer group offset). Multiple independent consumers at different positions.
  • NATS pub/sub — fire-and-forget. Messages are delivered to active subscribers only; missed messages are gone.
  • RabbitMQ — delivery semantics depend on consumer queue configuration. Durable queue = messages survive broker restart; transient queue = live-only. Core always publishes with persistent: true.
  • Redis — pure pub/sub (PUBLISH/SUBSCRIBE). No queue, no persistence, no replay. Messages go only to currently connected subscribers. If no subscriber is connected, the message is gone.
  • MQTT — pub/sub. QoS 0/1/2 governs only the delivery handshake, not durability. The broker itself doesn't retain history (no replay, no consumer offsets). Subscribers wanting offline queueing connect with persistent sessions; subscribers wanting last-message-on-topic semantics rely on broker retained messages (Edge Core does not publish with retain=true).
  • AWS SNS — fan-out pub/sub. SNS itself stores nothing — once delivered to subscribers (or delivery is exhausted), the message is gone. Durability is the subscriber's responsibility: subscribe an SQS queue for replay (SQS retains up to 14 days), or accept fire-and-forget for Lambda/HTTPS subscribers. SNS retries delivery to its own subscribers on failure (with exponential backoff) but never holds messages for late subscribers.
  • Google Cloud Pub/Sub — fan-out pub/sub with per-subscription buffering built in. Each subscription holds un-ACKed messages for its configured retention (default 7 days, max 31), redelivering until ACKed. Closer to SNS+SQS combined than pure SNS — subscribers get durability for free without standing up a separate queue. If no subscription exists at publish time, the message is dropped (same as SNS without subscribers).

In all cases: each publish is a full snapshot, not a diff. If events are missed, the next event is still self-contained.
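
Because every event is a full snapshot, a consumer can keep current node state by overwriting on each event and using the envelope time to ignore late, out-of-order arrivals. A minimal Python sketch under those assumptions; the function names and the in-memory dict store are hypothetical.

```python
from datetime import datetime


def parse_time(value):
    """Parse the envelope time; the Z suffix is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def apply_node_event(store, envelope):
    """Keep the newest snapshot per node_id and ignore older events."""
    data = envelope["data"]
    when = parse_time(envelope["time"])
    current = store.get(data["node_id"])
    if current is None or when >= current[0]:
        store[data["node_id"]] = (when, data)
    return store
```

A missed event costs nothing here: the next snapshot for the same node simply replaces the stale entry.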


Spec Files

| File | Description |
| --- | --- |
| docs/admin-asyncapi-v0.2.0.md | This document |
| docs/admin-asyncapi-v0.2.0.json | AsyncAPI 3.1.0 JSON spec (download from /api/asyncapi) |