Config Sync Pipeline
This page describes the complete lifecycle of a configuration change — from a user clicking a button in the UI to the edge device applying the change and confirming it back to the orchestrator.
The Seven Steps
Every configuration change follows the same pipeline, regardless of whether it affects a dedicated edge, MTGE, or connector:
Step 1: Edit
An administrator makes a change in the Web UI — for example, adding a WireGuard peer, modifying an ACL rule, or updating an interface IP address. The API saves the new desired state to MySQL.
Step 2: Dirty Flag
The API marks the edge's configuration as "dirty" by updating its tracking record. This dirty flag is what causes the yellow pending indicator to appear in the UI. The flag includes a reason (e.g., "wireguard peer added") for debugging purposes.
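The dirty-flag update can be sketched as follows. This is a minimal illustration; the actual tracking record's schema and field names are assumptions, not the real ones.

```python
import time

def mark_dirty(tracking: dict, reason: str) -> dict:
    """Flag an edge's configuration as dirty and record why (hypothetical fields)."""
    tracking["dirty"] = True            # drives the yellow pending indicator in the UI
    tracking["dirty_reason"] = reason   # e.g. "wireguard peer added", for debugging
    tracking["dirty_since"] = time.time()
    return tracking

record = mark_dirty({"edge_serial": "EDGE-001", "dirty": False},
                    "wireguard peer added")
```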
Step 3: Batch Build
When a sync is triggered (manually or automatically), the API aggregates configuration from across 20+ database tables into a single batch message. This includes interfaces, WireGuard tunnels, routing, NAT rules, ACLs, service chain settings, and monitoring configuration.
The batch message contains:
- Sequence number — An incrementing counter that identifies this particular configuration version
- Config hash — A SHA-256 hash of the entire configuration, used for end-to-end verification
- Commands array — An ordered list of handler-specific configurations
Each command in the array specifies a topic (which handler should process it) and a body (the handler's configuration):
    commands: [
      { index: 0, topic: "interface", body: { ... } },
      { index: 1, topic: "wireguard", body: { ... } },
      { index: 2, topic: "static", body: { ... } },
      { index: 3, topic: "nat44-...", body: { ... } },
      { index: 4, topic: "acl", body: { ... } },
      ...
    ]
Commands are ordered by dependency — interfaces before tunnels, tunnels before routing, routing before NAT, and so on.
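A batch build can be sketched as below. The canonical-JSON hashing scheme and field names are assumptions for illustration; the real API aggregates from the database and may serialize differently.

```python
import hashlib
import json

def build_batch(sequence: int, commands: list) -> dict:
    """Assemble a batch message: indexed commands plus a SHA-256 config hash
    computed over a canonical JSON form, for end-to-end verification."""
    indexed = [{"index": i, **cmd} for i, cmd in enumerate(commands)]
    canonical = json.dumps(indexed, sort_keys=True, separators=(",", ":"))
    return {
        "sequence": sequence,                                    # incrementing version counter
        "config_hash": hashlib.sha256(canonical.encode()).hexdigest(),
        "commands": indexed,                                     # dependency-ordered
    }

batch = build_batch(42, [
    {"topic": "interface", "body": {"name": "eth0"}},
    {"topic": "wireguard", "body": {"peers": []}},
])
```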
Step 4: MQTT Delivery
The batch message is published to the device's MQTT topic. The topic varies by device type:
| Device Type | Topic |
|---|---|
| Dedicated Edge | VSR/{serial}/batch |
| MTGE (per tenant) | VSR/{serial}/batch/{tenantId} |
| Connector | VSR/{serial}/batch |
The EMQX broker delivers the message to the connected agent. If the agent is offline, the message is not queued — the agent will request its configuration when it next connects.
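The topic selection in the table above can be expressed as a small helper. The device-type strings here are illustrative.

```python
def batch_topic(device_type: str, serial: str, tenant_id: str = None) -> str:
    """Return the MQTT batch topic for a device, per the table above."""
    if device_type == "mtge":
        if tenant_id is None:
            raise ValueError("MTGE batches are published per tenant")
        return f"VSR/{serial}/batch/{tenant_id}"
    # Dedicated edges and connectors share the same topic shape.
    return f"VSR/{serial}/batch"
```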
Step 5: Apply
The edge agent receives the batch and applies it. How this works depends on the device type:
Dedicated Edge and MTGE — The agent uses the V3 Sync Coordinator, which applies configuration in eight dependency-ordered phases.
Each phase waits for its dependencies to complete before starting. If a newer configuration arrives while a sync is in progress, the coordinator cancels the in-flight sync and applies the newer one instead.
Within each phase, handlers write desired state to etcd (for Ligato-managed resources) or call the VPP Binary API directly (for WireGuard, NAT, and other advanced features).
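The supersede-on-newer behavior can be sketched as follows. The class, phase contents, and cancellation check are hypothetical, not the actual V3 Sync Coordinator implementation.

```python
class SyncCancelled(Exception):
    """Raised when an in-flight sync is superseded by a newer configuration."""

class Coordinator:
    def __init__(self):
        self.latest_seq = 0   # highest sequence number seen so far

    def submit(self, seq: int) -> None:
        # A newer batch arriving raises the high-water mark.
        self.latest_seq = max(self.latest_seq, seq)

    def apply(self, seq: int, phases: list) -> list:
        """Apply phases in dependency order, cancelling if a newer batch arrives."""
        self.submit(seq)
        applied = []
        for phase in phases:               # each phase waits for the prior one
            if seq < self.latest_seq:      # newer config arrived mid-sync
                raise SyncCancelled(f"seq {seq} superseded by {self.latest_seq}")
            applied.extend(phase)          # run this phase's handlers
        return applied
```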
Connector — The connector agent applies configuration using Linux networking tools directly:
| Handler | Tool |
|---|---|
| WireGuard | wg CLI (WireGuard tools) |
| Static routes | ip route |
| NAT | iptables masquerade rules |
| ACL | iptables filter rules |
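How a connector handler might translate desired state into Linux command invocations can be sketched as below. The exact flags and route structure are illustrative, not the connector agent's actual code.

```python
def static_route_cmds(routes: list) -> list:
    """Build `ip route` invocations for the static-route handler."""
    return [["ip", "route", "replace", r["prefix"], "via", r["next_hop"]]
            for r in routes]

def nat_masquerade_cmd(out_iface: str) -> list:
    """Build an iptables masquerade rule for the NAT handler."""
    return ["iptables", "-t", "nat", "-A", "POSTROUTING",
            "-o", out_iface, "-j", "MASQUERADE"]
```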
Step 6: Confirm
After applying the configuration, the agent sends a confirmation message back to the orchestrator over MQTT. The confirmation includes:
- Sequence number — Which configuration version was applied
- Config hash — The hash computed by the agent over the configuration it actually applied
- Status — Success or failure
- Applied commands — Count of successfully applied commands
- Failed commands — Count and details of any commands that failed
- VPP mode — Whether the edge is running in DPDK or AF_PACKET mode
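Assembling the confirmation can be sketched as follows. Field names and the canonical-JSON hashing are assumptions mirroring the batch-build sketch, not the agent's actual message format.

```python
import hashlib
import json

def build_confirmation(sequence: int, commands: list,
                       failed: list, vpp_mode: str) -> dict:
    """Build the confirmation the agent publishes after applying a batch.
    The hash covers what was actually applied, enabling end-to-end checks."""
    canonical = json.dumps(commands, sort_keys=True, separators=(",", ":"))
    return {
        "sequence": sequence,
        "config_hash": hashlib.sha256(canonical.encode()).hexdigest(),
        "status": "success" if not failed else "failure",
        "applied_commands": len(commands) - len(failed),
        "failed_commands": failed,
        "vpp_mode": vpp_mode,   # "dpdk" or "af_packet"
    }
```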
Step 7: Hash Verification
The orchestrator compares the hash in the confirmation against the hash it calculated when building the batch. If they match, the configuration is marked as Synced and the dirty flag is cleared. If they do not match, the configuration is marked as Failed.
This end-to-end hash verification gives strong assurance that the exact configuration defined in the UI is what is running on the device.
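The orchestrator-side comparison can be sketched as below. The state dictionary and its keys are illustrative stand-ins for the tracking record.

```python
import hmac

def verify_confirmation(expected_hash: str, confirmation: dict, state: dict) -> str:
    """Compare the agent's reported hash against the orchestrator's own,
    then update the sync state and dirty flag accordingly."""
    # compare_digest avoids timing-dependent string comparison
    if hmac.compare_digest(expected_hash, confirmation["config_hash"]):
        state["sync_state"] = "Synced"
        state["dirty"] = False     # clear the dirty flag
    else:
        state["sync_state"] = "Failed"
    return state["sync_state"]
```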
Dirty Flag Lifecycle
The dirty flag tracks the state of each device's configuration relative to the orchestrator's desired state:
| State | Meaning |
|---|---|
| Synced | The device is running the latest configuration. No pending changes. |
| Pending | Changes exist in the database that have not been pushed to the device. |
| Applying | A batch has been sent and the orchestrator is waiting for confirmation. |
| Failed | The device could not apply the configuration, or the hash did not match. |
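The lifecycle above can be summarized as a transition table. The event names here are illustrative labels for the triggers described in the pipeline steps.

```python
# (current state, event) -> next state; unknown pairs leave the state unchanged
TRANSITIONS = {
    ("Synced", "edit"): "Pending",           # Step 2: dirty flag set
    ("Pending", "push"): "Applying",         # Step 4: batch published
    ("Applying", "hash_match"): "Synced",    # Step 7: hashes agree
    ("Applying", "hash_mismatch"): "Failed", # Step 7: hashes differ
    ("Failed", "retry"): "Applying",         # manual Sync Now
}

def next_state(state: str, event: str) -> str:
    return TRANSITIONS.get((state, event), state)
```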
Stale Configuration Recovery
When an edge reports a sequence number that is older than the orchestrator expects, it means the edge is running stale configuration. This can happen after a device reboot, network outage, or if a previous sync was lost.
The recovery process:
- The orchestrator detects the old sequence number in the confirmation message
- It waits 15 seconds to allow VPP to stabilize after a restart
- It re-pushes the latest batch configuration automatically
No manual intervention is required — stale configuration is self-healing.
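The recovery logic can be sketched as follows. The function shape and injected `repush`/`sleep` hooks are illustrative; only the 15-second stabilization delay comes from the description above.

```python
import time

STABILIZE_DELAY_S = 15  # give VPP time to settle after a restart

def handle_confirmation_seq(reported_seq: int, expected_seq: int,
                            repush, sleep=time.sleep) -> bool:
    """If the edge reports an older sequence than expected, wait for VPP
    to stabilize, then re-push the latest batch. Returns True if re-pushed."""
    if reported_seq >= expected_seq:
        return False              # edge is current; nothing to do
    sleep(STABILIZE_DELAY_S)      # allow VPP to stabilize
    repush()                      # re-send the latest batch automatically
    return True
```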
Startup Config Request
When an edge agent starts up (after a reboot or container restart), it does not wait passively for configuration. Instead, it actively requests its configuration:
- The agent publishes a config request message
- The orchestrator receives it and pushes the latest batch configuration
- If the first request is not answered (e.g., the orchestrator is temporarily unreachable), the agent retries with exponential backoff
This ensures edges converge to the correct configuration as quickly as possible after any disruption.
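The request-and-retry loop can be sketched as below. The attempt count, base delay, and `publish` hook are assumptions; the source specifies only that retries use exponential backoff.

```python
def request_config(publish, max_attempts: int = 5,
                   base_delay_s: float = 1.0, sleep=None) -> bool:
    """Publish a config request on startup; retry with exponential backoff
    if unanswered. `publish` returns True once the orchestrator responds."""
    sleep = sleep or (lambda s: None)
    for attempt in range(max_attempts):
        if publish():
            return True
        sleep(base_delay_s * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    return False
```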
Device-Specific Differences
MTGE (Multi-Tenant Gateway)
MTGEs manage configuration per tenant:
- Each tenant's config is published to a separate MQTT topic: VSR/{serial}/batch/{tenantId}
- Dirty flags are tracked per tenant, not per device — changing one tenant's config does not trigger a sync for other tenants
- The agent applies configuration within the correct VRF context for each tenant
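Per-tenant dirty tracking can be sketched with a keyed flag map. The key shape and helper are hypothetical, not the actual schema.

```python
def mark_tenant_dirty(flags: dict, serial: str, tenant_id: str) -> dict:
    """Keying by (serial, tenant_id) means editing one tenant's config
    never marks another tenant on the same MTGE as dirty."""
    flags[(serial, tenant_id)] = True
    return flags

flags = mark_tenant_dirty({}, "EDGE-001", "tenant-a")
```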
Connector
Connectors have a simplified pipeline:
- Configuration state is embedded directly in the connectors database table (no separate state tracking table)
- Only four command topics: wireguard, static, nat_config, acl_config
- The agent uses Linux kernel networking instead of VPP — no etcd or Ligato layer
- Configuration is persisted to a local JSON file on the connector
Troubleshooting
Sync shows "Failed"
- Check device connectivity — Is the device online and reporting heartbeats? An offline device cannot receive configuration.
- Review edge logs — The agent logs will show which specific handler or command failed and why.
- Hash mismatch — If the hash does not match, it usually means the agent could not apply one or more commands. Look for partial application in the logs.
- Retry — After investigating, click Sync Now to reattempt.
Sync stuck on "Applying"
The orchestrator is waiting for a confirmation that never arrived. Possible causes:
- The MQTT connection was interrupted between the edge and broker
- The agent crashed during application
- The agent applied the config but the confirmation message was lost
In most cases, the next heartbeat or inform from the edge will reveal whether it is running the correct configuration. A manual sync retry usually resolves the issue.
Edge running old configuration after reboot
This is normally handled automatically by the startup config request mechanism. If the edge still shows stale config:
- Verify the edge is connected to MQTT (check heartbeats in the monitoring dashboard)
- Trigger a manual sync from the edge detail page
- If the edge is not connecting to MQTT, check its certificates and network connectivity
Phase timeout during application
On dedicated edges, if a handler in the sync coordinator takes too long, the phase times out. Common causes:
- VPP is still starting up and the Binary API is not yet available
- etcd is temporarily unreachable
- A handler is blocked waiting for an external resource
Check the agent logs for the specific phase and handler that timed out.