BGP & BFD Health
When BGP and BFD are enabled on a peering group, SecureLink monitors the health of each tunnel and routing session. Health status is displayed on the peering detail page for every member.
BGP Session Status
BGP (Border Gateway Protocol) manages dynamic route exchange between peering members. Each member's BGP session can be in one of the following states:
| State | Indicator | Description |
|---|---|---|
| Established | Green | BGP session is fully up and actively exchanging routes. This is the healthy state. |
| Active / Connect | Yellow | BGP is attempting to establish a session with the peer. This is normal during initial setup or after a disruption, but should not persist. |
| Idle | Red | BGP session is down. No route exchange is occurring. Requires investigation. |
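The state table above can be sketched as a small lookup, which may be useful when building an external dashboard or alert rule. This is an illustrative sketch only: the names `bgp_indicator` and `needs_attention`, and the 120-second grace period for transitional states, are assumptions and not part of SecureLink.

```python
from enum import Enum


class Indicator(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# State names follow the table above; the lookup is case-insensitive.
BGP_INDICATORS = {
    "established": Indicator.GREEN,
    "active": Indicator.YELLOW,
    "connect": Indicator.YELLOW,
    "idle": Indicator.RED,
}


def bgp_indicator(state: str) -> Indicator:
    """Return the indicator colour for a BGP state listed in the table."""
    key = state.strip().lower()
    if key not in BGP_INDICATORS:
        # States outside the documented table are rejected rather than guessed.
        raise ValueError(f"unknown BGP state: {state!r}")
    return BGP_INDICATORS[key]


def needs_attention(state: str, seconds_in_state: float,
                    grace_seconds: float = 120.0) -> bool:
    """Flag red states immediately and yellow states that persist.

    grace_seconds is a hypothetical threshold; tune it to your environment.
    """
    indicator = bgp_indicator(state)
    if indicator is Indicator.RED:
        return True
    if indicator is Indicator.YELLOW:
        return seconds_in_state > grace_seconds
    return False
```

For example, `needs_attention("Active", 10)` is false under these assumptions, while `needs_attention("Connect", 300)` is true because the transitional state has outlasted the grace period.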
BFD Session Status
BFD (Bidirectional Forwarding Detection) checks tunnel health at sub-second intervals. Each tunnel has its own BFD session:
| State | Indicator | Description |
|---|---|---|
| Up | Green | BFD session is active and the tunnel is healthy. |
| Init | Yellow | BFD is in the process of establishing the session with the remote peer. |
| Down | Red | BFD has detected a tunnel failure. If BGP is enabled, failover will be triggered. |
In dual tunnel configurations, each member has two BFD sessions -- one for the primary tunnel and one for the secondary tunnel. This applies to both WireGuard and IPSec peerings.
Member Health Overview
Each member in the peering group shows an aggregate health status that summarizes all of its BGP and BFD sessions:
| Health | Meaning |
|---|---|
| Healthy | All BGP sessions are Established and all BFD sessions are Up. |
| Degraded | Some sessions are down but at least one path is operational. Traffic is still flowing. |
| Down | All sessions are down. No peering traffic is flowing for this member. |
A Degraded status in a dual tunnel configuration means one of the two tunnels is down. Traffic is flowing on the remaining tunnel, but redundancy has been lost. If the remaining tunnel also fails, connectivity between the affected sites will be interrupted. Investigate and resolve the failed tunnel promptly.
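The aggregation described above can be approximated as follows. This is a minimal sketch, not SecureLink's actual logic: the function name `member_health` is hypothetical, and it assumes that any session not in its healthy state (Established for BGP, Up for BFD) counts as non-operational, and that a member with no sessions is reported as Down.

```python
from typing import Iterable


def member_health(bgp_states: Iterable[str], bfd_states: Iterable[str]) -> str:
    """Summarize a member's sessions as Healthy, Degraded, or Down."""
    # One boolean per session: True when the session is in its healthy state.
    sessions = [s == "Established" for s in bgp_states]
    sessions += [s == "Up" for s in bfd_states]

    if not sessions:
        return "Down"       # assumption: no sessions means no peering traffic
    if all(sessions):
        return "Healthy"
    if any(sessions):
        return "Degraded"   # at least one session is still operational
    return "Down"
```

In a dual tunnel configuration, `member_health(["Established"], ["Up", "Down"])` returns `"Degraded"`: one tunnel still carries traffic, but redundancy is lost.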
Viewing Health Status
The peering detail page displays health information in the Members table. Each row shows:
- BGP State: The current BGP session state for that member.
- BFD State: The current BFD session state (per tunnel in dual tunnel mode).
- Status: The overall member health (Healthy, Degraded, Down).
Click a member row to see detailed session information, including per-peer and per-tunnel breakdowns.
Troubleshooting Unhealthy Peering
If a member shows Degraded or Down status, work through the following checks:
1. Verify the Edge Is Online
Check that the edge device is online and sending heartbeats. If the edge itself is offline, all peering sessions for that member will be down.
2. Check WAN Connectivity
Confirm that the WAN link on the edge is active and has network connectivity. If the edge has multiple WAN interfaces, verify the specific interface used for peering.
3. Verify Firewall Rules
Both WireGuard and IPSec peerings use UDP on the configured listen port (default 51820). Ensure that:
- The listen port is open for inbound UDP traffic on both edges.
- If dual tunnel is enabled, the secondary listen port is also open.
- Any intermediate firewalls or NAT devices allow the traffic.
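A rough UDP probe can help with the checks above, with an important caveat: UDP has no handshake, and a tunnel endpoint will typically ignore an unauthenticated datagram, so silence does not prove the port is open. The sketch below only detects the case where the remote host actively rejects the datagram (an ICMP port-unreachable reply). The function name `probe_udp` is hypothetical, it handles IPv4 only, and the default port is the 51820 default mentioned above.

```python
import socket


def probe_udp(host: str, port: int = 51820, timeout: float = 1.0) -> str:
    """Send one datagram and classify what comes back.

    Returns "closed" if the host rejects the datagram, "inconclusive" if
    nothing comes back (open, or silently filtered), "responded" if any
    reply arrives, and "unreachable" for other socket errors.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            # Connecting a UDP socket lets the OS report ICMP errors to us.
            sock.connect((host, port))
            sock.send(b"\x00")
            sock.recv(1024)
        except ConnectionRefusedError:
            return "closed"
        except socket.timeout:
            return "inconclusive"
        except OSError:
            return "unreachable"
        return "responded"
```

A result of `"closed"` points to a missing listener or a rejecting firewall rule. A result of `"inconclusive"` is the expected outcome for a healthy tunnel port as well as for a firewall that silently drops traffic, so follow up with the edge logs in that case.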
4. Review Edge Logs
Check the edge logs for BGP or BFD error messages. Typical messages and their likely causes are grouped here by protocol:
Common to both protocols:
- BGP hold timer expired: No keepalive or update message arrived from the remote peer within the hold time, so the peer is likely unreachable. Check the network path.
- BFD session down: Tunnel packets are not reaching the remote end. Check WAN connectivity and firewall rules.
WireGuard-specific:
- WireGuard handshake timeout: The initial tunnel establishment failed. Verify that the remote edge is online and the port is reachable.
IPSec-specific:
- SA negotiation failed: Security Association could not be established. Try rekeying from the peering detail page (see IPSec Configuration).
- SA expired / SA Pending: The SA keys need to be refreshed. Trigger a manual rekey, or verify that a rekey interval is configured.
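When logs are long, the messages listed above can be matched automatically. The sketch below scans log lines for those message strings and returns the suggested next step for each. The exact wording and format of SecureLink edge logs is an assumption here, as are the names `LOG_HINTS` and `triage`; adjust the match strings to what your logs actually contain.

```python
from typing import Iterable, List, Tuple

# (substring to look for, suggested action). Strings mirror the list above;
# the real log wording may differ.
LOG_HINTS = [
    ("bgp hold timer expired", "Remote peer unreachable: check the network path."),
    ("bfd session down", "Check WAN connectivity and firewall rules."),
    ("wireguard handshake timeout", "Verify the remote edge is online and the port is reachable."),
    ("sa negotiation failed", "Try rekeying from the peering detail page."),
    ("sa expired", "Trigger a manual rekey or check the rekey interval."),
    ("sa pending", "Trigger a manual rekey or check the rekey interval."),
]


def triage(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return each recognized message once, in first-seen order, with its hint."""
    findings: List[Tuple[str, str]] = []
    seen = set()
    for line in lines:
        lowered = line.lower()
        for needle, hint in LOG_HINTS:
            if needle in lowered and needle not in seen:
                seen.add(needle)
                findings.append((needle, hint))
    return findings
```

Matching is case-insensitive and by substring, which keeps the sketch independent of timestamps or prefixes in the log line.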
If all members in a peering group show Down status simultaneously, the issue is likely not with individual edges but with a shared network path, firewall policy change, or configuration error at the peering group level. Check for recent changes to firewall rules or network infrastructure that might affect all sites.
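The distinction between a single failing member and a group-wide outage can be expressed as a simple check. This is an illustrative sketch: the function name `classify_outage` and the input shape (member name mapped to its Status value) are assumptions, and a group with only one member is treated as member-specific because there is nothing to compare against.

```python
from typing import Mapping


def classify_outage(member_status: Mapping[str, str]) -> str:
    """Classify an outage as "none", "member-specific", or "group-wide"."""
    down = [name for name, status in member_status.items() if status == "Down"]
    if not down:
        return "none"
    # Every member down at once suggests a shared path or group-level cause.
    if len(member_status) > 1 and len(down) == len(member_status):
        return "group-wide"
    return "member-specific"
```

A `"group-wide"` result suggests starting with recent firewall, network, or peering group configuration changes rather than with individual edges.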