Skip to content

feat(bfd): add BFD server and peer state machine#3402

Closed
Ivan-Pokhabov wants to merge 3 commits into
osrg:masterfrom
Ivan-Pokhabov:feat/bfd-server
Closed

feat(bfd): add BFD server and peer state machine#3402
Ivan-Pokhabov wants to merge 3 commits into
osrg:masterfrom
Ivan-Pokhabov:feat/bfd-server

Conversation

@Ivan-Pokhabov
Copy link
Copy Markdown
Contributor

Implements RFC 5880/5881 BFD liveness detection for BGP peers:

  • UDP server listening on port 3784 with per-peer session management
  • State machine: Down → Init → Up with detection timeout and hard reset
  • Per-peer async counters and session state
  • Integration with BGP peer lifecycle (AddPeer/DeletePeer/ResetPeer)
  • Tests covering state transitions, peer reset, and BGP integration

  Implements RFC 5880/5881 BFD liveness detection for BGP peers:
  - UDP server listening on port 3784 with per-peer session management
  - State machine: Down → Init → Up with detection timeout and hard reset
  - Per-peer async counters and session state
  - Integration with BGP peer lifecycle (AddPeer/DeletePeer/ResetPeer)
  - Tests covering state transitions, peer reset, and BGP integration

Co-Authored-By: Ivan-Pokhabov <vanek3372@gmail.com>
Copy link
Copy Markdown
Member

@fujita fujita left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You add some public API functions. They can be called by user applications that use GoBGP as a library. Are they safely called by multiple goroutines? Also functions that could be blocked need to take context.

Comment thread pkg/server/grpc_server.go Outdated
Prefix: p.Prefix,
RD: p.Rd,
LookupOption: apiutil.LookupOptionFromAPI(p.Type),
LookupOption: apiutil.LookupOption(p.Type),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify unrelated code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment thread pkg/server/bfd_server.go
s.eventConfig <- &config
}

func (s *bfdServer) Stop() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this called?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, we lose it

Comment thread pkg/server/bfd_server.go

s.eventGetPeer <- list

select {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this code could be blocked. The caller of GetPeerState holds s.shared.mu so you can't.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment thread pkg/server/bfd_server.go Outdated
return nil
}

func (s *bfdServer) GetPeerState(ctx context.Context, peerAddress string) (*bfdPeerState, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use netip.Addr

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace all string with netip.Addr

Comment thread pkg/server/bfd_server.go Outdated
}

func (s *bfdServer) DeletePeer(peerAddress string) error {
address, err := convertToIPAddress(peerAddress)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use netip.Addr

Comment thread pkg/server/bfd_server.go
}

func (s *bfdServer) AddPeer(peerAddress string, config oc.BfdConfig) error {
if !config.Enabled {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use netip.Addr

@Ivan-Pokhabov
Copy link
Copy Markdown
Contributor Author

You add some public API functions. They can be called by user applications that use GoBGP as a library. Are they safely called by multiple goroutines? Also functions that could be blocked need to take context.

Thank you for review, I fixed mistakes and made code more concurent-safety

Comment thread pkg/server/bfd_peer.go Outdated
}

func (p *bfdPeer) Stop() {
close(p.eventShutdown)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to wait for the coroutine to exit here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Comment thread pkg/server/bfd_peer.go Outdated

switch h.State {
case bfd.StateDown:
p.sendPacket(bfd.StateInit, false, false, h.MyDiscriminator)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the current state is Up, needs to set the sate Down?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, made like more RFC

Comment thread pkg/server/bfd_peer.go Outdated
peerState: ps,
logger: logger,
peerAddress: peerAddress,
peerPort: int(config.Port),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs to use the default port number if config.Port is zero?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, made this

  - BgpServer.Stop() now calls bfdServer.Stop(), fixing goroutine leak
    where bfdServer.loop() ran indefinitely after server shutdown
  - Start/AddPeer/DeletePeer now accept context.Context and use select
    with ctx.Done() and eventShutdown, making them safe to call
    concurrently and non-blockable under cancellation or shutdown
  - GetPeerState/GetPeerStateList no longer route through the event loop:
    they read directly under peersMutex/atomic, eliminating the risk of
    deadlock when called while s.shared.mu is held (e.g. from toConfig)
  - Removed bfdEventGetPeerState and bfdEventGetPeerStateList types and
    their event loop cases — no longer needed
  - Fixed IPv4-mapped address mismatch: rxPacket now calls Addr().Unmap()
    so ReadFromUDP's ::ffff:1.2.3.4 form matches peers stored as pure IPv4
  - AddPeer/DeletePeer API changed from string to netip.Addr, eliminating
    redundant string parsing in the hot path
    - Wait for bfdPeer loop exit in Stop() and stop all peer timers during
      shutdown, so callers do not return while peer goroutine is still active
    - Treat remote BFD StateDown as a session failure when local state is Up,
      resetting the BGP peer immediately instead of waiting for timeout
    - Default per-peer BFD destination port to 3784 when BfdConfig.Port is zero
    - Aligned the peer state machine closer to RFC 5880: remote Down from Up now
      transitions local state to Down without immediately sending Init, local
      Init is represented explicitly, and remote Up no longer brings a Down
      session directly to Up
    - Add regression tests for peer shutdown, remote down handling, and default
      port selection
@fujita
Copy link
Copy Markdown
Member

fujita commented May 25, 2026

Pushed, thanks. Could we add documentation for configuring BFD?

@fujita fujita closed this May 25, 2026
@fujita fujita mentioned this pull request May 25, 2026
@Ivan-Pokhabov
Copy link
Copy Markdown
Contributor Author

Pushed, thanks. Could we add documentation for configuring BFD?

Thanks! I wiil do docs soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants