docs: add known issue for management bridge MAC change after NetworkManager migration #1049
Open
tillo wants to merge 2 commits into
…anager migration

The wicked-to-NetworkManager migration drops the bond post-up hook that pinned the management bridge (mgmt-br) MAC address to the bond MAC. Without it, mgmt-br behaves as an ordinary Linux bridge and floats its MAC to the lowest-MAC enslaved port; bridge-mode VM veths can therefore change the bridge MAC on VM lifecycle events. MetalLB's L2 ARP responder caches the announce-interface MAC and then advertises a stale value, black-holing LoadBalancer IPs.

Document the failure mode and a NetworkManager dispatcher-script remediation persisted via an /oem config file, under Known Issues on the v1.6.x-to-v1.7.x upgrade page (docs/ and the v1.7 and v1.8 versioned copies).

Signed-off-by: Martino Dell'Ambrogio <tillo@tillo.ch>

Part of the mdapi-wide leak-prevention sweep (2026-05-23):
- pre-commit hook (global) + CI `.pre` stage now run gitleaks on every push
- `.gitignore` baseline blocks .env/kubeconfig/SSH keys/PKCS12/.netrc

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Problem

After the wicked → NetworkManager migration (Harvester v1.6.x → v1.7.x), the management bridge `mgmt-br` is no longer protected against MAC address changes.

Under wicked, `/etc/wicked/scripts/setup_bond.sh` ran a bond post-up hook that pinned `mgmt-br`'s MAC address to the bond MAC. That script still ships in `/oem/90_custom.yaml`, but it is dead code now that NetworkManager is the active network stack: wicked is inactive, so the hook never runs. `mgmt-br` then behaves as a plain Linux bridge and adopts the lowest MAC address among its enslaved ports. Bridge-mode VM networks enslave VM `veth` interfaces into `mgmt-br`, so the bridge MAC can change on any VM start/stop/migrate.

This breaks Layer 2 load balancers: MetalLB's L2 ARP responder caches the announce-interface MAC at creation and keeps advertising the stale value. Once the originating `veth` is removed, that MAC exists on no interface and every affected `LoadBalancer` IP becomes unreachable. The result is an intermittent, hard-to-diagnose outage where the backing pods are healthy, the IPs are reported as assigned, and ARP still resolves, but to a MAC that black-holes traffic.

Change
Adds Known Issue #6 to the v1.6.x → v1.7.x upgrade page: the failure mode plus a remediation, a NetworkManager dispatcher script that re-pins `mgmt-br` to the bond MAC on every network event, persisted via an `/oem` config file so it survives reboots and upgrades. This restores, under NetworkManager, the behaviour the old wicked `setup_bond.sh` hook provided.

Applied to `docs/` and the `version-v1.7`/`version-v1.8` copies.

Notes
`setup_bond.sh` so no manual step is needed. Happy to file a separate `harvester/harvester` issue for that if maintainers prefer.
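
For reviewers who want to reason about the failure mode described under Problem, here is a minimal Python model of it. It is not part of this PR and not the dispatcher script itself. It only simulates how an unpinned bridge picks the lowest MAC among its ports and why a cached value goes stale. The port names (`mgmt-bo`, `veth1a2b3c`), the MAC addresses, and the helper names are all hypothetical, chosen for illustration.

```python
from typing import Dict, Optional


def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to an integer so MACs compare numerically."""
    return int(mac.replace(":", ""), 16)


def elect_bridge_mac(ports: Dict[str, str], pinned: Optional[str] = None) -> Optional[str]:
    """Return the MAC a bridge would use in this simplified model.

    An explicitly pinned address wins; otherwise the lowest port MAC is adopted.
    """
    if pinned is not None:
        return pinned
    if not ports:
        return None
    return min(ports.values(), key=mac_to_int)


# Hypothetical port names and addresses, for illustration only.
BOND_MAC = "52:54:00:aa:bb:cc"
ports = {"mgmt-bo": BOND_MAC}

before = elect_bridge_mac(ports)            # only the bond is enslaved
ports["veth1a2b3c"] = "0a:58:0a:34:00:11"   # a VM starts; its veth has a lower MAC
during = elect_bridge_mac(ports)            # bridge MAC floats to the veth MAC
del ports["veth1a2b3c"]                     # the VM stops; the veth is removed
after = elect_bridge_mac(ports)             # bridge MAC returns to the bond MAC

# A responder that cached `during` now advertises a MAC no interface owns.
stale = during not in ports.values()

# With the MAC pinned (the effect of the old wicked hook), VM events change nothing.
ports["veth1a2b3c"] = "0a:58:0a:34:00:11"
pinned = elect_bridge_mac(ports, pinned=BOND_MAC)
```

In this model the bridge MAC only stays stable across VM lifecycle events when it is pinned, which is the behaviour the remediation aims to restore.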