bug: breeze lm links crashes with KeyError: 'metric' when OpenR is running in area-aware mode with asymmetric link metrics #163

@AkshatRaj00

Bug Report

OpenR version: openr-20240501 (latest OSS build)
OS: Ubuntu 22.04 LTS (x86_64)
Python: 3.10.12


Summary

Running breeze lm links on an OpenR instance configured with area-aware link metrics causes a KeyError: 'metric' crash in the Python CLI. The crash happens because the LinkMonitor Thrift struct for links in non-default areas omits the top-level metric field when area_policies are active, but the breeze display code unconditionally accesses link.metric without checking for its presence.

This regression is easy to miss: it only surfaces in multi-area deployments, which are increasingly common in data-center spine/leaf topologies.


Steps to Reproduce

  1. Configure OpenR with multiple areas using area_policies in openr_config.thrift:
{
  "areas": [
    { "area_id": "0", "interface_regexps": ["eth0", "eth1"] },
    { "area_id": "1", "interface_regexps": ["eth2", "eth3"] }
  ],
  "area_policies": {
    "0": { "import_policy": "ACCEPT_ALL" },
    "1": { "import_policy": "ACCEPT_ALL" }
  }
}
  2. Bring up OpenR with at least one link in each area.

  3. Run:

breeze lm links

Result:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/openr/cli/commands/lm.py", line 134, in _run
    rows.append([link.ifName, link.metric, link.isUp, ...])
KeyError: 'metric'

Expected result: A table showing all links with their per-area metrics (or N/A where metric is unset).


Root Cause

In area-aware mode, links in non-default areas use adj_metric from the area policy instead of populating the top-level metric field in the LinkEntry Thrift struct. The Python CLI code in openr/py/openr/cli/commands/lm.py reads link.metric unconditionally:

# lm.py ~line 134 (approximate)
rows.append([
    link.ifName,
    link.metric,   # <-- KeyError when area_policies override metric
    link.isUp,
    ...
])

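A straightforward fix is to read the field defensively, for example with getattr(link, 'metric', 'N/A'). Note that getattr with a default, like hasattr, only suppresses AttributeError. If the KeyError in the traceback is raised from inside the attribute lookup itself, the access also needs a try/except KeyError around it.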


Proposed Fix

# Before
link.metric

# After
getattr(link, 'metric', 'N/A')

Alternatively, the display code could fall back to adj_metric when metric is absent, so the table shows the effective metric for each area.
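A minimal sketch of that fallback, under stated assumptions: effective_metric, read_field, and DictBackedLink are hypothetical names for illustration, not part of the OpenR code base, and DictBackedLink only imitates an object whose attribute lookup raises KeyError as in the traceback above. The real link type in lm.py may behave differently, so treat this as a starting point rather than a drop-in patch.

```python
from typing import Any

_MISSING = object()  # sentinel so that a legitimate metric of 0 is preserved


def read_field(link: Any, name: str) -> Any:
    """Return link.<name>, or _MISSING if the field is absent or None."""
    try:
        value = getattr(link, name)
    except (AttributeError, KeyError):
        # AttributeError: plain objects without the field.
        # KeyError: wrappers that look fields up in an internal dict.
        return _MISSING
    return _MISSING if value is None else value


def effective_metric(link: Any) -> Any:
    """Prefer metric, fall back to adj_metric, else the string 'N/A'."""
    for name in ("metric", "adj_metric"):
        value = read_field(link, name)
        if value is not _MISSING:
            return value
    return "N/A"


class DictBackedLink:
    """Hypothetical stand-in for a link object; NOT the real OpenR type."""

    def __init__(self, **fields: Any) -> None:
        self._fields = fields

    def __getattr__(self, name: str) -> Any:
        # Only called for names not found normally; raises KeyError
        # (not AttributeError) when the field is missing.
        return self._fields[name]


links = [
    DictBackedLink(ifName="eth0", metric=10, isUp=True),
    DictBackedLink(ifName="eth2", adj_metric=20, isUp=True),
    DictBackedLink(ifName="eth3", isUp=False),
]
rows = [[link.ifName, effective_metric(link), link.isUp] for link in links]
```

With a stand-in like this, a bare getattr(link, 'metric', 'N/A') still propagates the KeyError, which is why the helper catches both exception types explicitly.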


Impact

This bug breaks the primary CLI diagnostic tool (breeze lm links) for any operator running OpenR in a multi-area configuration: the command exits with a traceback instead of printing the link table. Since multi-area is the recommended deployment model for large-scale fabrics, this is a high-impact issue for production operators who rely on breeze for day-to-day troubleshooting.


Environment

  • OpenR: openr-20240501
  • Build method: build/build_openr.sh from main branch
  • Python: 3.10.12
  • Config: multi-area with area_policies
  • OS: Ubuntu 22.04

Related: #72 (Python module issues in breeze), #134 (other breeze CLI runtime errors)
