Feature/multiserver plugin #3421
| ```{warning} | ||
| * The *multiserver* plugin requires the **containerlab** provider on all servers. | ||
| * Containerlab version >= `0.46` is required for native VXLAN link endpoint support. |
Just a thought: the plugin probably does not work with old releases, and we're enforcing 0.75.0 right now.
What do you mean?
Well, I only tried it on 0.75, true, but it should work with earlier versions too. It only requires a netlab version that uses the pickle system for its storage.
Netlab already enforces a higher version than 0.46, I think, no?
> Netlab already enforces a higher version than 0.46, I think, no?

That was the point. You don't have to mention the containerlab version.
ipspace left a comment
Super-awesome-job!!! Thanks a million.
Tons of comments (as you expected ;). Some of them are just suggestions or pointers to existing helper functions; in other cases, I think we can make the whole thing a lot more streamlined with significant rewrites.
| (multiserver-servers)= | ||
| ### Server Parameters | ||
| Each entry in the **multiserver.servers** list supports these parameters: |
Would it make more sense to have a dictionary of servers?
| | Parameter | Type | Meaning | | ||
| |-----------|------|---------| | ||
| | **id** | integer | Unique identifier for the server (e.g. `1`, `2`) | |
ID could be assigned automatically, like we do for nodes. We even have a set of functions (modules/_dataplane) to handle IDs where some objects have a static ID and others need auto-assigned ones -- used for VLANs, VRFs, and the like
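To make the auto-assignment idea concrete, here is a minimal Python sketch of mixing static and auto-assigned server IDs. It does not call the netlab `_dataplane` helpers; `assign_server_ids` and the plain-dict server entries are hypothetical stand-ins used only for illustration.

```python
from typing import List


def assign_server_ids(servers: List[dict], start: int = 1) -> List[dict]:
    """Return copies of the server entries with every 'id' filled in.

    Hypothetical helper (not part of netlab): static IDs are kept as-is,
    and entries without an ID get the lowest unused integer >= start.
    """
    # First pass: collect static IDs and reject duplicates.
    used = set()
    for srv in servers:
        sid = srv.get("id")
        if sid is None:
            continue
        if sid in used:
            raise ValueError(f"duplicate server id {sid}")
        used.add(sid)

    # Second pass: fill in the missing IDs without touching the input.
    result = []
    next_id = start
    for srv in servers:
        entry = dict(srv)
        if entry.get("id") is None:
            while next_id in used:
                next_id += 1
            entry["id"] = next_id
            used.add(next_id)
        result.append(entry)
    return result
```

With this approach, a lab author would only have to set **id** on servers where the value matters, and the remaining servers would get the free values in list order.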
| In `auto` mode, nodes that are not explicitly pinned to a server are distributed automatically using a greedy balancing algorithm: | ||
| 1. Nodes belonging to a *netlab* group are kept together — the entire group is placed on the server that currently has the fewest nodes. Larger groups are placed first for better balance. | ||
| 2. Remaining ungrouped nodes are assigned one at a time to the least-loaded server. |
Haven't looked at the code yet, but I'm guessing that "loaded" means "number of nodes", not something more complex like CPU/RAM requirements of lab devices? If that's the case, it might be worth spelling it out.
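For reference, the two documented steps can be sketched as follows, assuming "load" really is just the node count. This is not the plugin's code: `place_nodes`, its arguments, and the tie-breaking rules (first server in the list wins, a node in several groups follows the largest one, pinned group members stay where they are) are assumptions made for the sketch.

```python
from typing import Dict, Iterable, List, Optional


def place_nodes(
    nodes: Iterable[str],
    groups: Dict[str, List[str]],
    servers: List[str],
    pinned: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Greedy node-to-server placement; load = number of nodes on a server."""
    placement = dict(pinned or {})
    load = {srv: 0 for srv in servers}
    for srv in placement.values():
        load[srv] += 1

    def least_loaded() -> str:
        # Ties go to the server listed first (assumption).
        return min(servers, key=lambda srv: load[srv])

    def unplaced(gname: str) -> List[str]:
        return [n for n in groups[gname] if n not in placement]

    # Step 1: keep each group together, placing larger groups first.
    for gname in sorted(groups, key=lambda g: len(unplaced(g)), reverse=True):
        members = unplaced(gname)
        if not members:
            continue
        target = least_loaded()
        for node in members:
            placement[node] = target
        load[target] += len(members)

    # Step 2: spread the remaining nodes one at a time.
    for node in nodes:
        if node not in placement:
            target = least_loaded()
            placement[node] = target
            load[target] += 1
    return placement
```

Swapping the node count for a weight per node (CPU or RAM estimates, for example) would only change how `load` is incremented, which is why spelling out the current metric in the documentation seems worthwhile.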
| ### Automatic Assignment | ||
| In `auto` mode, nodes that are not explicitly pinned to a server are distributed automatically using a greedy balancing algorithm: |
In some future version, you might want to add server capabilities for weighted distribution ;)
| The plugin automatically copies all required files into each server directory — no extra bundling step is needed. | ||
| **Step 2: Copy server directories to remote hosts** (e.g. via rsync): |
We might want to automate that in the future, but this is definitely more than good enough for version 1
| def _intf_clab_name(intf: Box) -> str: | ||
| """Containerlab interface name for a node interface.""" | ||
| return intf.get("clab", {}).get("name", "") or intf.get("ifname", "") |
`intf.get("clab.name","") or intf.get("ifname","")`
See "Using Box objects" in AGENTS.md.
| return intf.get("clab", {}).get("name", "") or intf.get("ifname", "") | ||
| def _build_clab_node(nname: str, ndata: Box, topology: Box) -> dict: |
It looks like this whole part could be easier to do with a topology filter (something similar to "remove unmanaged nodes") followed by an augmented clab.yml Jinja2 template. I also wouldn't have a problem adding VXLAN support (as clab attributes) into netlab core and then using those attributes here.
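As a rough illustration of the filter idea, the sketch below trims a topology to the nodes hosted on one server and keeps every link that touches at least one of them (the same condition as the link filter elsewhere in this PR). It works on plain dictionaries rather than netlab's Box objects, and `filter_topology_for_server` is a hypothetical name, not an existing netlab function.

```python
import copy
from typing import Set


def filter_topology_for_server(topology: dict, local_nodes: Set[str]) -> dict:
    """Return a per-server copy of the topology (hypothetical helper).

    Keeps only the local nodes, plus every link with at least one
    interface attached to a local node. The input is left untouched.
    """
    topo = copy.deepcopy(topology)
    topo["nodes"] = {
        name: data for name, data in topo.get("nodes", {}).items()
        if name in local_nodes
    }
    topo["links"] = [
        link for link in topo.get("links", [])
        if any(intf.get("node") in local_nodes
               for intf in link.get("interfaces", []))
    ]
    return topo
```

The filtered copy could then be handed to a template that renders the per-server `clab.yml`, with links that still reference a non-local node being the candidates for VXLAN endpoints.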
| # =========================================================================== | ||
| # Internal helpers — clab.yml generation | ||
| # =========================================================================== |
This whole section should be completely restructured. We're reinventing the wheel (in Python)
| topo_copy.links = [l for l in topo_copy.links if any(i.node in local_nodes for i in l.get("interfaces", []))] | ||
| # Expand paths (add f_files / f_tasks / f_dirs computed keys). | ||
| make_paths_absolute(topo_copy.defaults.paths) |
I'm afraid there are lots of assumptions here, starting with "the directory structure MUST be the same on all the servers and the control node"
| pickle.dump(topodict, f) | ||
| def _write_vxlan_scripts(out_dir: str, tunnels: list, dev: str) -> None: |
Any reason why you wouldn't use a Jinja2 template for this?
Reference: #3420
Summary
This PR adds the `multiserver` plugin to distribute a single Netlab topology across multiple physical servers. Sadly, for now it works only with the containerlab provider.
Key Details
* [...] `netsim/extra/multiserver/` and doesn't modify any core Netlab engine logic.
* [...] `clab.yml` and `netlab.snapshot.pickle`.
* [...] `sudo netlab up --snapshot -vv` without needing custom CLI options.

For the test files, I am not sure if they make any sense, but they at least show that the plugin does not interfere with the normal netlab workflow.
Explanations on how it works can be found in `docs/plugins/multiserver.md`.