Added SADE agent by YasodGinige · Pull Request #5 · sands-lab/nika

YasodGinige · 2026-05-17T10:19:23Z

We created this pull request at your request in our repository. We have added the SADE agent to the network_agents directory. Please review and merge.

…layout End-to-end step1+step2+step3 on simple_bgp + frr_service_down hung or degraded in five places. Verified fixed by a clean run that diagnoses the fault via the SADE workflow (Skill -> h.py -> helper) and exits cleanly with the eval-pipeline token summary populated. src/agent/claude_code_agent_sade.py - Restore `break` after `ResultMessage` so the async-for loop exits when the SDK emits the result. Without it step3 hangs at the end of every session. - Restore the `token_summary` block + flat `tokens.*` fields in the `llm_end` event. `trace_parser.py` and the evaluation CSV read these flat fields; on the previous wrapper they populated as zero even when a run consumed ~450k input + ~7k output tokens. - Restore the `system_logger.info("REMINDER at API turn N/M")` line, the reproducibility marker for the half-way pivot. - Repoint `cwd` from `Path(__file__).parent` to `parents[2] / "network_agents/SADE/src/agent"` so the SDK loads `.claude/skills/` and `CLAUDE.md` from the embedded SADE engine. The wrapper file lives at `src/agent/...` but the engine lives nested under the contributor-agent dir; the unchanged path resolved to a dir with no skills, and the agent never made a single Skill call. src/agent/utils/mcp_servers.py - Restore the Windows-aware venv-python detection (`sys.platform == "win32"` selects `.venv/Scripts/python.exe`, otherwise `bin/python`, falling back to `sys.executable`) and `BASE_DIR` env forwarding. Hardcoding `"python3"` failed on Windows where that name is not on PATH, so all five MCP servers failed to spawn and the agent saw zero `mcp__*` tools. network_agents/SADE/h.py - Detect the project venv at the outer repo root when SADE is embedded (`../../.venv/`), falling back to the standalone layout (`./.venv/`). Without this the launcher ran helper scripts under the system Python (no `nika` installed) and every helper bailed with a misleading "missing project dependency" error pointing at a path that does not exist on the embedded layout. - Apply the same standalone-vs-embedded fallback to `runtime/current_session.json` lookup so `_child_env` injects the correct `LAB_NAME` instead of silently defaulting to a hardcoded lab. _utils/wireguard/keys.txt + .gitignore - Removed the file (committed in b3aebe5 with four WireGuard private/public key pairs in plaintext) and added the path to `.gitignore`. `kathara_lab_generator.py` raises a clear FileNotFoundError with regeneration instructions when the file is missing, so no scaffold is needed. - NOTE: removal here is the working-tree fix only. The keys are still in git history (commit b3aebe5) and must be rotated, with history scrubbed (BFG / git-filter-repo), before this branch goes public.

…parsing + Windows-safe teardown Surfaced while running step4_result_eval.py end-to-end on the previously committed SADE wrapper fix (d66ccae): no `llm_judge.json` was produced and the eval CSV showed `in_tokens=0, out_tokens=0` for a run that actually used ~380k input + ~5k output tokens. All five reverts hash- match upstream main; the only non-revert addition is the explanatory comment. src/nika/evaluator/trace_parser.py - Restore `encoding="utf-8"` on the conversation log read. Windows defaults to cp1252 and the SADE run logs contain en-dashes, arrows, and other non-ASCII characters; without UTF-8 the parser crashed before counting anything. - Restore the `_accumulate_tokens()` dispatch covering all three llm_end token formats: LangGraph `usage_metadata`, Claude Code SDK `tokens.total_input_tokens`, and a raw Anthropic `usage` fallback. The simplified version only handled the LangGraph schema, so every Claude Code / SADE run wrote `in_tokens=0, out_tokens=0` to the eval CSV even when the conversation log captured the real counts. This pairs with the wrapper's `tokens.*` field restored in d66ccae. src/nika/evaluator/llm_judge.py - Restore `encoding="utf-8"` on both the trace read and the judge JSON write. The judge response routinely contains `->`, arrows, and other non-ASCII characters; without UTF-8 the write crashed with `UnicodeEncodeError 'charmap'` and no `llm_judge.json` was produced. src/nika/utils/logger.py + src/nika/utils/session.py - Restore the `close_logger()` function in `logger.py` (closes and detaches every handler on the `SystemLogger`) and its call at the top of `Session.clear_session()`. Without releasing the file handle first, `shutil.move("runtime/system.log", ...)` raises `PermissionError [WinError 32]` on Windows because the file is still open; the fallback copy-then-unlink leaves both the source and destination on disk and the eval script aborts mid-teardown. src/scripts/step4_result_eval.py - Restore `encoding="utf-8"` on the `ground_truth.json` and `submission.json` reads (same cp1252 fallback risk as above for any payload containing non-ASCII tokens). - Restore the `try/except` wrapper around `net_env.undeploy()`. The eval row is already written by `_eval_problem` before this point; a transient Docker 403 "active endpoints" race during undeploy is cleanup-only and must not bubble up, or the benchmark runner will record a duplicate SKIP row for a case that actually succeeded. Verified by a full clean step1 -> step2 -> step3 -> step4 on `simple_bgp + frr_service_down`: - step4 exits 0 (was: PermissionError teardown crash) - `runtime/` empties on teardown (was: orphaned `system.log`) - eval CSV: `in_tokens=379311, out_tokens=4972` (was: `0, 0`) - `llm_judge.json` produced with scores 5/5 across relevance, correctness, efficiency, clarity, final_outcome, overall - detection / localization / rca all 1.0

alessandrocornacchia

@YasodGinige please review with the following changes:

Remove all experiments-related artifacts: remove SADE figures, tables, execution artifacts.

Conform to the existing project structure, and do not use a separate folder network_agents for the new contributions. Suggestions:

if you made changes to nika (e.g., fixed failure injectors, added new problems, or added new network topologies and configurations) pull from main the latest commit, and merge your changes there. Do not add duplicate code as a separate folder.
move all changes that include the SADE agent (e.g., prompts, SADE skills, agent code) under src/agent/community/sade. You can create the community/sade sub-folder.

Briefly document relevant changes: please add a comment in your PR discussing an overview of the changes and why they have been necessary.

Overlxrd-uwu · 2026-06-26T13:24:28Z

Dear Alessandro,

Thank you very much for the detailed review and suggestions. We've addressed all of them in a new pull request, #6, which we rebased onto the latest main, so it fits the current pluggable agent architecture.

In short: SADE now lives under src/agent/community/sade/ as a pluggable "-a sade" agent (implementing protocols.TroubleshootingAgent), the separate network_agents/ folder and all experiment artifacts have been removed.

Since this branch is now outdated and conflicts with the current main, we'd suggest continuing the review on #6 - happy to close this one in its favor whenever you prefer.
Thanks again for your guidance!

Best regards,
Kuanhao Tseng Jeff (working with @YasodGinige on the SADE contribution)

YasodGinige and others added 4 commits May 15, 2026 18:52

added SADE agent to the agent pool

b3aebe5

added SADE agent

0353e94

YasodGinige mentioned this pull request May 18, 2026

Req: Merge NIKA additions with upstream Overlxrd-uwu/SADE-NetworkAgent#3

Open

alessandrocornacchia requested a review from Copilot June 23, 2026 10:33

Copilot started reviewing on behalf of alessandrocornacchia June 23, 2026 10:33 View session

This comment was marked as low quality.

Sign in to view

alessandrocornacchia requested changes Jun 24, 2026

View reviewed changes

alessandrocornacchia added the enhancement New feature or request label Jun 24, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Added SADE agent #5

Added SADE agent #5
YasodGinige wants to merge 4 commits into
sands-lab:mainfrom
YasodGinige:nika

YasodGinige commented May 17, 2026

Uh oh!

This comment was marked as low quality.

Uh oh!

alessandrocornacchia left a comment •

edited

Loading

Uh oh!

Overlxrd-uwu commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

YasodGinige commented May 17, 2026

Uh oh!

This comment was marked as low quality.

Uh oh!

alessandrocornacchia left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Overlxrd-uwu commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

alessandrocornacchia left a comment •

edited

Loading