feat(v1.5-C): per-agent LLM proof point — intake on Ollama Cloud#9
Merged
…nstream on llm.default
Activates the M8 milestone's per-agent provider story. The framework
already resolved ``skill.model`` per-skill via
``graph.py:_build_agent_nodes -> get_llm(cfg.llm, skill.model, role=...)``;
v1.5-C uncomments the override on the example incident_management
intake skill so a default deployment (with both OLLAMA_API_KEY +
OPENROUTER_API_KEY set, or wherever ``llm.default`` resolves) shows
intake hitting Ollama Cloud's gpt-oss while the rest of the agents
follow the runtime default.
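The dispatch path described above can be sketched roughly as follows. This is a minimal illustration, not the runtime's actual code: the `LLMConfig` and `Skill` dataclasses, the `build_agent_nodes` helper, the injected `llm_factory` parameter, and the provider ids in the models map are all simplified stand-ins assumed for the example. Only the call shape `get_llm(cfg.llm, skill.model, role=...)` and the fallback to `llm.default` come from the description.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional
from unittest.mock import Mock, call


@dataclass
class LLMConfig:
    # Simplified stand-in; the real runtime config carries more fields.
    default: str
    models: dict = field(default_factory=dict)  # alias -> provider model id


@dataclass
class Skill:
    name: str
    model: Optional[str] = None  # None means "follow llm.default"


def get_llm(cfg: LLMConfig, model_name: Optional[str], role: str) -> str:
    """Resolve an alias to a provider model id, falling back to cfg.default."""
    alias = cfg.default if model_name is None else model_name
    return cfg.models[alias]


def build_agent_nodes(cfg: LLMConfig, skills: list, llm_factory: Callable = get_llm) -> dict:
    """Call the factory once per skill, passing skill.model through unchanged."""
    return {s.name: llm_factory(cfg, s.model, role=s.name) for s in skills}


cfg = LLMConfig(
    default="workhorse",
    # Placeholder provider ids, for illustration only.
    models={"workhorse": "default-provider/model", "gpt_oss_cheap": "ollama-cloud/gpt-oss"},
)
skills = [Skill("intake", model="gpt_oss_cheap"), Skill("triage")]

# intake resolves through its override; triage resolves through the default.
resolved = build_agent_nodes(cfg, skills)

# Pin the call contract with a mock factory: the override is passed as-is,
# and a missing override is passed as None (the fallback happens in get_llm).
spy = Mock(return_value="llm")
build_agent_nodes(cfg, skills, llm_factory=spy)
expected_calls = [
    call(cfg, "gpt_oss_cheap", role="intake"),
    call(cfg, None, role="triage"),
]
```

The sketch injects the factory as a parameter to stay self-contained. The new test described under Changes instead patches ``runtime.graph.get_llm``.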
Changes:
* ``examples/incident_management/skills/intake/config.yaml`` — declare
``model: gpt_oss_cheap`` (was a documented but commented-out hint).
Comment block updated to reference v1.5-C and explain the resolver.
* ``src/runtime/config.py`` — extend the ``LLMConfig.stub()`` default
models map with stub aliases for ``gpt_oss``, ``gpt_oss_cheap``, and
``workhorse``. The skill-validator (``Orchestrator.create``) checks
every ``skill.model`` against ``llm.models``; without these aliases
the existing test suite would explode the moment intake declares
``model: gpt_oss_cheap`` (because tests build ``LLMConfig.stub()``
which previously only knew ``stub_default``). The aliases route to
the same stub provider so behaviour is unchanged for stub-mode
callers.
* ``tests/test_per_agent_model_dispatch.py`` (new, 2 tests) — pin
the dispatch contract:
- ``test_build_agent_nodes_passes_skill_model_to_get_llm`` mocks
``runtime.graph.get_llm`` and asserts the framework calls it
with ``model_name=skill.model`` per skill (intake gets
``"gpt_oss_cheap"``, triage with ``model=None`` gets ``None`` so
``get_llm`` falls back to ``llm.default`` downstream).
- ``test_intake_skill_yaml_has_per_agent_override_uncommented``
pins the YAML edit so a future refactor can't silently drop the
override.
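To make the stub-alias change concrete, here is a rough sketch of the kind of check the skill-validator performs. The function name `validate_skill_models`, its signature, and the error format are hypothetical, chosen for illustration. Only the alias names and the rule "every ``skill.model`` must appear in ``llm.models``" come from the description above.

```python
from typing import Optional


def validate_skill_models(skill_models: dict, known_aliases: set) -> list:
    """Return one error string per skill whose model alias is not registered.

    A skill with model=None is always valid: it follows the default.
    """
    errors = []
    for skill_name, alias in skill_models.items():
        if alias is not None and alias not in known_aliases:
            errors.append(f"{skill_name}: unknown model alias {alias!r}")
    return errors


# Skill -> declared model alias, after the intake override is uncommented.
skill_models = {"intake": "gpt_oss_cheap", "triage": None}

# Aliases known to the stub config before and after this change.
stub_before = {"stub_default"}
stub_after = stub_before | {"gpt_oss", "gpt_oss_cheap", "workhorse"}

errors_before = validate_skill_models(skill_models, stub_before)
errors_after = validate_skill_models(skill_models, stub_after)
```

With only ``stub_default`` registered, the intake override is rejected. With the three added aliases the same skill set validates cleanly, which is why stub-mode tests keep passing.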
Live verification (``tests/test_integration_driver_s1.py`` family)
continues to require ``OLLAMA_API_KEY`` + ``OPENROUTER_API_KEY``
+ ``OLLAMA_BASE_URL`` and remains skipped without them — the human
verification gate documented at
``.planning/phases/15-real-llm-tool-loop-termination/15-VERIFICATION.md``.
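The environment gate can be approximated as below. This is a sketch under assumptions: the helper `missing_env` and the test class are hypothetical, and it uses the standard library's `unittest.skipIf` to stay dependency-free, whereas the real suite runs under pytest and may express the gate differently.

```python
import os
import unittest

# The three variables named above as required for the live path.
REQUIRED_ENV = ("OLLAMA_API_KEY", "OPENROUTER_API_KEY", "OLLAMA_BASE_URL")


def missing_env(required=REQUIRED_ENV, environ=None) -> list:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


@unittest.skipIf(missing_env(), "live LLM credentials not configured")
class LiveMultiProviderTest(unittest.TestCase):
    def test_credentials_present(self):
        # A real test would drive the live providers here; this placeholder
        # only confirms the gate let the test through with all keys set.
        self.assertEqual(missing_env(), [])
```

Passing `environ` explicitly keeps the helper testable without mutating the process environment.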
Suite: 1260 passed (was 1258 — added 2), ruff clean, coverage 87.08%.
Bundles dist/app.py + dist/apps/{code-review,incident-management}.py
in line with the LLMConfig stub-aliases extension from the preceding
commit. No bundle-only edits.
Summary

Last sub-milestone of v1.5. Activates the framework's per-agent provider story (M8 from the v1.4 telemetry milestone): intake runs on Ollama Cloud ``gpt-oss:20b`` while downstream agents follow ``llm.default``.

The plumbing was already in place: ``runtime.graph._build_agent_nodes`` calls ``get_llm(cfg.llm, skill.model, role=…)`` for every responsive skill, and ``get_llm`` falls back to ``llm.default`` when ``model`` is ``None``. v1.5-C just uncomments the override on the example incident-management intake skill so any deployment with both keys set (or wherever ``llm.default`` resolves) demonstrates the divergence end-to-end.

Changes

* ``examples/incident_management/skills/intake/config.yaml``: declare ``model: gpt_oss_cheap`` (was a documented but commented-out hint). Comment block updated to reference v1.5-C and explain the resolver path.
* ``src/runtime/config.py``: extend the ``LLMConfig.stub()`` default-models map with stub aliases for ``gpt_oss``, ``gpt_oss_cheap``, ``workhorse``. The skill-validator (``Orchestrator.create``) checks every ``skill.model`` against ``llm.models``; without these aliases the existing test suite would explode the moment intake declares ``model: gpt_oss_cheap``. The aliases route to the same stub provider so behavior is unchanged for stub-mode callers.
* ``tests/test_per_agent_model_dispatch.py`` (new, 2 tests):
  - ``test_build_agent_nodes_passes_skill_model_to_get_llm``: mocks ``runtime.graph.get_llm``, asserts the framework calls it with ``model_name=skill.model`` per skill (intake gets ``"gpt_oss_cheap"``, triage with ``model=None`` gets ``None``).
  - ``test_intake_skill_yaml_has_per_agent_override_uncommented``: pins the YAML so a future refactor can't silently drop the override.

Live verification

``tests/test_integration_driver_s1.py`` exercises the live multi-provider path with ``OLLAMA_API_KEY`` + ``OPENROUTER_API_KEY`` + ``OLLAMA_BASE_URL`` set. It's gated and skipped without them; this is the human verification gate at ``.planning/phases/15-real-llm-tool-loop-termination/15-VERIFICATION.md``.

Test plan

* ``uv run ruff check src/ tests/``: clean
* ``uv run pytest -x``: 1260 passed (was 1258), 7 skipped
* ``uv run pytest --cov=src/runtime --cov-fail-under=85``: 87.08%

🤖 Generated with Claude Code