Replace _normalize_attack_results() duck-typing with a typed AttackResult model #399

@franconicola

Description

Problem. attacks/orchestrator.py flattens heterogeneous return values by probing for .evaluated, .rows, .results, .data, and .items in turn. Any new technique that names its result field differently will be silently mis-normalized, with no error raised.
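To make the failure mode concrete, here is a minimal sketch of attribute probing. It is a hypothetical reconstruction, not the actual body of _normalize_attack_results(): the helper name, the list check, and the wrap-the-object fallback are all assumptions made for illustration.

```python
from types import SimpleNamespace

# Attribute names the issue says the orchestrator probes, in order.
_CANDIDATE_FIELDS = ("evaluated", "rows", "results", "data", "items")


def normalize_by_duck_typing(raw):
    """Hypothetical stand-in: return the first candidate attribute that is a list."""
    for name in _CANDIDATE_FIELDS:
        value = getattr(raw, name, None)
        if isinstance(value, list):
            return value
    # Assumed fallback: an unrecognized shape is wrapped as-is and nothing is raised.
    return [raw]


# A technique that uses one of the probed names is flattened as intended.
known = SimpleNamespace(rows=[{"prompt": "p1"}, {"prompt": "p2"}])
# A new technique that calls its field "outputs" slips through the fallback.
renamed = SimpleNamespace(outputs=[{"prompt": "p1"}, {"prompt": "p2"}])

normalized_known = normalize_by_duck_typing(known)
normalized_renamed = normalize_by_duck_typing(renamed)

print(len(normalized_known))    # two rows recovered
print(len(normalized_renamed))  # one opaque wrapper object instead of two rows
```

In this sketch the renamed technique produces a list of the wrong shape rather than an exception, so the problem only surfaces downstream, if at all.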

Actions.

  • Define AttackResult (Pydantic v2, frozen) in attacks/types.py with fields: goal, prompt, response, evaluations: list[Evaluation], metadata: dict, etc.
  • Have BaseAttack.run() declare a return type of list[AttackResult].
  • Update each of the 10 techniques in attacks/techniques/ to return the typed model.
  • Delete _normalize_attack_results().
  • Add a unit test per technique asserting return type.
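The actions above could look roughly like the following sketch, assuming Pydantic v2 is installed. The fields of Evaluation, the run(goal) signature, and the EchoAttack technique are hypothetical placeholders, since the issue only names the top-level AttackResult fields.

```python
from abc import ABC, abstractmethod
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Evaluation(BaseModel):
    """Hypothetical shape; the issue does not define Evaluation's fields."""

    model_config = ConfigDict(frozen=True)

    evaluator: str
    score: float


class AttackResult(BaseModel):
    """Frozen result model with the fields named in the issue."""

    model_config = ConfigDict(frozen=True)

    goal: str
    prompt: str
    response: str
    evaluations: list[Evaluation] = Field(default_factory=list)
    metadata: dict[str, Any] = Field(default_factory=dict)


class BaseAttack(ABC):
    @abstractmethod
    def run(self, goal: str) -> list[AttackResult]:
        """Assumed signature; the real run() may take different arguments."""


class EchoAttack(BaseAttack):
    """Hypothetical technique used only to exercise the types."""

    def run(self, goal: str) -> list[AttackResult]:
        return [AttackResult(goal=goal, prompt=goal, response="stub response")]


results = EchoAttack().run("example goal")

# Per-technique check in the spirit of the last action item.
assert all(isinstance(r, AttackResult) for r in results)

# Frozen models reject mutation after construction.
try:
    results[0].response = "tampered"
    mutated = True
except ValidationError:
    mutated = False

# A misnamed field now fails at construction time instead of passing silently.
try:
    AttackResult(goal="g", prompt="p", outputs="wrong field name")
    constructed = True
except ValidationError:
    constructed = False
```

With a declared return type, a technique that builds its results incorrectly fails validation at the point of construction, and mypy or pyright can flag mismatched return types on the orchestrator path without any normalization step.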

Acceptance: _normalize_attack_results removed; mypy/pyright clean on the orchestrator path.

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request
