feat(trace): SequenceDetector primitive - session-window multi-step attack detection#522
Conversation
Adds StepSpec dataclass and SequenceDetector base structure to finbot/ctf/detectors/primitives/. Includes config validation, get_relevant_event_types(), and stubbed private helpers for history querying, step matching, and time-window checks. check_event() and all helpers are NotImplementedError stubs pending implementation.
…gration - Add SequenceDetector to finbot/ctf/detectors/primitives/ Detects multi-step attack patterns across a session or workflow window. Supports ordered step matching, glob event_type patterns, within_n_events and within_seconds windows, and all ToolCallDetector field operators. Challenge authors configure it from YAML with no Python required. - Add composite index idx_ctf_event_session_ts_type on (session_id, timestamp, event_type) to keep session-window history queries below 10ms p95. - Export SequenceDetector from finbot/ctf/detectors/primitives/__init__.py - Add 17 unit tests covering full sequence detection, partial sequences, order enforcement, session/workflow windows, condition operators, and glob event_type matching.
- Add StepSpec TypedDict to sequence_detector.py matching the approved interface spec; export it from primitives __init__ - Add benchmark test: seeds 1,000 CTFEvent rows with composite index, runs check_event 100 times, asserts p95 < 10ms Current result: p50 ~7ms, p95 ~8ms on SQLite
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a new SequenceDetector primitive for multi-step pattern detection, plus database/index support and accompanying tests/benchmarking to validate correctness and query performance.
Changes:
- Introduce
SequenceDetectorwith configurable step matching over session/workflow windows. - Add Alembic migration for a composite
ctf_eventsindex used by session-window queries. - Add unit tests and a SQLite benchmark to catch functional and query-latency regressions; refactor vendor DB session acquisition to use
get_db().
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
finbot/ctf/detectors/primitives/sequence_detector.py |
New detector implementation: config validation, history query, step/condition matching. |
finbot/ctf/detectors/primitives/__init__.py |
Exports SequenceDetector and StepSpec. |
migrations/versions/2026_06_03_add_ctf_event_session_index.py |
Adds composite index intended to speed up session-window queries. |
tests/unit/ctf/test_sequence_detector.py |
Unit coverage for config validation, ordering, windows, and some condition operators. |
tests/unit/ctf/test_sequence_detector_benchmark.py |
Benchmark test for p95 latency of check_event query path. |
finbot/tools/data/vendor.py |
Switches vendor tools from db_session() context manager to get_db() generator pattern. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- _check_condition: AND all operators instead of returning on the first;
conditions like {gte: 10, lte: 20} now work correctly
- within_seconds: return detected=False with clear message when event has
no timestamp, instead of silently falling back to datetime.now()
- Migration: add namespace as leading index column to match actual query
filter shape (WHERE namespace=? AND session_id=?)
- Benchmark: use StaticPool so all connections share the same in-memory DB;
switch to @pytest.mark.asyncio + await instead of get_event_loop();
adjust SQLite p95 limit to 50ms with note that Postgres target is 10ms
- Remove unused DetectionResult import from test_sequence_detector.py
- Fix within_n_events docstring: describes history window size (latest N events loaded for session), not max events between steps - Gate p95 benchmark behind RUN_BENCHMARKS env var so it does not cause flaky failures on CI runners with variable CPU load; run with: RUN_BENCHMARKS=1 pytest
steadhac
left a comment
There was a problem hiding this comment.
Thanks for this — one blocking issue plus two smaller fixes. Requesting changes on the first.
Blocking (lines 132–162): With order_matters=False, search_from stays at 0, so every step scans history from the start independently. A single event matching multiple steps can satisfy step 1 and step 2 on its own — a 2-step sequence fires from one event, which is exploitable in a CTF. Track consumed history indices and skip them for later steps so no two steps claim the same event.
Should fix (line 255): Unknown/typo operators ("eqals", "startswith") match no elif branch and fall through as "always passes" → false positives. Add an else that raises at config-validation time, or logs a warning and returns False.
Minor (lines 193–197): The 10-string frozenset({...}) is rebuilt on every _matches_step call. Move it to a module-level _CTF_COLUMNS constant.
| elif op == "matches": | ||
| if not re.search(expected, str(actual), re.IGNORECASE): | ||
| return False | ||
|
|
There was a problem hiding this comment.
line 255 - If someone writes a typo ("eqals") or an unsupported operator ("startswith"), none of the elif branches match, so the loop just moves on without returning False. The condition is silently treated as "always passes," which can produce false-positive detections.
How to fix it: Add an else clause at the end of the operator chain that either raises a ValueError at config-validation time, or at minimum logs a warning and returns False at match time:
else:
logger.warning("Unknown condition operator %r — treating as no-match", op)
return False
| "event_type", "event_category", "event_subtype", | ||
| "session_id", "workflow_id", "namespace", "user_id", | ||
| "vendor_id", "agent_name", "tool_name", "severity", | ||
| }) |
There was a problem hiding this comment.
line 193 - 197 The frozenset({...}) literal with 10 strings is created fresh every single time _matches_step is called. Since this runs inside the step-matching loop (once per event × once per step in history), it adds unnecessary allocation overhead. It's not a bug, but it's wasteful.
Move it to a module-level constant:
At module level, near the top of the file
_CTF_COLUMNS: frozenset[str] = frozenset({
"event_type", "event_category", "event_subtype",
"session_id", "workflow_id", "namespace", "user_id",
"vendor_id", "agent_name", "tool_name", "severity",
})
| ) | ||
| if order_matters: | ||
| search_from = found_at + 1 | ||
|
|
There was a problem hiding this comment.
Lines: 132–162 When order_matters=False, search_from stays at 0 for every step. This means each step scans the full history independently from the beginning. If a single event matches two different steps (e.g. event_type = "agent.*.tool_call_success" with no conditions), it will satisfy both step 1 and step 2 on its own, making a 2-step sequence trigger from just 1 event. In a CTF, this is exploitable.
Track which history indices have already been consumed and skip them for subsequent steps. Once an event is matched to a step, mark it as consumed so no other step can claim it
- Track consumed history indices in step matching loop so a single event cannot satisfy two steps when order_matters=False; prevents a 2-step sequence from firing on one event - Add else clause in _check_condition for unknown operators: logs a warning and returns False instead of silently treating as always-passes - Move _ctf_columns frozenset to module-level _CTF_COLUMNS constant to avoid rebuilding it on every _matches_step call
Summary
SequenceDetectortofinbot/ctf/detectors/primitives/— the first detector in FinBot that can match attack patterns spanning multiple events across a session or workflow windowStepSpecTypedDict matching the interface approved during community bonding(session_id, timestamp, event_type)onctf_eventsto keep session-window history queries fastThis is the Week 1 deliverable for TRACE (GSoC 2026).
What it does
All 14 existing detectors fire on a single event.
SequenceDetectorqueriesCTFEventhistory to match an ordered sequence of steps within a configurable window. Challenge authors configure it from YAML — no Python required:Files changed
finbot/ctf/detectors/primitives/sequence_detector.pyfinbot/ctf/detectors/primitives/__init__.pySequenceDetector,StepSpecmigrations/versions/2026_06_03_add_ctf_event_session_index.pytests/unit/ctf/test_sequence_detector.pytests/unit/ctf/test_sequence_detector_benchmark.pyTest plan
Notes for reviewers
within_n_eventswindow usesORDER BY timestamp DESC LIMIT nthen reverses — this keeps the query bounded while preserving chronological order for step matchingdetailsJSON first, then fall back to named CTFEvent columns (tool_name,agent_name, etc.) — this avoids ambiguity between payload fields and model attributesWHERE session_id = ? ORDER BY timestamp ASC;event_typeis included for potential index-only scans