chore(release): v1.1.0 #49

Merged
liamadale merged 112 commits into main from dev
Jun 14, 2026

@liamadale

Owner

Promotes 103 commits accumulated on `dev` since v1.0.0.

See CHANGELOG.md for the full release notes.

Highlights

  • Phase 1 incident correlator + MCP `investigate` tool + web investigate pages
  • Public IP profiling integrated into web device cards
  • New Mantis Explorer app; `mantis_web` renamed to `threat_model`
  • Hub redesign with settings page, 8 community themes, live data freshness
  • 12 new MCP OpenSearch tools (investigate, histogram, aggregate, bulk_enrich_ips, count, FP filter mgmt, ...)
  • Querier perf: asyncio.gather overview, ETag/single-flight dedup, filter-loader mtime cache
  • OpenSearch mapping-drift fix for `terms` aggs across rolled-over indices

Breaking changes

Internal-only — no external consumers, but worth flagging for the bisect record:

  • Package rename: `mcp/` → `mcp_servers/` (namespace collision with upstream)
  • App rename: `mantis_web` → `threat_model`

Test plan

  • CI passes (pytest + ruff blocking on PR to main)
  • Smoke-test `python run_all.py` and visit each app
  • Verify MCP servers load

… search bar

Extend the query backend and web form to support filtering by destination
IP alongside the existing src_ip filter.

- build_base_query gains a dest_ip_filter param that appends a
  destination.ip term clause when set
- run_query guards dest_ip_filter the same way src_ip is guarded:
  only applied to modules that declare destination.ip in SOURCE_FIELDS
- build_search_params_from_request reads the dest_ip form value
- base.html search bar exposes a Dst IP text input

Zeek modules emit "—" (U+2014) as src_ip when no source IP is present
in a record. run_cross_protocol_query was aggregating these into a
phantom "—" row in the cross-protocol overview matrix.

Extend the IP guard to also skip em-dash values alongside empty strings.
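
A minimal sketch of the guard described above, assuming records are plain dicts with a `src_ip` key. The names `is_real_ip` and `count_by_ip` are hypothetical and only illustrate the idea; they are not the project's functions.

```python
EM_DASH = "\u2014"  # placeholder the Zeek modules emit when no source IP is present


def is_real_ip(value):
    """Return False for None, empty strings, and the em-dash placeholder."""
    if value is None:
        return False
    text = str(value).strip()
    return text not in ("", EM_DASH)


def count_by_ip(records):
    """Aggregate hit counts per source IP, skipping placeholder values."""
    counts = {}
    for record in records:
        ip = record.get("src_ip")
        if not is_real_ip(ip):
            continue  # would otherwise become a phantom row in the matrix
        counts[ip] = counts.get(ip, 0) + 1
    return counts
```

With this guard, records carrying the placeholder never reach the aggregation step, so no phantom row is produced.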
…and fix sticky-column rendering

Sidebar navigation:
- Add always-visible Overview link (cross-protocol matrix) to sidebar
- Replace Hub home link with brand logo that routes to Hub when mounted
  under a script name, or to Overview when running standalone
- Polish collapsed sidebar: hide scrollbar, center icons, apply category
  colour accents to group headers, add .sidebar-overview highlight class
- Update category icons (alerts, network, web, remote, auth, messaging)
- Reorder MODULES so suricata_alert appears before weird in the registry

Filter persistence:
- Replace localStorage-based sensor/time_range persistence with
  sessionStorage so each browser tab maintains independent filter state
- Snapshot all filter keys (time_range, sensor, src_ip, dest_ip,
  direction, public_only, limit, min_risk) on form submit; restore on
  link navigation
- Remove inline onchange handler from the time_range select

CSS cache-busting:
- Compute pisces.css version from file mtime at app startup and inject
  as a ?v= query param in the stylesheet link

Sticky-column rendering:
- Raise col-ip-addr z-index to 20 (top-left corner) and add explicit
  top+left sticky declarations to thead .col-ip-addr and thead .col-total
  so both axes stick reliably at the corner
- Remove drop shadows from col-ip-addr and col-total; keep only the
  inset separator line
- Remove z-index from the shared thead rule to avoid clobbering the
  per-column corner values

Introduces djLint as a dev dependency and wires it into both pre-commit
and the GitHub Actions CI workflow, following the same advisory-on-dev /
blocking-on-PR-to-main pattern used by the existing ruff checks.

- djlint-jinja pre-commit hook lints Jinja templates on every commit
- CI: "Lint HTML (djlint check)" runs blocking on PRs to main,
  advisory on pushes to dev (mirrors ruff check behaviour)
- CI: "Format check HTML (djlint format)" runs advisory on all triggers
- [tool.djlint] config added to pyproject.toml: jinja profile,
  100-char line limit, H021/H023/H030/H031 suppressed with rationale

Resolves all djlint warnings to reach 0 errors across 32 HTML files:

- T003: added block names to all bare endblock tags (21 occurrences)
  across dashboard_web, mantis_web, and opensearch_web templates
- H006: added height/width attributes to brand logo img tags in 4
  base templates
- H025: added closing </option> tags to datalist options in
  filter_form.html
- H014: removed extra blank lines in opensearch_web base.html and
  record_detail.html
- T032: removed extra whitespace in Jinja set tags in ticket_detail.html
  and record_detail.html
- H029: lowercased form method="GET" to method="get" in opensearch_web
  base.html
- H020: replaced empty <span></span> with <span>&nbsp;</span> in
  threat_card.html

Also adds J018 to the djlint ignore list in pyproject.toml — cross-app
internal links cannot use url_for() in a multi-app Flask setup.
…n tickets

Replace the single loose _ESCALATION_RE pattern with a two-stage function
_note_is_escalation() that eliminates false positives:

- Stage 1: matches past-tense client-contact phrases (informed/notified the
  client, let the client know, reached out to the client, etc.) and skips
  any match whose 20-char prefix contains "will" (future intent).
- Stage 2: matches past-tense "escalated [this/it] to [the] client" and
  skips matches prefixed with "not" or "won't" (negated intent).

Previously, bare "escalat*" triggered on "privilege escalation", conditional
futures ("will let the client know"), and negations ("not going to escalate
to the client"). The two-stage approach targets only confirmed past-action
phrases.
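
The two-stage check could look roughly like the sketch below. The exact regular expressions are not shown in this commit message, so the patterns here are assumptions reconstructed from the description, as is the 20-character prefix window used in stage 2 (the message only states that window for stage 1). The function name `note_is_escalation` is a stand-in for the private `_note_is_escalation()`.

```python
import re

# Stage 1: past-tense client-contact phrases (assumed pattern).
_CONTACT_RE = re.compile(
    r"\b(?:(?:informed|notified)\s+the\s+client"
    r"|let\s+the\s+client\s+know"
    r"|reached\s+out\s+to\s+the\s+client)\b",
    re.IGNORECASE,
)

# Stage 2: past-tense "escalated [this/it] to [the] client" (assumed pattern).
_ESCALATED_RE = re.compile(
    r"\bescalated\s+(?:(?:this|it)\s+)?to\s+(?:the\s+)?client\b",
    re.IGNORECASE,
)

_NEGATION_RE = re.compile(r"\b(?:not|won't)\b", re.IGNORECASE)
_PREFIX_CHARS = 20  # look-behind window checked for future or negated intent


def _prefix(text, start):
    """Return the lowercase text immediately preceding a match."""
    return text[max(0, start - _PREFIX_CHARS):start].lower()


def note_is_escalation(text):
    """Return True only for confirmed past-action escalation phrases."""
    for match in _CONTACT_RE.finditer(text):
        if "will" not in _prefix(text, match.start()):
            return True
    for match in _ESCALATED_RE.finditer(text):
        if not _NEGATION_RE.search(_prefix(text, match.start())):
            return True
    return False
```

Because neither pattern matches the bare stem "escalat", a note that mentions "privilege escalation" never reaches the prefix checks at all.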

_normalize_issue() now exposes is_escalated and escalated_by on every
normalised ticket dict. activity_report._ticket_ref() propagates is_escalated
into StudentStats.created_tickets. student_activity adds an "Escalated"
column to the summary table, shows a per-student count in the detail view,
supports --org / --since / --until CLI flags, and refactors the graph helper
into a reusable _plot_ticket_timeline() shared by per-student and per-org
views.
…loration

New app at /mantis-explorer (port 5003 standalone via apps/mantis_explorer/run.py)
that provides a browser-based view of the data from student_activity/activity_report.

Key capabilities:
- Institution overview table with ticket counts, escalation counts, and date ranges
- Per-institution student breakdown with sortable activity table
- Per-student slide panel showing created tickets (with escalated row tinting and
  red exclamation icon) and notes, plus a ticket detail view with an "Escalated"
  badge in the meta row when is_escalated is true
- Resizable ticket slide panel with drag-to-resize handle; width persisted to
  localStorage across page loads
- Charts: submission timeline, org bar chart, and escalation breakdown
- Date-range filtering propagated from the URL query string
- Dark SOC-analyst CSS theme consistent with the rest of the PISCES UI (me.css)

Mount the new mantis-explorer app at /mantis-explorer in the
DispatcherMiddleware and add a hub card linking to it with a
fa-user-graduate icon.

Also bumps locked dependency versions: rich 14→15, ruff 0.15.6→0.15.12,
pre-commit 4.5.1→4.6.0, geoip2 >=4.8.0→>=5.2.0, pytest >=9.0.3.

app.py imported TICKETS_BY_ID from apps.mantis_explorer.data, but
that symbol was never re-exported there, causing an ImportError on
startup. Adds the import alongside the existing _raw_tickets alias.

Moving match clauses from `must` to `filter` context skips relevance
scoring, which improves query performance and enables result caching
at the shard level in OpenSearch.

Added `?timeout=30s` to the query URL so long-running queries fail
fast instead of hanging indefinitely. The filter-context change is
applied consistently across `build_base_query`, `list_sensors`, and
`list_log_types`.
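
As an illustration of the filter-context change, the sketch below builds a query body whose clauses all sit under `bool.filter`. It is a simplified stand-in, not the project's `build_base_query`: the function name, its parameters, and the `source.ip` field name are assumptions made for the example.

```python
def build_filter_query(time_range="24h", src_ip=None, dest_ip=None):
    """Build a bool query whose clauses all live in filter context.

    Clauses under `filter` are not scored, which is what lets the
    engine skip relevance computation for them.
    """
    filters = [{"range": {"@timestamp": {"gte": f"now-{time_range}"}}}]
    if src_ip:
        filters.append({"term": {"source.ip": src_ip}})  # field name assumed
    if dest_ip:
        filters.append({"term": {"destination.ip": dest_ip}})
    return {"query": {"bool": {"filter": filters}}}
```

A test that previously asserted on `body["query"]["bool"]["must"]` would need to look under the `filter` key instead, which is the kind of update the follow-up test commit describes.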

Five tests were checking query structure under the `must` key. Now
that `build_base_query` emits clauses under `filter`, the assertions
are updated to match.
…n every query

`load_filters()` was re-reading and re-parsing every YAML file in
`filters/` on each call. A module-level dict now caches the result
keyed by `(filters_dir, municipality)`. The cache is invalidated
whenever the highest mtime across all YAML files changes, so filter
edits are still picked up without a process restart.
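
A rough sketch of an mtime-invalidated cache along these lines. The real `load_filters()` parses YAML; to stay dependency-free this example takes the parser as a `parse` callable, and the name `load_filters_cached` is hypothetical.

```python
from pathlib import Path

# (filters_dir, municipality) -> (highest mtime seen, parsed result)
_CACHE = {}


def _max_mtime(filters_dir):
    """Return the highest modification time across all YAML files."""
    mtimes = [p.stat().st_mtime_ns for p in Path(filters_dir).rglob("*.yaml")]
    return max(mtimes, default=0)


def load_filters_cached(filters_dir, municipality, parse):
    """Return the cached parse result unless a YAML file changed on disk."""
    key = (str(filters_dir), municipality)
    stamp = _max_mtime(filters_dir)
    cached = _CACHE.get(key)
    if cached is not None and cached[0] == stamp:
        return cached[1]  # nothing changed: skip re-reading and re-parsing
    result = parse(filters_dir, municipality)
    _CACHE[key] = (stamp, result)
    return result
```

One trade-off of keying on the highest mtime alone is that deleting a file that was not the most recently modified one does not change the stamp; whether the project handles that case is not stated here.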
…ild_base_query

`module.build_extra_must()` returns a `(clauses, extra_data)` tuple.
`--dump-query` was passing the full tuple as `extra_must`, so the
generated query body contained a tuple instead of a list of ES
clauses. Destructure with `extra_must, _` to extract only the clauses.
…on handler

The bare `except Exception` in `run_cross_protocol_query` was silently
swallowing per-protocol failures, making it impossible to diagnose
which protocol was erroring and why. Now logs the failing protocol
name and exception message via the Rich console.

get_tickets_for_ip() was iterating over all raw tickets on every request
to filter by IP membership. It now uses the pre-built TICKETS_BY_IP and
TICKETS_BY_ID dicts for O(1) lookup.

Startup row lists (MALICIOUS_ROWS, FP_ROWS, INFRA_ROWS, DNS_RESOLVER_ROWS,
UNDETERMINED_ROWS) are now pre-sorted at module load time by each table's
default sort key, so repeated calls hit Timsort's O(n) already-sorted fast
path instead of a full O(n log n) sort on every request.

mantis_index.py: wrap all API calls in a requests.Session for connection
pool reuse. When total_count is known upfront, remaining pages are now
fetched concurrently via ThreadPoolExecutor(max_workers=8) and assembled
in page order for a stable index. Falls back to sequential pagination only
when total is unknown and no page cap is set.
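
The concurrent-pagination idea can be sketched as follows. To keep the example self-contained it takes a `fetch_page` callable instead of issuing HTTP requests; that callable, its `(items, total_count)` return shape, and the name `fetch_all` are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(fetch_page, page_size, max_workers=8):
    """Fetch page 1 to learn the total, then the remaining pages concurrently.

    `fetch_page(n)` is assumed to return `(items, total_count)` for the
    1-based page number `n`.
    """
    first_items, total = fetch_page(1)
    page_count = -(-total // page_size)  # ceiling division
    remaining = range(2, page_count + 1)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the pages are
        # assembled in page order regardless of completion order.
        rest = list(pool.map(lambda n: fetch_page(n)[0], remaining))
    items = list(first_items)
    for page in rest:
        items.extend(page)
    return items
```

Relying on the ordering guarantee of `Executor.map` is one way to get a stable index; collecting futures and sorting by page number would work equally well.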

mantis_search.py: switch search_via_api() to requests.Session so all
pages within a single search share one TCP connection.

Also removes student_activity.py — a 795-line near-identical duplicate
of activity_report.py with no external callers.

Four cases: normal lookup returns tickets sorted newest-first, empty IP
returns an empty list, missing ticket IDs in TICKETS_BY_ID are silently
skipped, and results are stable when multiple tickets share a created_at.

Add a new Tickets tab backed by a tickets Blueprint that surfaces
escalation rates, institution breakdowns, and ticket volume timeline
from mantis_explorer data.

Add a sensor browser modal (satellite-dish button) that fetches
hedgehog sensor activity via a new /api/dashboard/sensors endpoint,
renders a proportional bar list, and threads the selected sensor
filter through the OpenSearch and overview sections. A badge on the
toolbar button shows how many sensors are active.

Register the tickets Blueprint in app.py and add the
/api/dashboard/sensors route that serves sensor_summary.html content.

In dashboard.html: add the Tickets tab button, sensor-badge button
with JS state (_selectedSensors, getSensorParam), date picker row
that appears for Mantis and Tickets tabs, and openSensorModal /
applySensorSelection helpers.

In base.html: add sensor-modal markup (backdrop + panel), CSS for
.sensor-badge and #sensor-btn, pill-style sub-tab bar, uniform
design tokens (--radius-*, height:28px inputs, rounded toolbar),
and dark/light color-scheme for native date inputs.
…a charts

Remove agg_opensearch_protocols (single-window terms agg on
event.dataset) and replace with three date_histogram aggregations:
agg_notice_over_time, agg_suricata_over_time, and
agg_conn_volume_over_time. All three accept an optional sensors filter
and select an appropriate interval via _interval_for_range.

Add agg_suricata_alert_count (excluding SURICATA STREAM* noise),
agg_new_ips_delta (current vs previous window unique public IPs), and
parse_sensors (comma-separated sensor param → list|None).

Thread the sensor param through the opensearch section route and
include it in the cache key.
…nify to hbars

Remove panels that added noise without actionable insight: DNS query
types and rcodes, HTTP methods, SSL versions and cipher suites, and
conn_states. Remove the donut chart renderer entirely — all remaining
charts use the horizontal bar (hbar) helper for visual consistency.

In section.html, replace the protocol breakdown bar chart with three
stacked area charts (notices, Suricata alerts, connection volume) using
the new over-time data from the aggregation layer.
…workqueues

Replace the verdict donut + summary table with an alert trend area
chart (Zeek Notices and Suricata Alerts on one axis) and two new
actionable tables: "IPs Without Tickets" and "Untriaged High-Activity
IPs", both derived from the cross-protocol query.

agg_cross_source_ips now returns untriaged, no_ticket, alerts_no_ticket,
and total_ips alongside the main opensearch list, and exposes per-IP
notice and suricata hit counts (from per_protocol in run_cross_protocol_query).
agg_overview fetches notice/suricata over-time and agg_new_ips_delta
in parallel and propagates the sensor filter to all concurrent tasks.

The section route and cache key are updated to include the sensor param.
…ts panel

Replace the time_range query param with since/until date pickers so
analysts can filter by an explicit calendar window rather than a
rolling OpenSearch interval (Mantis data is indexed by created_at, not
@timestamp).

Add _filter_tickets and _filter_malicious helpers that apply
since/until bounds to _raw_tickets and MALICIOUS_ROWS respectively.
All public aggregation functions (agg_mantis_attack_types,
agg_mantis_timeline, agg_mantis_top_ips) now accept and apply these
filters.

Remove agg_mantis_blocklists — the blocklist sources chart was noisy
and rarely actionable; its space is given to an enlarged timeline chart.

Replace stale Mantis and Kibana screenshots with current Dashboard
and Threat Model images. Update the app table to list all four apps —
OpenSearch, Threat Model, Dashboard, and Mantis Explorer — with
accurate descriptions matching the renamed panels.

The app's scope has always been broader than Mantis ticket browsing:
it covers threat modelling, IP verdict classification, FP/TP tracking,
and blocklist analysis. The old name caused confusion and leaked an
implementation detail (Mantis) into the app identity.

- Rename apps/mantis_web/ → apps/threat_model/ (all files)
- Rename mantis_web_run.py → threat_model_run.py and update shim path
- Change dispatcher mount from /mantis to /threat-model in run_all.py
- Update hub card link and all cross-app imports to apps.threat_model
- Rename tests/test_mantis_web_helpers.py → test_threat_model_helpers.py
- Update docs/advanced-usage.md standalone launcher command
…ness

Replace the card grid with a compact list layout for better scanability.
Add a timestamp row showing how recently the ticket index and threat model
were updated, read directly from local data files, with no API calls required.

The per-app shims (opensearch_web_run.py, threat_model_run.py,
dashboard_web_run.py) are superseded by run_all.py. Remove them
and drop the standalone launch examples from advanced-usage.md.

Remove the --retrain --classify-stats --use-ml flags from the
command snippet. The plain invocation is the correct default
and the flags were stale guidance.
liamadale added 21 commits May 11, 2026 21:58
….yaml

Added _load_categories() which parses filters/categories.yaml into a
{category: set(subcategories)} registry. load_filters() now checks each
filter file's category and subcategory fields against this registry and
appends a descriptive error for any stale or unrecognised values.

This catches broken references at load time rather than silently
accepting filter files that reference removed categories.

Added 14 pisces-* console scripts to [project.scripts] covering all CLI
tools and standalone web servers, so installed users can launch them
without knowing the module path. Added main() callables to all run.py
launchers and run_all.py to satisfy the entry point protocol.

Dependencies: httpx>=0.28.1 (sync + async OpenSearch client),
flask[async] (brings in asgiref for async view support).
…ming

prewarm_enrichment_cache(ips, max_ips=50) filters out already-cached and
private IPs, then submits the remainder to a daemon ThreadPoolExecutor
(max 4 workers) as fire-and-forget enrich_ip() calls. Errors are
suppressed — pre-warming is best-effort and must never block callers.

The 50-IP cap prevents hitting API rate limits on large result sets.
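
A best-effort pre-warming helper might be sketched like this. The cache and the enrichment function are passed in explicitly so the example runs on its own; the name `prewarm`, those parameters, and the choice to apply the cap after filtering are assumptions. The sketch also uses a plain `ThreadPoolExecutor` rather than reproducing the daemon behaviour mentioned above.

```python
import ipaddress
from concurrent.futures import ThreadPoolExecutor

_EXECUTOR = ThreadPoolExecutor(max_workers=4)


def prewarm(ips, cache, enrich, max_ips=50, executor=_EXECUTOR):
    """Queue background enrichment for uncached public IPs and return at once."""
    pending = []
    for ip in ips:
        if ip in cache or ip in pending:
            continue
        try:
            if ipaddress.ip_address(ip).is_private:
                continue
        except ValueError:
            continue  # not an IP address at all
        pending.append(ip)
        if len(pending) >= max_ips:
            break  # cap assumed to apply to the filtered remainder

    def _safe_enrich(ip):
        try:
            cache[ip] = enrich(ip)
        except Exception:
            pass  # pre-warming is best-effort; errors are suppressed

    # Futures are returned for testing; a caller can simply ignore them.
    return [executor.submit(_safe_enrich, ip) for ip in pending]
```

The caller never waits on the returned futures in normal use, which keeps the request path unaffected by slow or failing enrichment lookups.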
…w route

run_cross_protocol_query_async uses asyncio.gather across all IP-capable
log types with a shared httpx.AsyncClient, replacing the previous
ThreadPoolExecutor approach. This avoids per-request thread creation and
allows the connection pool to be reused across the concurrent sub-queries.

The overview route is now async and calls prewarm_enrichment_cache after
results arrive, so enrichment data is ready in the background by the
time the user clicks an IP.
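
The fan-out pattern can be illustrated without a live cluster. In the sketch below `query_one` stands in for a coroutine that would issue one sub-query over a shared `httpx.AsyncClient`; that callable and the name `gather_protocols` are hypothetical, and failures are logged per protocol rather than swallowed.

```python
import asyncio
import logging

log = logging.getLogger("cross_protocol_sketch")


async def gather_protocols(log_types, query_one):
    """Run one sub-query per log type concurrently and merge the results."""
    results = await asyncio.gather(
        *(query_one(log_type) for log_type in log_types),
        return_exceptions=True,  # one failing protocol must not sink the rest
    )
    merged = {}
    for log_type, result in zip(log_types, results):
        if isinstance(result, Exception):
            log.warning("query for %s failed: %s", log_type, result)
            continue
        merged[log_type] = result
    return merged
```

`asyncio.gather` returns results in the order the awaitables were passed, so zipping them back against `log_types` is safe.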

Three modules — mantis_index, mantis_search, and activity_report —
each imported urllib3 and called disable_warnings independently.
Moved the single call into src/mantis/__init__.py so it fires once
on package import, removing the duplication without changing behavior.

cryptography had no import anywhere in src/, apps/, or mcp/; it was
an accidental transitive dependency that snuck into the explicit list.

geoip2 is only lazily imported inside ticket_enrichment/offline.py with
a graceful-degradation fallback, so it does not belong in the mandatory
dependency list. Moved it to the offline-enrichment optional extra (and
all) alongside pyasn, which has the same usage pattern. Updated the
extra's comment to document the GeoLite2-City.mmdb requirement as well.
…t-filtering

_apply_dest_ip_filter() was running as a client-side pass over already-
truncated result sets, meaning queries with a dest_ip constraint and a
limit of N would silently return fewer than N records — or none — even
when matching documents existed in the index.

Fix: pass dest_ip through _base_params() so it reaches run_query() and
becomes a bool.filter term clause in the Elasticsearch request body,
consistent with how src_ip has always been handled. Remove
_apply_dest_ip_filter and all 16 call sites across every protocol tool.

Two regression tests added to tests/test_zeek_base.py:
- dest_ip_filter alone appears in the ES filter clauses
- src_ip_filter and dest_ip_filter coexist correctly in the same query
… absolute timestamps

Three Tier 2 query capability improvements wired from the MCP tool
signatures down through builder.py and runner.py to the ES query:

- §4.3: dest_port/src_port/proto filters added to search_conn; dest_port
  added to search_http, search_ssl, search_rdp, search_smb, search_ssh.
  Values are passed through to build_base_query as term/terms clauses
  on destination.port, source.port, and network.transport. Port and
  proto filters are only applied when the module's SOURCE_FIELDS
  includes the corresponding ES field.

- §4.4: src_ip, dest_ip, and sensor now accept list[str]. A list
  generates a terms clause; a scalar string retains the existing
  term clause. sensor list bypasses the legacy comma-split path.

- §4.6: time_from/time_to absolute ISO 8601 timestamps exposed on all
  16 per-protocol search tools. When provided they override time_range
  and are passed directly to the existing range clause in build_base_query.

Tests added for all three features in tests/test_zeek_base.py.
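
The scalar-versus-list behaviour from §4.4 amounts to choosing between a `term` and a `terms` clause. A minimal sketch, where the helper name `ip_clause` is hypothetical:

```python
def ip_clause(field, value):
    """Return a term clause for a scalar, a terms clause for a list.

    Returns None when no filter value was supplied.
    """
    if value is None or value == "" or value == []:
        return None
    if isinstance(value, (list, tuple)):
        return {"terms": {field: list(value)}}
    return {"term": {field: value}}
```

For example, `ip_clause("destination.ip", ["8.8.8.8", "1.1.1.1"])` yields a single `terms` clause covering both addresses, while a plain string keeps the original `term` shape.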
…ysis

Introduces query_histogram() in src/querier/histogram.py as a shared
helper that runs a date_histogram ES aggregation over any Zeek log type.
It reuses build_base_query() and load_with_remap() so all existing
filters, sensor lists, src/dest IP filtering, and absolute timestamp
support carry over automatically.

- src/querier/histogram.py: core helper, returns [{key, key_as_string,
  doc_count}] buckets; empty list on connection failure
- src/querier/histogram_cli.py: pisces-histogram CLI with Unicode block
  bar chart (_render) and sparse time-axis labels (_time_axis);
  supports --interval, --time-range, --time-from/to, --src-ip,
  --dest-ip, --sensor, --no-filters
- mcp/opensearch/server.py: histogram() MCP tool (§4.9) wrapping the
  core helper; returns JSON with bucket_count, total_events, buckets
- pyproject.toml: register pisces-histogram entrypoint
- tests/test_histogram.py: 11 tests covering _render edge cases,
  _time_axis, and query construction (datasets, None response, list
  src_ip → terms clause, absolute timestamps)
…ggregate tools

Replace the three separate pivot_ip, profile_device, and investigate MCP
tools with a single pivot(ip, mode, ...) tool. mode="records" covers the
former pivot_ip behaviour, mode="profile" the former profile_device, and
mode="incident" the former investigate (requires dest_ip). This reduces
surface area for AI assistants and enforces the dest_ip guard at the tool
boundary rather than relying on callers to invoke the right tool.

Replace aggregate_by_source_ip with a generic aggregate(field, ...) tool
that accepts any ES field name and optional log_type/notice_type filters,
making it a general-purpose frequency-ranking primitive instead of a
notice-specific one.

Implements §2.1 and §2.2 from the MCP improvements plan (Tier 4).

Test suite updated to match new signatures; two new guard tests added for
the incident mode validation paths, and three new tests cover aggregate.
…llision

Python resolved `from mcp.server.fastmcp import FastMCP` against the
local `mcp/` directory instead of the installed `mcp` package, causing
a silent import failure and all MCP-dependent tests to be skipped.

- Rename `mcp/` → `mcp_servers/` (7 files) to eliminate the shadowing
- Update ruff per-file-ignores glob in pyproject.toml
- Update ruff check, ruff format, and bandit scan paths in ci.yml
- Fix `_MCP_SERVER_PATH` in tests/test_correlator.py to reflect new dir
- Fix aggregate test patches to use `patch.object(mcp_server, ...)` on
  the server module's namespace instead of the base module, since
  `query_opensearch` is imported by name into the server module

Adds a reusable `delete_ip_from_filter(path, ip)` function that removes
must_not clauses matching an IP from a YAML filter file. Handles both
single-value `term` and multi-value `terms` clause shapes; for `terms`
with multiple IPs, only the matching IP is removed and the clause is kept
with the remainder.

Raises FileNotFoundError if the filter file is absent and ValueError if
no clauses matched the given IP, so callers get actionable errors.
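
The clause-level logic might be sketched as below. It works on an in-memory list of clauses rather than a YAML file, so the file handling and `FileNotFoundError` path are left out; the name `remove_ip_from_clauses` is hypothetical, and each `terms` clause is assumed to carry a single field.

```python
def remove_ip_from_clauses(clauses, ip):
    """Return a new clause list with `ip` removed from term/terms clauses."""
    kept = []
    removed = 0
    for clause in clauses:
        if "term" in clause:
            if ip in clause["term"].values():
                removed += 1
                continue  # drop the whole single-value clause
        elif "terms" in clause:
            field, values = next(iter(clause["terms"].items()))
            if ip in values:
                removed += 1
                remainder = [v for v in values if v != ip]
                if remainder:
                    # keep the clause with only the non-matching IPs
                    kept.append({"terms": {field: remainder}})
                continue
        kept.append(clause)
    if removed == 0:
        raise ValueError(f"no clauses matched {ip}")
    return kept
```

Returning a new list instead of mutating in place makes it easy to write the result back only after the `ValueError` check has passed.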
…lete_fp_filter tools

Exposes three new MCP tools for read/delete access to FP filter files:

- list_filter_categories: returns the full category → subcategory map from
  categories.yaml so callers know valid inputs before creating filters
- list_fp_filters: returns a summary of all filter files, or all files in a
  category, or the full clause list for a specific category/subcategory
- delete_fp_filter: removes every must_not clause matching an IP address from
  a given filter file, delegating to the new delete_ip_from_filter backend

Also adds load_categories and load_filter_file to the fp_manager import set
and covers all three tools with 9 regression tests in
tests/test_fp_filter_tools.py.
…and weird_name filters

Both modules previously used a hard `term` clause, which required exact
matches. Consumers now pass glob patterns (e.g. "Scan::*", "bad_*") and
the module selects `wildcard` vs `term` automatically based on whether
the value contains `*` or `?`.
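
A short sketch of the dispatch, assuming a helper that takes the field name and the user-supplied value (the name `name_clause` is hypothetical):

```python
def name_clause(field, value):
    """Pick a wildcard clause for glob patterns, a term clause otherwise."""
    if "*" in value or "?" in value:
        return {"wildcard": {field: {"value": value}}}
    return {"term": {field: value}}
```

So `"Scan::*"` produces a `wildcard` clause, while an exact name such as `"Scan::Port_Scan"` keeps the cheaper `term` form.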
…y dispatch

Covers both NoticeModule and WeirdModule: exact strings produce a
`term` clause, patterns containing `*` or `?` produce a `wildcard`
clause, and an empty params dict returns no clauses.
…uncated flag on all search results

- _search_result() helper now returns a `truncated: bool` field on
  every search response so callers can tell whether the result set was
  capped by the limit parameter, without inspecting the count
- bulk_enrich_ips() enriches a list of IPs concurrently (up to 5
  workers), skips RFC-1918 addresses, and preserves input ordering
- count() hits the _count endpoint so callers can check event volume
  or confirm an IP is active without paying the cost of fetching
  records; supports the same time/sensor/IP filters as search tools
- Updated search_notice and search_weird docstrings to advertise
  wildcard support added to the query layer
…atch

The zeek.notice.note and rule.name fields are mapped as text on rolled-over
write indices, so a term query for "Scan::Port_Scan" never matches — the
analyzer tokenizes it to ["scan","port_scan"]. Switching to the .keyword
subfield makes term and wildcard queries match correctly across indices.

The get_notice_summary aggregation was unaffected because it reads _source
via a Painless script, bypassing the mapping.

Old indices map certain Zeek fields (e.g. zeek.notice.note,
rule.name, event.dataset) as `keyword`. The rolled-over write index
maps the same fields as `text` + `.keyword` subfield. A native
`terms` aggregation sent over the wildcard index pattern hits shards with
both mappings and fails on the write-index shards.

Introduce `source_terms_script(field)` in `src/querier/builder.py`.
It returns a Painless script object that walks `_source` segment by
segment, making the bucketing immune to mapping type. Swap every
affected `"field": ...` call site in `apps/opensearch_web/app.py`
and `mcp_servers/opensearch/server.py` to use the script form.
Re-export through `src/querier/zeek_modules/base.py` so existing
importers continue to work without touching their import paths.

Three tests in `tests/test_zeek_base.py` cover single fields,
dotted paths, and list-valued fields.

Script-based terms aggregations (_source Painless scripts) run
noticeably slower than doc-value aggs. The previous 30s ceiling
caused timeouts on busy clusters. 60s gives enough headroom for
both sync and async clients without being unreasonably permissive.

Bump version to 1.1.0, add CHANGELOG.md (Keep-a-Changelog format,
synthesised from 103 commits since v1.0.0), and wire up discovery
paths so the log is easy to find:

- README.md: one-line pointer to CHANGELOG.md under Contributing
  and security section
- Hub footer: version string now links to CHANGELOG.md on GitHub

Notable breaking-ish changes called out in the changelog that future
bisects should be aware of:
- mcp/ directory renamed to mcp_servers/
- mantis_web app renamed to threat_model

Reconciles dev with main's 4 Dependabot bumps and the v1.0.0 squash-merge
(commit 81909ca). Conflicts in 9 files all resolved as "ours" — dev's
content post-dates the v1.0.0 squash snapshot in every case.

Conflicts resolved (--ours):
- pyproject.toml (version bump 1.0.0 -> 1.1.0)
- src/querier/zeek_modules/base.py (modular split refactor)
- src/querier/filter_loader.py
- apps/dashboard_web/opensearch/{__init__,aggregations}.py
- apps/dashboard_web/opensearch/templates/opensearch/section.html
- apps/opensearch_web/{queries.py,templates/base.html}
- apps/hub/templates/index.html (CHANGELOG link in footer)
Comment thread src/querier/fp_manager.py Dismissed
Comment thread apps/opensearch_web/app.py Fixed
Comment thread apps/threat_model/app.py Fixed
Three pre-existing test failures and seven CodeQL alerts were surfaced
when the v1.1.0 PR ran on a clean CI environment.

Tests:
- src/profiler/fleet_scanner.py: guard against query_opensearch returning
  None (test_null_response was passing None and hitting AttributeError on
  raw.get)
- tests/test_correlator.py: _base_patches now also patches
  profile_public_ip; the two public-IP investigate tests previously hit a
  real env-var check because only profile_device was mocked

Security (CodeQL):
- src/querier/fp_manager.py: validate category/subcategory against
  [A-Za-z0-9_-]+ and verify the resolved path stays inside FILTERS_DIR;
  closes 5 py/path-injection alerts on lines 78/80/85/86/144
- apps/opensearch_web/app.py & apps/threat_model/app.py: log exception
  details server-side and return a generic error string; closes 2
  py/stack-trace-exposure alerts on the device-profile partial handlers
Comment thread src/querier/fp_manager.py Fixed
Comment thread src/querier/fp_manager.py Fixed
Comment thread src/querier/fp_manager.py Fixed
Comment thread src/querier/fp_manager.py Fixed
- Extract the conditional href into a Jinja {% set %} block so djlint
  no longer sees orphan <a> tags (H025 x2)
- Use spaceless {%- if -%} inside the hx-get attribute (T028)

The first fix put the sanitizer in filter_file_path() and assumed callers
would always go through it. CodeQL's interprocedural taint analysis didn't
follow that and re-flagged load/write/delete/append as path-injection sinks
on lines 94/96/101/102/160.

Move the guard into a new _assert_inside_filters_dir() and call it at the
top of every file-I/O function. The guard is now local to each sink, so
CodeQL recognises it as a taint barrier.

Tests use pytest's tmp_path which is outside FILTERS_DIR; an autouse
monkeypatch fixture in test_fp_filter_tools.py now points FILTERS_DIR at
tmp_path so the guard accepts test paths.
Comment thread src/querier/fp_manager.py Dismissed
Comment thread src/querier/fp_manager.py Dismissed
Comment thread src/querier/fp_manager.py Dismissed
Comment thread src/querier/fp_manager.py Dismissed
CodeQL didn't recognise os.path.commonpath() comparison as a sanitizer
and kept flagging the same 5 path-injection sinks. Switch to the
canonical Path.resolve() + is_relative_to() pattern, which CodeQL's
taint analysis recognises.
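
The resolve-then-compare pattern can be sketched as follows. The helper names are hypothetical stand-ins for the project's guard, and `Path.is_relative_to()` requires Python 3.9 or later. The name check uses the `[A-Za-z0-9_-]+` pattern mentioned in the earlier CodeQL fix.

```python
import re
from pathlib import Path

_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def validate_name(name):
    """Reject category/subcategory names with path separators or dots."""
    if not _SAFE_NAME.fullmatch(name):
        raise ValueError(f"invalid name: {name!r}")
    return name


def assert_inside(base_dir, path):
    """Resolve `path` under `base_dir` and refuse anything that escapes it."""
    base = Path(base_dir).resolve()
    resolved = (base / path).resolve()
    if not resolved.is_relative_to(base):
        raise ValueError(f"{path!r} resolves outside {base}")
    return resolved
```

Calling the guard at the top of every function that opens a file keeps the check next to the sink, which is the property the commit above relies on.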
Comment thread src/querier/fp_manager.py Dismissed
@liamadale liamadale merged commit fbcf594 into main Jun 14, 2026
6 checks passed