perf(tui): fix CPU lockup and O(N) scans during waiting/running states by mattimustang · Pull Request #494 · usestrix/strix

mattimustang · 2026-05-22T00:14:35Z

Summary

Fixes 100% CPU utilisation that occurred during the waiting state between agent runs, caused by redundant re-renders driven by the 60 ms animation timer (about 16 per second) and O(N) scans over all tool executions on every tick.

Fix O(N) scans across all tool executions by indexing lookups through agent_data["tool_executions"] instead of scanning the full tracer.tool_executions dict
Eliminate redundant timer-driven renders during waiting state by skipping agent status display updates when the dot animation timer is inactive
Cache completed event renders to avoid re-rendering unchanged content on every refresh during running state

Details

During the waiting state the TUI was re-rendering all events on every timer tick even though nothing changed, and four helper methods (_agent_has_real_activity, _agent_vulnerability_count, _get_agent_name_for_vulnerability, _gather_agent_events) each did a full O(N) scan over tracer.tool_executions to find events belonging to a given agent. Combined, this drove CPU to 100% while the app appeared idle. These methods now go through the per-agent tool_executions index (an O(1) lookup of the agent's own executions), and status display updates are skipped entirely when no animation is running.
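The difference between the two lookup strategies can be sketched roughly as follows. This is an illustration only, not the code from strix/interface/tui.py: the `MiniTracer` class, its `log_tool_execution` method, and the record fields are hypothetical. Only the names `tool_executions` and `agents[agent_id]["tool_executions"]` come from this PR, and the sketch assumes the per-agent index stores execution IDs.

```python
from typing import Any


class MiniTracer:
    """Hypothetical stand-in for the Tracer; field names are assumptions."""

    def __init__(self) -> None:
        self.tool_executions: dict[int, dict[str, Any]] = {}
        self.agents: dict[str, dict[str, Any]] = {}
        self._next_id = 1

    def log_tool_execution(self, agent_id: str, tool_name: str) -> int:
        exec_id = self._next_id
        self._next_id += 1
        self.tool_executions[exec_id] = {"agent_id": agent_id, "tool_name": tool_name}
        # Maintain the per-agent index at write time.
        agent = self.agents.setdefault(agent_id, {"tool_executions": []})
        agent["tool_executions"].append(exec_id)
        return exec_id


def executions_full_scan(tracer: MiniTracer, agent_id: str) -> list[dict[str, Any]]:
    # Old pattern: cost grows with every execution recorded by any agent.
    return [e for e in tracer.tool_executions.values() if e["agent_id"] == agent_id]


def executions_indexed(tracer: MiniTracer, agent_id: str) -> list[dict[str, Any]]:
    # New pattern: one dict lookup, then only this agent's own executions.
    ids = tracer.agents.get(agent_id, {}).get("tool_executions", [])
    return [tracer.tool_executions[i] for i in ids if i in tracer.tool_executions]
```

Both functions return the same records; the indexed version just never looks at executions that belong to other agents.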

Test plan

Run a scan with multiple agents and verify CPU usage drops to near-zero during the waiting state
Confirm agent events display correctly in the chat view during and after a run
Confirm vulnerability counts display correctly

* feat: add HTTP request smuggling skill

Add a new vulnerability skill covering HTTP request smuggling (HRS) across CL.TE, TE.CL, H2.CL, and H2.TE desync variants. HRS is absent from the existing skill set despite being a distinct, high-impact vulnerability class frequently present in any architecture using a reverse proxy or CDN in front of an application server.

Coverage:
- CL.TE: front-end uses Content-Length, back-end uses Transfer-Encoding
- TE.CL: front-end uses Transfer-Encoding, back-end uses Content-Length
- H2.CL: HTTP/2 front-end downgrades to HTTP/1.1 with injected Content-Length
- H2.TE: Transfer-Encoding header injection through HTTP/2 desync
- Transfer-Encoding obfuscation techniques (tab, space, duplicate, xchunked)
- Front-end security control bypass via smuggled prefix
- Cross-user request capture for session token theft
- Response queue poisoning and WebSocket handshake hijacking
- Timing-based and differential response detection methodology
- HTTP/2 specific probing techniques

Includes raw HTTP examples for each variant, step-by-step testing methodology, exploitation PoCs, false-positive conditions, and infrastructure topology guidance.

* fix: correct TE.CL probe, pseudo-header terminology, PoC Content-Length values, \x20 representation

Four reviewer findings addressed:

P1 (TE.CL timing-probe description inverted): previous text said 'Content-Length set to fewer bytes than the chunk content', which describes socket-poisoning behavior (differential response), not a timeout. Corrected to: send a complete chunked body with CL set to MORE bytes than provided so the back-end waits for data that never arrives. Also corrected Testing Methodology step 3 to match.

P2 (pseudo-header terminology): 'content-length' is a regular HTTP/2 header, not a pseudo-header (pseudo-headers are exclusively :method, :path, :authority, :scheme). Fixed the H2.CL explanation (line 75), the HTTP/2-specific detection bullet, and Pro Tip #4, which referred to ':content-length pseudo-header'.

P2 (PoC Content-Length values): outer Content-Length in the bypass PoC corrected from 116 to 100 (actual byte count of the body shown); capture PoC corrected from 129 to 120.

P2 (\x20 representation): replaced the \x20 escape sequence in the code block (which renders as a literal four-character string, not a space byte) with an explanatory comment and actual whitespace characters so the intent is unambiguous.

* Update strix/skills/vulnerabilities/http_request_smuggling.md

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
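The Content-Length and \x20 corrections in the fix commit both come down to counting bytes exactly as they appear on the wire. A minimal sketch of that check follows. The helper name `content_length` and the sample body are made up for illustration and are not taken from the skill file.

```python
def content_length(body: str) -> int:
    """Return the byte length of the body as it would be sent on the wire."""
    return len(body.encode("utf-8"))


# Placeholder body (not the PoC from the skill file). CRLF counts as two
# bytes, and the non-ASCII character encodes to two bytes in UTF-8.
body = "name=alice&note=caf\u00e9\r\n"

char_count = len(body)             # counts characters
byte_count = content_length(body)  # counts bytes, which Content-Length needs

# In documentation text, \x20 is four literal characters. Only inside a
# string literal that processes escapes does it become one space byte.
literal_text = r"\x20"
space_byte = "\x20"
```

Counting characters instead of bytes, or counting an escape sequence as the byte it stands for, is enough to make a documented Content-Length value disagree with the body shown.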

…state

Three hot methods were scanning the entire tool_executions dict on every tick instead of using the per-agent index already maintained by the Tracer. This made CPU cost proportional to total accumulated tool executions, which is worst exactly when agents finish and enter waiting/stopped state.

- _agent_has_real_activity: was O(all_tool_executions) at 60ms; now uses agents[agent_id]["tool_executions"] index
- _agent_vulnerability_count: same full scan per agent per 350ms tick; now scoped to the agent's own executions
- _gather_agent_events: same full scan on every 350ms tick, even before the cache check that would discard the result; now scoped per agent

Also stop calling _update_agent_status_display from _animate_dots when the selected agent is in "waiting" state. The waiting display is static text ("Send message to resume") that never changes until the user acts, but the 60ms timer was pushing Textual widget updates for it at 16fps anyway. The 350ms _update_ui_from_tracer call is sufficient to render the waiting state.
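The gating of the animation tick can be pictured with a small model. This is a sketch under assumptions, not the Textual app: the `StatusAnimator` class, its counters, and the four-frame cycle are hypothetical, and the status strings "running" and "waiting" are the only details taken from the commit message.

```python
class StatusAnimator:
    """Hypothetical model of the dot-animation tick; not the real TUI class."""

    FRAMES = 4  # assumed number of animation frames

    def __init__(self) -> None:
        self.frame = 0
        self.status_updates = 0  # counts simulated widget updates

    def update_status_display(self) -> None:
        # Stand-in for pushing an update to a Textual widget.
        self.status_updates += 1

    def animate_dots(self, agent_status: str) -> None:
        # The waiting display is static text, so there is nothing to animate.
        if agent_status != "running":
            return
        self.frame = (self.frame + 1) % self.FRAMES
        self.update_status_display()
```

With a 60 ms interval the timer fires roughly 1000 / 60 ≈ 16.7 times per second, so returning early for a waiting agent removes that many widget updates per second while the display stays correct.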

…nning state

Three more performance issues in the running state hot path:

- Per-event render cache in _get_rendered_events_content: every 350ms tick during active streaming caused a full re-render of all events in the conversation, that is, every chat message through AgentMessageRenderer (including Pygments syntax highlighting for code blocks) and every tool event. Chat messages and completed/failed tool events are now cached by (event_id, status) and only re-rendered when their status changes. Running tool events are re-rendered each tick as their content may still update.
- Skip duplicate _update_agent_status_display in _update_ui_from_tracer when the dot animation timer is active: _animate_dots (60ms) already calls it for "running" agents, so the unconditional call from _update_ui_from_tracer (350ms) was redundant, doubling the widget update rate during active scans.
- Fix _get_agent_name_for_vulnerability to use per-agent tool execution index instead of scanning all tool_executions, consistent with the other O(N) scan fixes from the previous commit.
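The per-event render cache described in this commit might look roughly like the sketch below. It is a simplified illustration: the `EventRenderCache` class, the event dict shape, and the `FINAL_STATUSES` set are assumptions, and it covers tool events only, whereas the commit also caches chat messages.

```python
from typing import Any, Callable

# Assumed set of statuses after which an event's content no longer changes.
FINAL_STATUSES = {"completed", "failed", "error"}


class EventRenderCache:
    """Hypothetical cache keyed by (event_id, status)."""

    def __init__(self, render: Callable[[dict[str, Any]], str]) -> None:
        self._render = render
        self._cache: dict[tuple[str, str], str] = {}

    def get(self, event: dict[str, Any]) -> str:
        status = event.get("status", "")
        if status not in FINAL_STATUSES:
            # Running events may still change, so render them every time.
            return self._render(event)
        key = (event["id"], status)
        if key not in self._cache:
            self._cache[key] = self._render(event)
        return self._cache[key]

    def clear(self) -> None:
        # For example, when the selected agent changes.
        self._cache.clear()
```

Including the status in the key means an event that moves from running to completed is rendered once more in its final form and then served from the cache on later ticks.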

greptile-apps · 2026-05-22T00:21:56Z

Greptile Summary

This PR fixes excessive CPU usage in the TUI during idle/waiting states by replacing four O(N) linear scans over tracer.tool_executions with O(1) per-agent indexed lookups, and adding a completed-event render cache to avoid re-rendering unchanged tool/chat content on every tick.

_agent_has_real_activity, _agent_vulnerability_count, _get_agent_name_for_vulnerability, and _gather_agent_events now iterate only the per-agent tool_executions list instead of the entire global dict.
_update_agent_status_display() is skipped in the 0.35 s refresh tick when the dot animation timer is inactive, and the sweep-frame counter advances only for "running" (not "waiting") agents.
A new _event_render_cache stores completed/failed/error tool renders and finalized chat renders, keyed by event_id (and event_id + status for tools), cleared on agent switch.

Confidence Score: 4/5

Safe to merge; the core logic changes are correct and well-scoped.

The tool-execution indexing is correct and consistent across all four refactored methods, and the render-cache invalidation on agent switch is properly wired. A single leftover O(N) scan over chat_messages in _gather_agent_events means the fix is incomplete for long scans with heavy chat traffic, but it does not introduce any regression.

strix/interface/tui.py — specifically _gather_agent_events where the chat_messages linear scan was not optimized alongside tool_executions.

Important Files Changed

Filename	Overview
strix/interface/tui.py	Four helper methods converted from O(N) global scans to O(1) indexed lookups via agent_data["tool_executions"]; event render cache added for completed tool/chat events; _update_agent_status_display() call gated behind animation-timer check. One residual O(N) scan over chat_messages remains in _gather_agent_events.
strix/skills/vulnerabilities/http_request_smuggling.md	New skill knowledge file documenting HTTP request smuggling detection/exploitation techniques; no code changes, documentation only.

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
strix/interface/tui.py:1472-1482
**O(N) chat_messages scan left unoptimized**

`_gather_agent_events` now uses O(1) indexed lookups for tool executions, but `chat_messages` is still filtered with a linear scan (`for msg in self.tracer.chat_messages if msg.get("agent_id") == agent_id`). This function is called on every refresh tick, so a long-running scan with many messages across multiple agents will still exhibit the same per-frame O(N) scan cost, just for a different collection. Consider adding a per-agent index to `chat_messages` in the tracer (similar to `tool_executions`) to make this O(1) as well.

Reviews (1): Last reviewed commit: "perf(tui): cache event renders and elimi..."

mattimustang · 2026-05-22T02:45:29Z

Fixed in d0cbaec.

Added chat_messages_by_agent: dict[str, list[dict[str, Any]]] to Tracer and populate it at write time (setdefault(agent_id, []).append(message_data)). _gather_agent_events now does an O(1) dict lookup instead of a full scan.
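A write-time index of this kind could be sketched as below. The attribute name `chat_messages_by_agent`, its type, and the `setdefault(agent_id, []).append(message_data)` pattern come from the comment above. The `ChatIndexTracer` class, the `log_chat_message` signature, and the message fields are assumptions made for illustration.

```python
from typing import Any


class ChatIndexTracer:
    """Hypothetical tracer fragment showing a per-agent chat index."""

    def __init__(self) -> None:
        self.chat_messages: list[dict[str, Any]] = []
        self.chat_messages_by_agent: dict[str, list[dict[str, Any]]] = {}

    def log_chat_message(self, content: str, role: str, agent_id: str) -> dict[str, Any]:
        message_data = {"content": content, "role": role, "agent_id": agent_id}
        self.chat_messages.append(message_data)
        # Populate the index at write time so readers never need to scan.
        self.chat_messages_by_agent.setdefault(agent_id, []).append(message_data)
        return message_data

    def messages_for(self, agent_id: str) -> list[dict[str, Any]]:
        # One dict lookup instead of filtering every message on each frame.
        return self.chat_messages_by_agent.get(agent_id, [])
```

Both containers hold references to the same message dicts, so the index adds one list entry per message rather than a second copy of the content.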

sandiyochristan and others added 3 commits May 20, 2026 21:45

greptile-apps Bot reviewed May 22, 2026


perf(tui): index chat_messages by agent to eliminate O(N) scan per frame

d0cbaec

mattimustang wants to merge 4 commits into usestrix:main from mattimustang:fix/tui-cpu-lockup-clean
