perf(tui): fix CPU lockup and O(N) scans during waiting/running states #494
mattimustang wants to merge 4 commits into
* feat: add HTTP request smuggling skill

Add a new vulnerability skill covering HTTP request smuggling (HRS) across CL.TE, TE.CL, H2.CL, and H2.TE desync variants. HRS is absent from the existing skill set despite being a distinct, high-impact vulnerability class frequently present in any architecture using a reverse proxy or CDN in front of an application server.

Coverage:
- CL.TE: front-end uses Content-Length, back-end uses Transfer-Encoding
- TE.CL: front-end uses Transfer-Encoding, back-end uses Content-Length
- H2.CL: HTTP/2 front-end downgrades to HTTP/1.1 with injected Content-Length
- H2.TE: Transfer-Encoding header injection through HTTP/2 desync
- Transfer-Encoding obfuscation techniques (tab, space, duplicate, xchunked)
- Front-end security control bypass via smuggled prefix
- Cross-user request capture for session token theft
- Response queue poisoning and WebSocket handshake hijacking
- Timing-based and differential response detection methodology
- HTTP/2 specific probing techniques

Includes raw HTTP examples for each variant, step-by-step testing methodology, exploitation PoCs, false-positive conditions, and infrastructure topology guidance.

* fix: correct TE.CL probe, pseudo-header terminology, PoC Content-Length values, \x20 representation

Four reviewer findings addressed:

P1 - TE.CL timing-probe description inverted: previous text said 'Content-Length set to fewer bytes than the chunk content' which describes socket-poisoning behavior (differential response), not a timeout. Corrected to: send a complete chunked body with CL set to MORE bytes than provided so the back-end waits for data that never arrives. Also corrected Testing Methodology step 3 to match.

P2 - pseudo-header terminology: 'content-length' is a regular HTTP/2 header, not a pseudo-header (pseudo-headers are exclusively :method, :path, :authority, :scheme). Fixed the H2.CL explanation (line 75), HTTP/2-specific detection bullet, and Pro Tip #4, which referred to ':content-length pseudo-header'.

P2 - PoC Content-Length values: outer Content-Length in the bypass PoC corrected from 116 to 100 (actual byte count of the body shown); capture PoC corrected from 129 to 120.

P2 - \x20 representation: replaced the \x20 escape sequence in the code block (which renders as a literal four-character string, not a space byte) with an explanatory comment and actual whitespace characters so the intent is unambiguous.

* Update strix/skills/vulnerabilities/http_request_smuggling.md

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
…state
Three hot methods were scanning the entire tool_executions dict on every
tick instead of using the per-agent index already maintained by the Tracer.
This made CPU cost proportional to total accumulated tool executions, which
is worst exactly when agents finish and enter waiting/stopped state.
- _agent_has_real_activity: was O(all_tool_executions) at 60ms; now uses
agents[agent_id]["tool_executions"] index
- _agent_vulnerability_count: same full scan per agent per 350ms tick;
now scoped to the agent's own executions
- _gather_agent_events: same full scan on every 350ms tick, even before
the cache check that would discard the result; now scoped per agent
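
The difference between the two lookup shapes in the list above can be sketched as follows. This is a minimal illustration, not the code from `strix/interface/tui.py`: the shape of the per-agent index (a list of execution ids), the field names, and the tool names are all assumptions made for the example.

```python
from typing import Any

# Hypothetical stand-in for the tracer state: a global dict of executions plus,
# per agent, a list of the execution ids that belong to that agent.
tool_executions: dict[int, dict[str, Any]] = {
    1: {"agent_id": "a1", "tool_name": "terminal_execute"},
    2: {"agent_id": "a2", "tool_name": "create_vulnerability_report"},
    3: {"agent_id": "a1", "tool_name": "create_vulnerability_report"},
}
agents: dict[str, dict[str, Any]] = {
    "a1": {"tool_executions": [1, 3]},
    "a2": {"tool_executions": [2]},
}

REPORT_TOOL = "create_vulnerability_report"  # assumed tool name, illustration only


def count_reports_full_scan(agent_id: str) -> int:
    """Old shape: visits every execution recorded by every agent."""
    return sum(
        1
        for execution in tool_executions.values()
        if execution["agent_id"] == agent_id and execution["tool_name"] == REPORT_TOOL
    )


def count_reports_indexed(agent_id: str) -> int:
    """New shape: visits only the executions listed in the agent's own index."""
    count = 0
    for exec_id in agents.get(agent_id, {}).get("tool_executions", []):
        execution = tool_executions.get(exec_id)
        if execution is not None and execution["tool_name"] == REPORT_TOOL:
            count += 1
    return count
```

Both functions return the same counts; the indexed version does work proportional to one agent's executions rather than to the total accumulated across the whole run.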
Also stop calling _update_agent_status_display from _animate_dots when the
selected agent is in "waiting" state. The waiting display is static text
("Send message to resume") that never changes until the user acts, but the
60ms timer was pushing Textual widget updates for it at 16fps anyway. The
350ms _update_ui_from_tracer call is sufficient to render the waiting state.
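
To make the effect of the two timers concrete, the sketch below counts status display updates over a time window with and without the waiting-state skip. It is a simplified model, not Textual code: the function name and the `skip_static_states` flag are hypothetical, and only the 60ms and 350ms intervals come from the description above.

```python
ANIMATION_INTERVAL_MS = 60   # dot animation timer
TRACER_INTERVAL_MS = 350     # tracer-driven UI refresh


def status_updates_in_window(status: str, window_ms: int, skip_static_states: bool) -> int:
    """Count status display updates issued within window_ms for one selected agent."""
    updates = 0
    for t in range(1, window_ms + 1):
        if t % ANIMATION_INTERVAL_MS == 0:
            # The waiting display is static text, so the animation tick can skip it.
            if not (skip_static_states and status == "waiting"):
                updates += 1
        if t % TRACER_INTERVAL_MS == 0:
            # The slower refresh always renders the current state.
            updates += 1
    return updates
```

In this model, a waiting agent receives 18 updates per second without the skip (16 from the animation timer plus 2 from the tracer refresh) and 2 with it.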
…nning state

Three more performance issues in the running state hot path:

- Per-event render cache in _get_rendered_events_content: every 350ms tick during active streaming caused a full re-render of all events in the conversation: every chat message through AgentMessageRenderer (including Pygments syntax highlighting for code blocks) and every tool event. Chat messages and completed/failed tool events are now cached by (event_id, status) and only re-rendered when their status changes. Running tool events are re-rendered each tick as their content may still update.
- Skip duplicate _update_agent_status_display in _update_ui_from_tracer when the dot animation timer is active: _animate_dots (60ms) already calls it for "running" agents, so the unconditional call from _update_ui_from_tracer (350ms) was redundant, doubling the widget update rate during active scans.
- Fix _get_agent_name_for_vulnerability to use per-agent tool execution index instead of scanning all tool_executions, consistent with the other O(N) scan fixes from the previous commit.
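
A per-event render cache keyed by (event_id, status) might look roughly like the sketch below. The class name, the event dict layout, and the `render` callable are hypothetical stand-ins; the actual cache inside `_get_rendered_events_content` is not shown in this PR text.

```python
from typing import Any, Callable


class EventRenderCache:
    """Caches rendered output for events whose content can no longer change."""

    def __init__(self, render: Callable[[dict[str, Any]], str]) -> None:
        self._render = render
        self._cache: dict[tuple[Any, Any], str] = {}

    def get(self, event: dict[str, Any]) -> str:
        status = event.get("status")  # chat messages carry no status, so this is None
        if status == "running":
            # Running tool events may still update, so render them on every tick.
            return self._render(event)
        key = (event["id"], status)
        if key not in self._cache:
            self._cache[key] = self._render(event)
        return self._cache[key]

    def clear(self) -> None:
        """Drop all entries, e.g. when the selected agent changes."""
        self._cache.clear()
```

Because the status is part of the key, an event that moves from running to completed is rendered once more under its new key and served from the cache afterwards. This sketch never evicts entries on its own, so a long-lived cache would need `clear()` or a size bound.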
Greptile Summary
This PR fixes excessive CPU usage in the TUI during idle/waiting states by replacing four O(N) linear scans over
Confidence Score: 4/5
Safe to merge; the core logic changes are correct and well-scoped.
The tool-execution indexing is correct and consistent across all four refactored methods, and the render-cache invalidation on agent switch is properly wired. A single leftover O(N) scan over chat_messages in _gather_agent_events means the fix is incomplete for long scans with heavy chat traffic, but it does not introduce any regression.
strix/interface/tui.py: specifically _gather_agent_events, where the chat_messages linear scan was not optimized alongside tool_executions.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
strix/interface/tui.py:1472-1482
**O(N) chat_messages scan left unoptimized**
`_gather_agent_events` now uses O(1) indexed lookups for tool executions, but `chat_messages` is still filtered with a linear scan (`for msg in self.tracer.chat_messages if msg.get("agent_id") == agent_id`). This function is called on every refresh tick, so a long-running scan with many messages across multiple agents will still exhibit the same per-frame O(N) scan cost, just for a different collection. Consider adding a per-agent index to `chat_messages` in the tracer (similar to `tool_executions`) to make this O(1) as well.
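
One way to read the suggestion is to maintain the per-agent index at the moment a message is recorded, so readers never filter the full list. The sketch below assumes a hypothetical `ChatLog` container and message fields; it is not the change made in the tracer.

```python
from collections import defaultdict
from typing import Any


class ChatLog:
    """Keeps the global message list plus a per-agent index built on insert."""

    def __init__(self) -> None:
        self.chat_messages: list[dict[str, Any]] = []
        self._by_agent: defaultdict[str, list[dict[str, Any]]] = defaultdict(list)

    def add(self, agent_id: str, content: str) -> dict[str, Any]:
        message = {
            "message_id": len(self.chat_messages) + 1,
            "agent_id": agent_id,
            "content": content,
        }
        self.chat_messages.append(message)
        self._by_agent[agent_id].append(message)  # index updated once, at write time
        return message

    def for_agent(self, agent_id: str) -> list[dict[str, Any]]:
        # .get avoids creating empty index entries for unknown agents.
        return self._by_agent.get(agent_id, [])
```

Finding an agent's list is then a single dict lookup, and iterating it costs only as much as that agent's own message count.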
Reviews (1): Last reviewed commit: "perf(tui): cache event renders and elimi..."
Fixed in d0cbaec. Added |
Summary
Fixes 100% CPU utilisation that occurred during the waiting state between agent runs, caused by redundant re-renders on the 60ms animation timer (about 16fps) and O(N) scans over all tool executions on every tick.
`agent_data["tool_executions"]` instead of scanning the full `tracer.tool_executions` dict
Details
During the waiting state the TUI was re-rendering all events every 60ms even though nothing changed, and several helper methods (`_agent_has_real_activity`, `_agent_vulnerability_count`, `_get_agent_name_for_vulnerability`, `_gather_agent_events`) each did a full O(N) scan over `tracer.tool_executions` to find events belonging to a given agent. Combined, this drove CPU to 100% while the app appeared idle. These methods now use the per-agent `tool_executions` index for O(1) lookup, and status display updates are skipped entirely when no animation is running.
Test plan