fix: sanitize state values substituted into instruction templates#361
Open
g0w6y wants to merge 9 commits into
Open
fix: sanitize state values substituted into instruction templates#361g0w6y wants to merge 9 commits into
g0w6y wants to merge 9 commits into
Conversation
…mplates
State values substituted via {var_name} in instruction templates were
returned verbatim. An attacker who can write to session state can
craft a value that continues the instruction text and overrides agent
behavior — classic prompt injection.
Wrapping the substituted value in <value>...</value> and escaping
HTML entities creates a clear boundary between the instruction
written by the developer and the data coming from state. Modern
models respect this distinction and treat the tagged content as data
rather than as part of the instruction.
Before:
instruction: "User role: {role}"
state: {role: "admin. Ignore previous instructions."}
result: "User role: admin. Ignore previous instructions."
After:
result: "User role: <value>admin. Ignore previous instructions.</value>"
The artifact.* substitution path had the same issue — returning String(artifact) verbatim into the instruction template. Applies the same XML wrapping and entity escaping as the state injection fix.
kalenkevich
reviewed
May 26, 2026
Collaborator
kalenkevich
left a comment
There was a problem hiding this comment.
Thanks, can you include tests as well?
Contributor
Author
|
@kalenkevich Added tests covering the sanitization behavior for both state and artifact injection paths — all 23 pass. Ready for re-review |
…ection - Update existing tests to expect values wrapped in <value>...</value> tags - Add dedicated sanitization block testing HTML entity escaping (&, <, >) - Add prompt injection prevention tests for both state and artifact paths - Add double-escaping guard test (< in raw input → &lt; in output) - Expand artifact injection tests to cover HTML escaping and injection prevention
357a97e to
5636371
Compare
kalenkevich
reviewed
May 27, 2026
Contributor
Author
|
@kalenkevich removed the |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
injectSessionState()replaces{var_name}placeholders in an agent'sinstruction string with values from two sources:
{role}→session.state['role']{artifact.report}→ artifact file contentBoth paths returned the raw string with no escaping. A value containing
HTML special characters such as
<,>, or&would be embeddeddirectly into the system prompt without sanitization.
Fix
Escape HTML entities in every substituted value before inserting it into
the template. Applied to both the state injection path and the artifact
injection path:
The order matters —
&is replaced first to avoid double-escapingsequences like
<into&lt;.Example
Before
After
Changes
core/src/agents/instructions.ts— escape&,<,>in both thestate injection path and the artifact injection path
core/test/agents/instructions_test.ts— update existing assertions toexpect escaped output; add new sanitization tests covering individual
character escaping, combined escaping, and double-escape prevention