Harden internal LLM prompt boundaries by mylukin · Pull Request #454 · EasyMetaAu/helm-api

mylukin · 2026-07-04T16:14:26Z

Summary

Add shared XML prompt-boundary helpers for untrusted text and JSON payloads.
Wrap Helm-owned internal LLM inputs in explicit XML sections, including classifier eval user text and memory LLM messages/observations.
Add regression coverage for XML breakout attempts and preserve the existing memory system-reminder filtering behavior.
Record the implementation decision in implementation-notes.md.

Validation

corepack pnpm exec vitest run apps/gateway/src/routes/classify.test.ts apps/gateway/src/memory-llm.test.ts
corepack pnpm typecheck
corepack pnpm lint
corepack pnpm build
git diff --check

Note

The inspected Feishu reply-gate prompt is an external caller request to /v1/chat/completions, so Helm cannot automatically infer trusted and untrusted sections inside that business prompt. This PR hardens Helm-owned internal LLM prompts; the caller that constructs the Feishu gate prompt should also XML-wrap its policy, runtime context, history, latest message, and output contract sections.

Co-Authored-By: Codex <noreply@openai.com>

Harden internal LLM prompt boundaries

bb925ab

Co-Authored-By: Codex <noreply@openai.com>

mylukin force-pushed the codex/internal-llm-xml-boundaries branch from 3354c82 to bb925ab Compare July 4, 2026 16:22

mylukin merged commit fc38c14 into main Jul 4, 2026
3 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Harden internal LLM prompt boundaries#454

Harden internal LLM prompt boundaries#454
mylukin merged 1 commit into
mainfrom
codex/internal-llm-xml-boundaries

mylukin commented Jul 4, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

mylukin commented Jul 4, 2026

Summary

Validation

Note

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant