resolve_placeholder PATCH fails 400 on long messages with mentions, leaving orphan placeholders #23

@brandwe

Description

Symptom

When resolve_placeholder(..., mode="edit") is called with a long, multi-paragraph final message that includes mentions, the PATCH to Graph's /chats/{id}/messages/{message_id} endpoint fails with HTTP 400 (Bad Request — client error, NOT a service hiccup). The tool's fallback path correctly catches this and posts the final content as a NEW message, returning {"mode": "fallback_new"}.

The fallback works in the sense that the human sees the content. But the side effect is that the original placeholder (a short italic "thinking..." line) remains in the chat as an orphan above the new message, requiring a manual delete_teams_message cleanup.

Today's evidence

Two failures in a single session, both on long mention-bearing resolves:

22:41:39 WARNING entrabot.tools.teams: PATCH placeholder 1781044889769 failed (400) — falling back to new message
22:56:33 WARNING entrabot.tools.teams: PATCH placeholder 1781045782045 failed (400) — falling back to new message

Earlier resolve_placeholder calls in the same session — shorter content, mentions OR no-mentions — all succeeded with mode="edit". So the failures correlate strongly with the long + mentions combination, not with either factor alone or with general Graph instability. (There were Graph 502s today on poll reads and one send_teams_message POST, but those are a different shape and a different endpoint.)

Hypothesis

Graph's PATCH endpoint for chat messages has stricter validation than POST for at least one of:

  1. mentions array shape. The <at id="N"> indices in the patched body must match entries in the mentions array, and our resolve-side build may not be regenerating the indices correctly when the new content fully replaces the placeholder body.
  2. Body length crossing a threshold. PATCH may have a smaller max-body limit than POST.
  3. HTML content validation differing between endpoints — certain tags, entities, or nested structures may parse on POST but fail on PATCH.

Without the 400 response body it's hard to nail which. The current logging captures only the status code; the response payload (which Graph typically uses to explain the validation failure) is dropped.
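Hypothesis 1 can be checked locally before any Graph call is made. The following is a minimal sketch, not the tool's actual code: `check_mention_indices` is a hypothetical helper, and it assumes the payload carries the HTML in a content string and the mentions as a list of dicts with an integer `id`, as in Graph's chat message shape.

```python
import re

# Matches the opening tag of a mention, e.g. <at id="0">
AT_TAG = re.compile(r'<at\s+id="(\d+)"\s*>')


def check_mention_indices(content_html, mentions):
    """Return a list of problems; an empty list means body and array agree.

    Hypothetical helper for illustration only.
    """
    body_ids = {int(i) for i in AT_TAG.findall(content_html)}
    array_ids = {m["id"] for m in mentions}
    problems = []
    # <at> tags that point at no mentions entry
    for i in sorted(body_ids - array_ids):
        problems.append(f'<at id="{i}"> has no entry in mentions')
    # mentions entries that no <at> tag refers to
    for i in sorted(array_ids - body_ids):
        problems.append(f"mentions entry id={i} has no <at> tag in body")
    return problems
```

Running this on the payload just before the PATCH would show whether the resolve-side build leaves stale or missing indices when the placeholder body is replaced.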

What to investigate

  1. Capture the 400 response body in entrabot.tools.teams when the PATCH fails. The body almost always names the specific field — body.content, mentions[0].id, or similar — that Graph rejected. Without it we're guessing.
  2. Compare the PATCH payload to the equivalent POST payload for the same final content. If they differ structurally (mention shape, escaping, anything), the PATCH builder is the suspect.
  3. Bisect the failure: does PATCH succeed if we (a) drop the mentions, (b) shorten the body, (c) drop the <br> paragraph separators? Each isolates one variable.
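The bisection in step 3 could be scripted along these lines. This is a sketch under assumptions: `bisect_variants` is a hypothetical helper, the payload is assumed to be a dict with `body.content` and `mentions` keys, and "shorten" is interpreted here as keeping only the first paragraph.

```python
import copy
import re

AT_OPEN = re.compile(r'<at\s+id="(\d+)"\s*>')
AT_PAIR = re.compile(r'<at\s+id="\d+"\s*>(.*?)</at>', re.S)
BR = re.compile(r"<br\s*/?>", re.I)


def bisect_variants(payload, keep_paragraphs=1):
    """Build one payload per isolated variable; the input is not modified."""
    content = payload["body"]["content"]
    all_mentions = payload.get("mentions", [])

    def variant(new_content, mentions):
        v = copy.deepcopy(payload)
        v["body"]["content"] = new_content
        v["mentions"] = copy.deepcopy(mentions)
        return v

    # (a) keep the mention text but drop the <at> tags and the array
    drop_mentions = variant(AT_PAIR.sub(r"\1", content), [])

    # (b) keep the first N paragraphs and only the mentions they still use
    parts = content.split("</p>")
    if len(parts) > keep_paragraphs:
        short = "</p>".join(parts[:keep_paragraphs]) + "</p>"
    else:
        short = content
    still_used = {int(i) for i in AT_OPEN.findall(short)}
    shorten_body = variant(short, [m for m in all_mentions if m["id"] in still_used])

    # (c) remove the <br> separators between paragraphs
    drop_br = variant(BR.sub("", content), all_mentions)

    return {
        "drop_mentions": drop_mentions,
        "shorten_body": shorten_body,
        "drop_br": drop_br,
    }
```

Sending each variant through the same PATCH path and noting which ones return 400 narrows the cause to one of the three hypotheses, or shows that only the combination fails.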

Proposed fixes (in order of preference)

  1. Identify and fix the actual PATCH payload defect. This is the right fix; the fallback_new path is a workaround, not a solution.
  2. If the PATCH endpoint is genuinely more restrictive and we can't satisfy it for some payloads: when the fallback fires, automatically call delete_teams_message on the orphaned placeholder so the chat doesn't accumulate cruft. This costs one extra Graph call per fallback and causes no human-facing degradation.
  3. Log the response body verbatim on PATCH failure regardless of whether we fix the root cause, so future debugging has the receipt.
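Fixes 2 and 3 together might look roughly like the sketch below. It is not the existing implementation: `resolve_with_cleanup` and its `patch`, `post`, and `delete` parameters are hypothetical callables standing in for the tool's Graph wrappers, `patch` is assumed to return a `(status_code, response_text)` tuple, and the `message_id` and `orphan_deleted` keys in the result are illustrative additions to the `mode` field described above.

```python
import logging

log = logging.getLogger("entrabot.tools.teams")


def resolve_with_cleanup(patch, post, delete, message_id, payload):
    """Try the edit; on failure log the response body, post anew, delete the orphan.

    patch(message_id, payload) -> (status_code, response_text)  [assumed shape]
    post(payload) -> new message id                              [assumed shape]
    delete(message_id) -> None, raises on failure                [assumed shape]
    """
    status, text = patch(message_id, payload)
    if 200 <= status < 300:
        return {"mode": "edit", "message_id": message_id}

    # Fix 3: keep the response body, which normally names the rejected field.
    log.warning(
        "PATCH placeholder %s failed (%s), falling back to new message; "
        "response body: %s",
        message_id, status, text,
    )
    new_id = post(payload)

    # Fix 2: remove the orphaned placeholder; never let cleanup mask the result.
    try:
        delete(message_id)
        orphan_deleted = True
    except Exception:
        log.exception("Could not delete orphaned placeholder %s", message_id)
        orphan_deleted = False

    return {"mode": "fallback_new", "message_id": new_id, "orphan_deleted": orphan_deleted}
```

Deleting only after the new message has been posted means a failed POST still leaves the placeholder in place, so the human is never left with nothing in the chat.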

Repro

Send a resolve_placeholder with mode="edit" whose final_message is:

  • Four or more <p> paragraphs separated by <br> tags
  • Includes at least one <at id="N">Name</at> mention with a matching entry in the mentions array
  • Total content roughly 800+ characters

Then check ~/.entrabot/logs/entrabot.log for the PATCH warning. The tool returns mode="fallback_new"; the chat shows orphan placeholder + new final message.
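A message matching the three criteria above can be generated rather than hand-written. This is a sketch only: `build_repro_message` is a hypothetical helper, `user_id` is a placeholder that must be replaced with a real chat member's object id, and how the content and mentions are passed to resolve_placeholder is not shown here because it depends on the tool's signature.

```python
def build_repro_message(name="Name", user_id="<user-object-id>",
                        paragraphs=4, min_chars=800):
    """Return (content_html, mentions) for a long, mention-bearing message."""
    filler = "This sentence pads the paragraph toward the length threshold. "
    per_paragraph = -(-min_chars // paragraphs)  # ceiling division
    body = []
    for i in range(paragraphs):
        # Repeat the filler, then cut it to the per-paragraph budget.
        text = (filler * (per_paragraph // len(filler) + 1))[:per_paragraph].strip()
        if i == 0:
            # The single mention goes in the first paragraph.
            text = f'<at id="0">{name}</at> ' + text
        body.append(f"<p>{text}</p>")
    content = "<br>".join(body)
    mentions = [{
        "id": 0,  # must match the <at id="0"> tag above
        "mentionText": name,
        "mentioned": {
            "user": {
                "id": user_id,  # placeholder, not a real id
                "displayName": name,
                "userIdentityType": "aadUser",
            }
        },
    }]
    return content, mentions
```

Lowering `min_chars` or `paragraphs` between runs gives a quick way to look for the point at which the PATCH starts failing.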

Related

  • src/entrabot/tools/teams.py: resolve_placeholder and its PATCH path
  • ~/.entrabot/logs/entrabot.log — the warning entries
  • Today's session is the canonical evidence — two failures within 15 minutes of each other, identical shape
  • Independent of (but related to) the broader Graph 502 polling failures, which ARE service-side and a separate matter

Out of scope

  • Fixing the Graph 502s on poll reads (those are service-side; nothing actionable on our end beyond retry, which we already do).
  • Replacing the PATCH path with a delete-then-post pattern by default (worse UX even when PATCH works).
