proc: fix panic on follow-exec process exit#4383

Merged
aarzilli merged 2 commits into go-delve:master from derekparker:fix-follow-exec-double-postexit
Jul 3, 2026
Conversation

@derekparker derekparker commented Jun 30, 2026

Member
  • Make postExit idempotent to prevent double-release of the ptraceThread refcount
  • Flush instruction cache after writing memory on Windows to fix intermittent missed breakpoints on ARM64

Fix 1: postExit double-release

In trapWaitInternal, a process leader can have postExit called via the ESRCH path (when resumeWithSig fails because the thread is gone), and then called again when the actual exit event arrives. The double-release drops the ptraceThread refcount too low, closing ptraceChan while another process in the follow-exec group still needs it. The next ptrace operation on that process panics with "send on closed channel".

Uses atomic Swap instead of Store so the second call is a no-op.
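The Swap-versus-Store distinction can be sketched as follows. This is a minimal illustration, not the code from this PR: the `process` and `refCounted` types and their fields are hypothetical stand-ins for the process state and the shared ptraceThread refcount. The point is that `Swap` returns the previous value, so a second caller can tell the exit was already handled and return before releasing again.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// refCounted is a hypothetical stand-in for the shared ptrace thread.
// When refs reaches zero the underlying channel would be closed.
type refCounted struct {
	refs   int
	closed bool
}

func (r *refCounted) release() {
	r.refs--
	if r.refs == 0 {
		r.closed = true
	}
}

// process is a hypothetical stand-in for one process in a follow-exec group.
type process struct {
	exited atomic.Bool
	shared *refCounted
}

// postExit releases the shared reference at most once.
// Swap stores true and reports the previous value; if that value was
// already true, an earlier call did the release and this one is a no-op.
func (p *process) postExit() {
	if p.exited.Swap(true) {
		return
	}
	p.shared.release()
}

func main() {
	// Two processes share the refcount; only one of them exits.
	shared := &refCounted{refs: 2}
	leader := &process{shared: shared}

	leader.postExit() // e.g. reached through the ESRCH path
	leader.postExit() // e.g. reached again when the exit event arrives

	fmt.Println(shared.refs, shared.closed) // prints "1 false"
}
```

With a plain `Store(true)` in place of the `Swap` check, both calls would reach `release`, the count would hit zero, and the resource would be closed while the other process still holds it.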

Observed on linux/riscv64 with Go 1.27rc1 as a panic in TestLaunchWithFollowExec.

Fix 2: FlushInstructionCache on Windows

On ARM64 the instruction and data caches are not coherent. WriteProcessMemory only updates the data cache, so breakpoint instructions written by the debugger may not be visible to the instruction fetch unit. This causes intermittent missed breakpoints on Windows/ARM64 — the CPU executes the original instruction from stale I-cache instead of the BRK that was written.

Calls FlushInstructionCache after every WriteProcessMemory to ensure I-cache coherency. On x86/amd64 this is a documented no-op.

Fixes #3183

Make postExit idempotent to prevent double-release of the ptraceThread
refcount. In trapWaitInternal, a process leader can have postExit called
via the ESRCH path (when resumeWithSig fails because the thread is
gone), and then called again when the actual exit event arrives. The
double-release drops the ptraceThread refcount too low, closing
ptraceChan while another process in the follow-exec group still needs
it. The next ptrace operation on that process panics with "send on
closed channel".

Use atomic Swap instead of Store so the second call is a no-op.
@derekparker derekparker marked this pull request as draft June 30, 2026 17:45
On ARM64 the instruction and data caches are not coherent.
WriteProcessMemory only updates the data cache, so breakpoint
instructions written by the debugger may not be visible to the
instruction fetch unit. This causes intermittent missed breakpoints
on Windows/ARM64 — the CPU executes the original instruction from
stale I-cache instead of the BRK we wrote.

Call FlushInstructionCache after every WriteProcessMemory to ensure
I-cache coherency. On x86/amd64 this is a no-op.

Fixes go-delve#3183
@derekparker derekparker marked this pull request as ready for review July 1, 2026 14:37

@aarzilli aarzilli left a comment

Member

LGTM

@aarzilli aarzilli merged commit 0d1a62e into go-delve:master Jul 3, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

Debugger stops at breakpoint in a loop only once (Windows Arm64)

2 participants