[medium] TTAS spinlock has no death-handling: a process killed while holding write_lock deadlocks all others forever

**Severity:** 🟡 medium  •  **Category:** concurrency
**Location:** `src/shm/lock.rs` : 80-114

## What's wrong
`write_lock()` is a plain TTAS (test-and-test-and-set) spinlock stored in shared memory, with no robustness against the holder dying. If a process is killed (SIGKILL, crash, panic across the FFI boundary, OOM) between `write_lock()` and `write_unlock()`, `write_lock` stays 1 and `seq` stays odd forever. In every other process, `write_lock()` then spins forever, and `read_begin()` spins forever because it waits for `seq` to become even (`seq & 1 == 0`). There is no timeout, no owner PID, no robust-mutex/`EOWNERDEAD` recovery, and no lock generation, so the entire cross-process cache is permanently wedged with no recovery short of deleting the shm file.

`insert()`/`clear()` are not panic-safe either: there is no RAII guard, so any panic between `write_lock()` and `write_unlock()` (e.g. a bug in serde or pointer math) leaks the lock.
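To make the wedged state concrete, here is a minimal single-process model of the two lock words as this issue describes them. It is a sketch, not the code in `src/shm/lock.rs`: the `LockWords` type, the field widths, the memory orderings, and the `read_begin_bounded` helper are all assumptions made for illustration. The bounded reader exists only so the stuck state can be observed without hanging; the real `read_begin()` has no such limit.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Hypothetical model of the two shared-memory words described in this issue.
/// Field widths are assumed, not taken from the real code.
pub struct LockWords {
    pub write_lock: AtomicU32,
    pub seq: AtomicU64,
}

impl LockWords {
    pub const fn new() -> Self {
        Self { write_lock: AtomicU32::new(0), seq: AtomicU64::new(0) }
    }

    /// TTAS acquire: spin on a plain load, then attempt the CAS.
    /// Afterwards bump `seq` to odd to mark a write in progress.
    pub fn write_lock(&self) {
        loop {
            while self.write_lock.load(Ordering::Relaxed) != 0 {
                std::hint::spin_loop();
            }
            if self
                .write_lock
                .compare_exchange_weak(0, 1, Ordering::Acquire, Ordering::Relaxed)
                .is_ok()
            {
                break;
            }
        }
        self.seq.fetch_add(1, Ordering::AcqRel);
    }

    /// Bump `seq` back to even, then release the lock word.
    pub fn write_unlock(&self) {
        self.seq.fetch_add(1, Ordering::Release);
        self.write_lock.store(0, Ordering::Release);
    }

    /// Demonstration-only reader: waits for an even `seq`, but gives up
    /// after `max_spins` iterations instead of spinning forever.
    pub fn read_begin_bounded(&self, max_spins: u32) -> Option<u64> {
        for _ in 0..max_spins {
            let s = self.seq.load(Ordering::Acquire);
            if s & 1 == 0 {
                return Some(s);
            }
            std::hint::spin_loop();
        }
        None
    }
}
```

Calling `write_lock()` and never calling `write_unlock()` models a writer that died mid-section: `write_lock` remains 1, `seq` remains odd, and `read_begin_bounded` returns `None` for any spin budget, because nothing is left alive to restore the parity.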

## Trigger
`kill -9` a process between acquiring the write lock and releasing it (e.g. during `insert()`). All other processes then hang in `read_begin()`/`write_lock()` indefinitely.

## Suggested fix
Make the lock robust against owner death, using one of:

- store the owner PID plus a lock generation, detect dead owners (`kill(pid, 0)` failing with `ESRCH`), and steal/recover the lock;
- bound the spin with a deadline that triggers recovery;
- use pthread robust mutexes (`PTHREAD_MUTEX_ROBUST` / `EOWNERDEAD`).

In addition, wrap the write section in an RAII guard so a Rust panic still releases the lock and restores `seq` parity.
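One possible shape for the owner-PID option combined with the RAII guard is sketched below. Everything here is hypothetical: `ShmLockWords`, `WriteGuard`, `try_write`, and the `recovered` flag are illustrative names, not the crate's API. The liveness check is passed in as a closure so the sketch stays dependency-free; in real code it would presumably be a `kill(pid, 0)` probe, which is vulnerable to PID reuse unless paired with a generation counter or process start time.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Hypothetical lock words: `owner` holds the writer's PID (0 = free).
pub struct ShmLockWords {
    owner: AtomicU32,
    seq: AtomicU64,
}

/// Releases the lock and restores even `seq` parity when dropped,
/// including during a panic unwind.
pub struct WriteGuard<'a> {
    words: &'a ShmLockWords,
    /// True if the lock was taken over from a dead owner, in which case
    /// the protected data may be half-written and should be revalidated.
    pub recovered: bool,
}

impl ShmLockWords {
    pub const fn new() -> Self {
        Self { owner: AtomicU32::new(0), seq: AtomicU64::new(0) }
    }

    pub fn owner(&self) -> u32 {
        self.owner.load(Ordering::Acquire)
    }

    pub fn seq(&self) -> u64 {
        self.seq.load(Ordering::Acquire)
    }

    /// Single acquisition attempt by process `me`. `is_alive` is an
    /// assumed liveness probe supplied by the caller.
    pub fn try_write(&self, me: u32, is_alive: impl Fn(u32) -> bool) -> Option<WriteGuard<'_>> {
        let recovered = match self.owner.compare_exchange(0, me, Ordering::Acquire, Ordering::Relaxed) {
            Ok(_) => false,
            Err(holder) if !is_alive(holder) => {
                // Steal with a CAS on the dead PID so only one recoverer wins.
                if self
                    .owner
                    .compare_exchange(holder, me, Ordering::Acquire, Ordering::Relaxed)
                    .is_err()
                {
                    return None;
                }
                true
            }
            Err(_) => return None, // held by a live process
        };
        // Enter the write section. A dead owner may already have left
        // `seq` odd, so only bump it when it is currently even.
        if self.seq.load(Ordering::Relaxed) & 1 == 0 {
            self.seq.fetch_add(1, Ordering::AcqRel);
        }
        Some(WriteGuard { words: self, recovered })
    }
}

impl Drop for WriteGuard<'_> {
    fn drop(&mut self) {
        self.words.seq.fetch_add(1, Ordering::Release); // odd -> even
        self.words.owner.store(0, Ordering::Release);
    }
}
```

A caller would loop on `try_write` with a backoff or deadline, and treat `recovered == true` as a signal to clear or revalidate the cache before trusting its contents. Note that the guard only helps when unwinding runs: it does nothing for SIGKILL or `panic = "abort"`, which is why the dead-owner takeover is still needed, and a guard that restores parity after a panic can publish a half-written entry unless the write section is itself made atomic or rolled back.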

<details><summary>Adversarial verification note</summary>

Confirmed in the real code.

`src/shm/lock.rs:82-102` implements `write_lock()` as a bare TTAS spinlock (`compare_exchange_weak` on `write_lock_ptr`) followed by bumping `seq` to odd; `write_unlock()` (106-114) bumps `seq` back to even and stores 0. There is no owner PID, generation, robust mutex (`EOWNERDEAD`), timeout, or RAII guard. These atomics live in mmap'd shared memory (`region.rs:207-208` hands out `ShmSeqLock` from the lock mmap). `read_begin()` (60-66) spins while `seq & 1 != 0` with `std::hint::spin_loop()` and no exit; `write_lock()` likewise spins indefinitely.

So if a process is SIGKILLed or crashes between `write_lock()` and `write_unlock()`, `write_lock` stays 1 and `seq` stays odd in shared memory permanently, and every other process hangs forever in `read_begin()`/`write_lock()`. `insert()` (`mod.rs:313-315`) and `clear()` (`mod.rs:462-464`) call `write_lock()`/`write_unlock()` directly with no guard, so a panic in `insert_inner` (e.g. a serde or pointer-math bug) also leaks the lock with no parity restoration.

A grep over `src/shm` for getpid/kill/ESRCH/robust/EOWNERDEAD/generation/owner/deadline/timeout/recover finds nothing relevant: the only hit is one `recover` in `region.rs:188`, which handles open errors or parameter mismatch, not a wedged lock. The evidence snippet in the finding accurately paraphrases the code.

The bug is genuine. Severity 'medium' is fair: it is a real cross-process liveness defect with no recovery short of deleting the shm file, but it requires a process death within a narrow critical-section window (uncommon under normal operation), and normal steady-state usage is unaffected.

</details>

---
_Filed from a multi-agent code review (finder → adversarial verification → synthesis). Confirmed real after a skeptic re-read the code._
