ctx.fork() produces incoherent output on Vulkan/Windows — CopyD2D in aux_server.cpp round-trips through ggml_backend_tensor_get which doesn't support partial tensor reads on ggml-Vulkan #418

Description

@zatchbell1311-wq

Environment

  • OS: Windows 11
  • GPU: NVIDIA RTX 2050
  • Backend: ggml-Vulkan (portable driver)
  • Model: Qwen3-0.6B

Symptom

Calling ctx.fork() in an inferlet causes all subsequent generation from the forked context to produce incoherent/garbage output, even with a single fork and no concurrency. Plain Context::new() without fork works correctly with the same model.

Root Cause

AuxServer::handle_command_ in driver/portable/src/aux_server.cpp (line 294) implements CopyD2D by round-tripping each KV page through a host buffer via ggml_backend_tensor_get / ggml_backend_tensor_set with a non-zero byte offset (src_off = pair.src * page_bytes). On ggml-Vulkan, partial tensor reads at non-zero offsets appear to return zeros or garbage, silently corrupting the copied KV pages. The comment in the code already acknowledges this is not universally supported across backends.

This is consistent with ctx.fork() working correctly on Metal: ggml_backend_tensor_get with a non-zero offset appears to work on Metal/CUDA but not on Vulkan.
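One way to test this hypothesis is to compare a partial read at a non-zero offset against the same byte range taken from a full read at offset 0. The sketch below is a minimal, backend-agnostic probe. `ReadFn` and `partial_read_is_consistent` are hypothetical names introduced here, not part of the driver; the callback is assumed to wrap whatever read call the backend exposes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a backend read: copy `size` bytes starting at
// byte `offset` of some device tensor into host memory at `dst`.
using ReadFn = std::function<void(void* dst, std::size_t offset, std::size_t size)>;

// Returns true when a partial read at `offset` yields the same bytes as the
// matching slice of a full read starting at offset 0.
bool partial_read_is_consistent(const ReadFn& read, std::size_t total_bytes,
                                std::size_t offset, std::size_t size) {
    assert(offset + size <= total_bytes);

    // Reference copy: read the whole tensor from offset 0.
    std::vector<std::uint8_t> full(total_bytes);
    read(full.data(), 0, total_bytes);

    // Candidate copy: read only the requested range at the non-zero offset.
    std::vector<std::uint8_t> part(size);
    read(part.data(), offset, size);

    return std::memcmp(full.data() + offset, part.data(), size) == 0;
}
```

In the driver, the callback would presumably forward to `ggml_backend_tensor_get(tensor, dst, offset, size)` on the KV tensor, with `offset = pair.src * page_bytes` and `size = page_bytes`. If the probe returns false on Vulkan and true on Metal, that would support the root cause described above.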

Workaround

Avoid ctx.fork() entirely: create a fresh context with Context::new() for each branch and replay the prior turns as text. Verified working at num_branches=2 (8 concurrent leaves), 512 tokens/step. Branch: fix/tot-fork-corruption.

Suggested Fix

For the Vulkan backend, either:

  • (a) Fall back to CopyD2H + CopyH2D (round-trip through CPU swap pool) instead of the direct CopyD2D path
  • (b) Fix ggml_backend_tensor_get offset support in ggml-Vulkan upstream
