Harden packed_next_token_logprobs against zero-length segments #83

Description

@adohe

Finding

The vectorized packed_next_token_logprobs (keep[cu_seqlens[1:] - 1] = False) is faster than the old loop, but it is correct only while every sequence length is >= 1. That invariant is not enforced locally: it is established by _pack_train_data's lengths.clamp(min=1, ...). A zero-length segment would make cu_seqlens[i+1] - 1 point into the previous sequence and silently mask a valid action site. The old loop guarded against this with if length <= 1: continue.
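To make the index arithmetic concrete, here is a minimal sketch that reproduces only the cu_seqlens[1:] - 1 computation from the Finding. It uses NumPy as a stand-in for the real tensors, and segment_end_indices is a hypothetical helper name, not something from the codebase. What the resulting indices end up masking depends on how keep is laid out in packed_next_token_logprobs, which is not shown here.

```python
import numpy as np


def segment_end_indices(lengths):
    """Hypothetical helper: build cu_seqlens from per-segment lengths and
    return it together with cu_seqlens[1:] - 1, the indices the vectorized
    code would set to False in `keep`."""
    cu_seqlens = np.concatenate(([0], np.cumsum(lengths)))
    return cu_seqlens, cu_seqlens[1:] - 1


# All lengths >= 1: each end index is the last token of its own segment.
cu_ok, ends_ok = segment_end_indices([3, 2])

# An empty middle segment: it starts at offset 3, yet its "end" index is 2,
# which belongs to the previous segment.
cu_mid, ends_mid = segment_end_indices([3, 0, 2])

# An empty leading segment: its "end" index is -1, which wraps around to the
# last element under normal negative indexing.
cu_lead, ends_lead = segment_end_indices([0, 2])
```

In both failing cases the computed index no longer refers to a token of the segment it was derived from, which is why the vectorized form depends on the upstream clamp.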

Acceptance

  • Add a cheap shape/invariant assertion, or document the cross-module coupling at both ends.
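One possible shape for the assertion option is sketched below. It assumes cu_seqlens is a 1-D array of cumulative offsets and checks that it is strictly increasing, which is equivalent to every segment length being >= 1. The name check_no_empty_segments is hypothetical, and NumPy is used only so the example is self-contained; the body relies on nothing beyond slicing, elementwise comparison, and .all().

```python
import numpy as np


def check_no_empty_segments(cu_seqlens):
    """Hypothetical guard: fail loudly if any packed segment has length 0.

    Strictly increasing offsets <=> every (cu[i+1] - cu[i]) >= 1.
    """
    assert len(cu_seqlens) >= 2, "cu_seqlens must describe at least one segment"
    strictly_increasing = bool((cu_seqlens[1:] > cu_seqlens[:-1]).all())
    assert strictly_increasing, (
        "packed_next_token_logprobs requires every segment length >= 1; "
        "see lengths.clamp(min=1, ...) in _pack_train_data"
    )


# Passes: lengths are 3 and 2.
check_no_empty_segments(np.array([0, 3, 5]))

# Would raise AssertionError: the middle segment has length 0.
# check_no_empty_segments(np.array([0, 3, 3, 5]))
```

If cu_seqlens lives on an accelerator, converting the result to a Python bool likely forces a host synchronization, so whether this counts as "cheap" depends on where the check is placed. The assertion message names the upstream clamp so that the coupling is visible from the consumer side as well.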

Labels

  • area/engine: Issues or PRs related to the engine runtime (workers, TP, rollout, training)
  • kind/cleanup: Categorizes issue or PR as related to cleaning up code, process, or technical debt