Clean up native attention nits (cc threshold comment, duplicate reason, stale docstrings) #84

## Items
- `FLASH_ATTENTION_MIN_CUDA_CAPABILITY = (8, 0)` disables flash on all Turing GPUs (sm_75/T4), even though flash-attn 2.x forward supports some head dims on Turing. Add a comment explaining why the threshold is 8.0 rather than 7.5.
- `flash_attention_unsupported_model_reason` can append "qk head dim ..." twice when a model sets both `head_dim` and `qk_nope_head_dim`/`qk_rope_head_dim`. Dedupe the reason.
- The top-level docstring in `common.py` still mentions converting between "FlashAttention and SDPA" conventions, but `sdpa_window_size` has been removed. Clean up the stale text.
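
For reference, a minimal sketch of how a tuple threshold like this gates Turing. Only the constant name and its value come from this issue; the helper `meets_flash_min_capability` is hypothetical, and the placeholder comment stands in for the rationale this issue asks for.

```python
# Value taken from this issue. The comment that should explain "why 8.0 and
# not 7.5" is exactly what the first item asks to be written.
FLASH_ATTENTION_MIN_CUDA_CAPABILITY = (8, 0)


def meets_flash_min_capability(capability: tuple[int, int]) -> bool:
    """Hypothetical helper: True when (major, minor) is at or above the threshold.

    Python compares tuples lexicographically, so (7, 5) < (8, 0) <= (8, 6).
    """
    return tuple(capability) >= FLASH_ATTENTION_MIN_CUDA_CAPABILITY


if __name__ == "__main__":
    for cc in [(7, 5), (8, 0), (8, 6), (9, 0)]:
        print(cc, meets_flash_min_capability(cc))
```

With the threshold at `(8, 0)`, every sm_75 device fails the check regardless of head dim, which is why the comment should say whether that is intentional.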

## Acceptance
- Comments/docstrings reflect current behavior; reason list deduped. No behavior change.
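
One possible shape for the dedupe, as a sketch only. The function `collect_unsupported_reasons`, the config dict layout, and the `max_head_dim=256` limit are assumptions for illustration, not the actual code in `flash_attention_unsupported_model_reason`. It assumes both code paths produce the same message string, so an order-preserving dedupe is enough.

```python
def collect_unsupported_reasons(config: dict, max_head_dim: int = 256) -> list[str]:
    """Hypothetical stand-in for the reason-building logic described above."""
    reasons: list[str] = []

    # Path 1: explicit head_dim.
    head_dim = config.get("head_dim")
    if head_dim is not None and head_dim > max_head_dim:
        reasons.append(f"qk head dim {head_dim} > {max_head_dim}")

    # Path 2: qk head dim derived from the nope + rope split.
    nope = config.get("qk_nope_head_dim")
    rope = config.get("qk_rope_head_dim")
    if nope is not None and rope is not None:
        qk_head_dim = nope + rope
        if qk_head_dim > max_head_dim:
            reasons.append(f"qk head dim {qk_head_dim} > {max_head_dim}")

    # dict.fromkeys keeps first-seen order and drops exact duplicates.
    return list(dict.fromkeys(reasons))


if __name__ == "__main__":
    cfg = {"head_dim": 320, "qk_nope_head_dim": 256, "qk_rope_head_dim": 64}
    print(collect_unsupported_reasons(cfg))
```

Deduping at the end leaves the set of conditions that trigger a reason unchanged, which fits the "no behavior change" requirement; only the repeated string is dropped.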
