Llama3 GGUF emits incoherent output

Large Llama3 GGUF models can emit incoherent text even when decoding stops normally.

Root cause found in the portable path: Llama3 needed the correct chat prologue and normal RoPE handling, not the Qwen/Phi NEOX RoPE path. The downstream app reproduced this with Meta-Llama-3.1-8B-Instruct-Q4_K_M and verified the fix with a deterministic 123+456 smoke returning 579.

Follow-up: keep the Llama3 prompt/RoPE behavior covered on the pie.app/v1-base-shmem line after PR #388.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Llama3 GGUF emits incoherent output #389

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Llama3 GGUF emits incoherent output #389

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions