
optimizer: gradients should not be stored as SYM_INT32 (compute-format ≠ storage-format) #261

Description

@LeoBuron

Conceptual finding

SYM_INT32 is the framework's compute representation — an int32 mantissa plus one per-tensor float scale — used because integer kernels (matmul / conv / backward) are the only integer-math path we support. It is not a storage format: it occupies the same 4 bytes/element as FLOAT32 but represents a fixed-point approximation (one scale for the whole tensor), so small-magnitude values lose precision that FLOAT32 (per-value exponent) keeps.

Storing a gradient as SYM_INT32 is therefore dominated on both axes:

  • vs FLOAT32: identical footprint, strictly worse fidelity (one scale for the whole tensor, so the error is fixed-point rather than relative per value).
  • vs SYM / ASYM: no memory saving (those formats pack values into sub-byte widths; SYM_INT32 uses a full int32 per element).
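
The fidelity argument above can be checked with a small round trip. This is a minimal sketch, not the framework's code: `symInt32Quantize` and `symInt32Dequantize` are hypothetical helpers, and the convention that a bit budget of `qmaxBits` gives a signed range of ±(2^(qmaxBits-1) - 1) is an assumption. The value 16 used in the usage note is the `ODT_SYM_GRAD_QMAXBITS` cited in this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the framework API): symmetric per-tensor
 * quantization into int32 mantissas. Returns the single tensor scale.
 * Assumes qmax = 2^(qmaxBits-1) - 1. */
float symInt32Quantize(const float *src, int32_t *dst, size_t n, int qmaxBits) {
    assert(n > 0 && qmaxBits >= 2 && qmaxBits <= 31);

    /* One scale for the whole tensor, derived from the largest magnitude. */
    float maxAbs = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = src[i] < 0.0f ? -src[i] : src[i];
        if (a > maxAbs) maxAbs = a;
    }
    int32_t qmax = (int32_t)((1u << (qmaxBits - 1)) - 1u);
    float scale = (maxAbs > 0.0f) ? maxAbs / (float)qmax : 1.0f;

    /* Round half away from zero; values far below `scale` collapse to 0. */
    for (size_t i = 0; i < n; i++) {
        float x = src[i] / scale;
        dst[i] = (int32_t)(x >= 0.0f ? x + 0.5f : x - 0.5f);
    }
    return scale;
}

/* Hypothetical inverse: mantissa * scale. */
void symInt32Dequantize(const int32_t *src, float scale, float *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = (float)src[i] * scale;
    }
}
```

Under those assumptions, quantizing `{1.0f, 1e-3f, 1e-6f}` with `qmaxBits = 16` gives a scale of about 3.05e-5: the middle value lands on mantissa 33 (roughly 0.7% off) and the smallest one rounds to mantissa 0, while a FLOAT32 buffer of the same size would keep all three.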

The only place a gradient legitimately takes SYM_INT32 form is transiently, as an operand wire during backprop (dx/agrad feeding the next layer's integer backward — allocated int12 in initGradTensor, freed after the pass). The persistent parameter gradients (weightGrad/biasGrad) have no such reason.

Current state

  • gradInitSymInt32 (src/userApi/tensor/TensorApi.c:281) stores parameter grads as SYM_INT32 (ODT_SYM_GRAD_QMAXBITS = 16).
  • The SGD SYM_INT32 path (src/optimizer/Sgd.c) already dequantizes grad→float, steps in float, requantizes — i.e. it treats the grad as float internally, so the SYM_INT32 storage buys nothing and adds a lossy round-trip.
  • #203 (optimizer: sgdStepM SYM_INT32 does not requantize the grad back, asymmetry vs sgdStep) is a symptom: the two SYM_INT32 SGD variants disagree on whether to requantize the grad back, precisely because "the grad in SYM_INT32 after a step" has no well-defined meaning.
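
To make the "buys nothing" point concrete, here is a hedged sketch of the two step shapes side by side. It is not the code in src/optimizer/Sgd.c: `SymInt32Grad`, `sgdStepViaDequant` and `sgdStepFloatGrad` are illustrative names, and parameters are kept as plain floats for brevity even though the real path also handles quantized params.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a SYM_INT32 gradient: int32 mantissas plus
 * one per-tensor float scale. */
typedef struct {
    const int32_t *mantissa;
    float scale;
    size_t n;
} SymInt32Grad;

/* Step on a SYM_INT32-stored grad: every element is turned back into a
 * float before it is used, so the arithmetic is float either way. */
void sgdStepViaDequant(float *param, const SymInt32Grad *grad, float lr) {
    assert(param != NULL && grad != NULL);
    for (size_t i = 0; i < grad->n; i++) {
        float g = (float)grad->mantissa[i] * grad->scale; /* dequantize */
        param[i] -= lr * g;                               /* float step */
    }
}

/* Same step when the grad is already stored as FLOAT32: no conversion,
 * and no precision lost before the update. */
void sgdStepFloatGrad(float *param, const float *grad, size_t n, float lr) {
    assert(param != NULL && grad != NULL);
    for (size_t i = 0; i < n; i++) {
        param[i] -= lr * grad[i];
    }
}
```

With a 16-bit range and a tensor whose largest gradient is 1.0, an element of magnitude 1e-6 has mantissa 0, so the first function leaves its parameter untouched while the second still applies the (small) update. Both read 4 bytes per gradient element.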

Proposed direction (to design)

Gradients should be stored as FLOAT32 (same size, better fidelity) or SYM/ASYM (if compression is wanted); the integer math stays a transient SYM_INT32 step. Likely:

  • parameter grads default to FLOAT32 storage (retire / repurpose gradInitSymInt32's SYM_INT32 default);
  • the optimizer consumes float grads directly (no grad dequant/requant), keeping the param-side quant handling;
  • the two-width operand/grad contract (int12 operands / int16 grads) largely dissolves — if terminal grads are FLOAT32, the grad-width question disappears.
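
One possible shape for the hand-off between the transient integer result and a FLOAT32 parameter gradient is sketched below. This is only an illustration for the design pass, not a proposed API: `gradAccumulateFloat` is a hypothetical name, and it assumes the integer backward kernel hands over int32 mantissas with a single per-tensor scale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: fold one transient integer backward result (mantissas plus
 * its per-tensor scale) into a persistent FLOAT32 gradient buffer.
 * The integer tensor can be freed right after this call. */
void gradAccumulateFloat(float *gradF32, const int32_t *mantissa,
                         float scale, size_t n) {
    assert(gradF32 != NULL && mantissa != NULL);
    for (size_t i = 0; i < n; i++) {
        gradF32[i] += (float)mantissa[i] * scale;
    }
}
```

Because the accumulator is float, two contributions with very different scales (for example 1.0/32767 and 1e-6/32767) can be summed without first agreeing on a common fixed-point scale, which is the bookkeeping a SYM_INT32-stored gradient would need.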

Open design questions: bit-width for SYM/ASYM grad storage if compression is chosen; whether param storage is affected (separate concern — params are forward operands); feasibility of an integer optimizer step. To be resolved in a dedicated design pass.

Subsumes #203.

Relations

Part of the SYM completeness program #210; framework hardening under #137. Related: #221 (backwardQ not honored by grad-tensor allocation), #203 (symptom).

🤖 Generated with Claude Code


Labels: enhancement (New feature or request)
