Improve codebook utilization for FSQ #10

@xrhan

Hi FlexTok authors,

Thanks a lot for the great work and for releasing the codebase.
I found that training directly with FSQ often leads to an underutilized codebook.
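
To make "underutilized" concrete, here is a minimal sketch of how codebook usage could be measured. It assumes a simplified tanh-and-round FSQ quantizer, not the FlexTok implementation; the function names `fsq_indices` and `codebook_utilization` and the `levels` values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fsq_indices(z, levels):
    """Map latents of shape (N, d) to flat FSQ code indices of shape (N,).

    Simplified quantizer (an assumption, not FlexTok's code): each channel is
    squashed with tanh and rounded to one of levels[i] integer values.
    """
    levels = np.asarray(levels, dtype=np.int64)
    # Squash each channel into [0, L_i - 1] and round to the nearest level.
    digits = np.round((np.tanh(z) + 1.0) / 2.0 * (levels - 1)).astype(np.int64)
    # Mixed-radix basis combines the per-channel digits into a single index.
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))
    return digits @ basis

def codebook_utilization(indices, levels):
    """Fraction of the implicit codebook (prod(levels) entries) seen in `indices`."""
    return np.unique(indices).size / int(np.prod(levels))

# Example: utilization of random Gaussian latents (hypothetical levels).
levels = [8, 5, 5, 5]
rng = np.random.default_rng(0)
z = rng.normal(size=(10_000, len(levels)))
print(codebook_utilization(fsq_indices(z, levels), levels))
```

A value well below 1.0 on a large batch of encoder outputs would indicate that many of the implicit codes are never selected.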

Did you find any augmentation or normalization techniques necessary during training to improve codebook usage and convergence?
For instance, what values did you use for the following regularization/augmentation hyperparameters?

drop_quant_p: float,
corrupt_tokens_p: float,
min_corrupt_tokens_p: Optional[float],
apply_corrupt_tokens_p: float,

Also, does the model require REPA-style regularization during training?

Thank you.
