How to run eval.py in real quant mode?

Hi,

I am trying to run the quantized model in real quant mode (as opposed to fake quant). Is the `load_quantized_model` function from `quantize.int_linear_real` the correct way to load the model? I have not been able to get this function to run successfully. Also, if the accuracy reported by eval.py is measured with fake quant (running in FP16), can we expect the same accuracy when running the real W8A8 quantized model?
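
To make the second question concrete, here is a minimal NumPy sketch of what I mean by the two modes. It is not this repository's code: `quantize_sym`, the symmetric per-token and per-output-channel scales, and the tensor shapes are assumptions chosen only for illustration. Fake quant rounds values to the INT8 grid but multiplies in floating point, while real quant multiplies the INT8 values with INT32 accumulation and rescales afterwards.

```python
import numpy as np

def quantize_sym(x, axis=None, bits=8):
    """Symmetric quantization (hypothetical helper, not from the repo)."""
    qmax = 2 ** (bits - 1) - 1                      # 127 for 8 bits
    scale = np.max(np.abs(x), axis=axis, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)        # avoid division by zero
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64)).astype(np.float32)   # activations (tokens x in_features)
w = rng.standard_normal((32, 64)).astype(np.float32)  # weights (out_features x in_features)

xq, sx = quantize_sym(x, axis=1)   # per-token activation scales, shape (4, 1)
wq, sw = quantize_sym(w, axis=1)   # per-output-channel weight scales, shape (32, 1)

# Fake quant: dequantize back to float, then do an ordinary float matmul.
y_fake = (xq.astype(np.float32) * sx) @ (wq.astype(np.float32) * sw).T

# Real quant: integer matmul with int32 accumulation, then one rescale.
acc = xq.astype(np.int32) @ wq.astype(np.int32).T
y_real = acc.astype(np.float32) * sx * sw.T

max_diff = np.max(np.abs(y_fake - y_real))
print("max |fake - real| =", max_diff)
```

In this sketch both paths use the same integers and the same scales, so they should differ only by floating-point rounding. Whether that also holds for the real kernels here (for example with FP16 accumulation in the fake path, or a different activation quantization scheme in the real path) is exactly what I would like to confirm.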

