[FEATURE] Export to GGUF and llama.cpp #30

@vkkhare

Describe the feature request
Add support for llama.cpp inference and benchmarking.

Describe the solution you'd like

  • Update modelling_llama_skip.py to support exporting to GGUF
  • Add a llama.cpp inference path and dispatch to it when running sparse transformers GGUF models
  • Update run_benchmark.py to support llama.cpp
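
As a starting point for the run_benchmark.py item, here is a minimal sketch of how a backend switch might look. Everything in it is an assumption for discussion, not the current script: the `--backend`, `--model`, `--prompt-tokens` and `--gen-tokens` flags, the function names, and the choice to shell out to llama.cpp's `llama-bench` tool (using its `-m`, `-p`, `-n` and `-o json` options). A build of llama.cpp that understands the sparse transformers GGUF may need a different binary or extra flags.

```python
import argparse
import shutil
import subprocess
from typing import List, Optional

# Hypothetical backend names; the existing script may organise this differently.
BACKENDS = ("pytorch", "llama.cpp")


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the (assumed) command-line options for the benchmark script."""
    parser = argparse.ArgumentParser(description="Benchmark with a selectable backend")
    parser.add_argument("--backend", choices=BACKENDS, default="pytorch")
    parser.add_argument("--model", required=True, help="model id, or path to a .gguf file")
    parser.add_argument("--prompt-tokens", type=int, default=512)
    parser.add_argument("--gen-tokens", type=int, default=128)
    return parser.parse_args(argv)


def llama_bench_command(model_path: str, prompt_tokens: int, gen_tokens: int,
                        binary: str = "llama-bench") -> List[str]:
    """Build a llama-bench invocation that reports results as JSON."""
    return [binary, "-m", model_path,
            "-p", str(prompt_tokens), "-n", str(gen_tokens), "-o", "json"]


def run(args: argparse.Namespace) -> str:
    """Dispatch to the selected backend and return its raw output."""
    if args.backend == "llama.cpp":
        if not args.model.endswith(".gguf"):
            raise ValueError("the llama.cpp backend expects a .gguf model file")
        cmd = llama_bench_command(args.model, args.prompt_tokens, args.gen_tokens)
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return result.stdout
    # Placeholder: the existing PyTorch benchmark path would be called here.
    raise NotImplementedError("PyTorch path omitted from this sketch")
```

With this shape, `run(parse_args(["--backend", "llama.cpp", "--model", "model.gguf"]))` would invoke `llama-bench` on the exported file, while the default backend keeps the current behaviour. Shelling out keeps the Python side free of llama.cpp bindings, at the cost of parsing the tool's JSON output.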

Metadata

  • Labels: enhancement
  • Status: Planning