feat: support QMD_LLAMA_RELEASE env var to use latest llama.cpp #658
Open
wzgrx wants to merge 1 commit into
Allows users to override the default pinned llama.cpp version (b8390) by setting the QMD_LLAMA_RELEASE environment variable.

Example: QMD_LLAMA_RELEASE=master qmd status

This forces QMD to update node-llama-cpp's configuration to download and build the specified release (e.g. master branch) on the next run.

Fixes: #0
Problem

QMD currently relies on node-llama-cpp's default behavior, which pins the llama.cpp version to an older release (e.g., b8390). This prevents users from accessing the latest features, model support, and performance improvements available in the master branch of llama.cpp.

Solution

This PR introduces the QMD_LLAMA_RELEASE environment variable.

When set (e.g., to master), QMD automatically patches node-llama-cpp's internal configuration (binariesGithubRelease.json) before initializing the LLM engine. This forces node-llama-cpp to download and build the specified llama.cpp release.

Usage

QMD_LLAMA_RELEASE=master qmd status

Changes

src/utils/llama-version.ts: New utility function patchLlamaReleaseIfNeeded that handles the patching of node-llama-cpp's release config.
src/llm.ts: Invokes the patch function before getLlama is called.
CLAUDE.md: Documentation update for the new environment variable.

This allows power users to opt in to the latest engine without waiting for a formal QMD release, while preserving the stable default behavior for everyone else.
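
For reviewers who want a concrete picture of the patch step described above, here is a minimal sketch. It is not the code from this PR. It assumes binariesGithubRelease.json is a small JSON object with a `release` field, and the signature shown (explicit `configPath` and `env` parameters, a boolean return value) is hypothetical and chosen so the function can be exercised against a temporary file. The sketch only rewrites the file; it does not trigger the download or build that node-llama-cpp performs afterwards.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed shape of node-llama-cpp's binariesGithubRelease.json (not confirmed by this PR).
interface ReleaseConfig {
  release: string;
}

// Hypothetical sketch: rewrite the pinned release when QMD_LLAMA_RELEASE is set.
// Returns true only if the file on disk was changed.
function patchLlamaReleaseIfNeeded(
  configPath: string,
  env: Record<string, string | undefined> = process.env,
): boolean {
  const wanted = env.QMD_LLAMA_RELEASE?.trim();
  if (!wanted) return false; // variable unset or empty: keep the pinned default

  if (!fs.existsSync(configPath)) return false; // nothing to patch

  const current = JSON.parse(
    fs.readFileSync(configPath, "utf8"),
  ) as Partial<ReleaseConfig>;
  if (current.release === wanted) return false; // already pointing at the requested release

  // Preserve any other fields and replace only the release tag.
  const patched = { ...current, release: wanted };
  fs.writeFileSync(configPath, JSON.stringify(patched, null, 2) + "\n");
  return true;
}

// Demo against a throwaway file, so no real installation is touched.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "qmd-llama-"));
const demoPath = path.join(demoDir, "binariesGithubRelease.json");
fs.writeFileSync(demoPath, JSON.stringify({ release: "b8390" }));
const changed = patchLlamaReleaseIfNeeded(demoPath, { QMD_LLAMA_RELEASE: "master" });
console.log(changed, fs.readFileSync(demoPath, "utf8"));
```

Returning early when the variable is unset keeps the default path free of file writes, which matches the stated goal of preserving the stable behavior for users who do not opt in.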