docs(prepare_model_for_kbit_training): note the ~0.5–1 GB CUDA reserved overhead (#3265) #3267
Open
Anai-Guo wants to merge 1 commit into
Closes #3265.
Why
#3265 reports that `prepare_model_for_kbit_training` adds ~1 GB of CUDA reserved memory in ~500 ms for 7B-class models, on top of the loaded weights, and that this overhead is not documented anywhere. On 8 GB unified-memory devices or consumer accelerators (Jetson Orin Nano 8 GB, Apple Silicon, RTX 4060 8 GB) this is the difference between a recipe that fits and one that OOMs.

The issue author proposed three fixes (docs / new kwarg / lean variant). This PR takes Fix 1 (docs only, easy) so users can at least account for the overhead. The new-kwarg and lean-variant options can land separately if maintainers want them; happy to follow up.
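For anyone who wants to check the figure from #3265 on their own device, here is a minimal sketch of one way to measure it. It is not part of this PR. The helper names `measure_reserved_delta` and `_cuda_reserved_bytes` are hypothetical; the only library calls assumed are `torch.cuda.synchronize()` and `torch.cuda.memory_reserved()`.

```python
import time
from typing import Callable, Tuple


def _cuda_reserved_bytes() -> int:
    """Bytes currently reserved by PyTorch's CUDA caching allocator."""
    import torch  # lazy import: the helper below works without torch if a custom reader is passed

    torch.cuda.synchronize()  # wait for pending kernels so the reading is stable
    return torch.cuda.memory_reserved()


def measure_reserved_delta(
    step: Callable[[], object],
    reserved_bytes: Callable[[], int] = _cuda_reserved_bytes,
) -> Tuple[object, int, float]:
    """Run `step` and return (its result, growth in reserved bytes, elapsed seconds).

    Hypothetical helper for illustration only; it is not a PEFT API.
    """
    before = reserved_bytes()
    start = time.perf_counter()
    result = step()
    after = reserved_bytes()
    elapsed = time.perf_counter() - start
    return result, after - before, elapsed


# Usage sketch (assumes a quantized `model` is already loaded on a CUDA device):
#
#   from peft import prepare_model_for_kbit_training
#   model, delta, secs = measure_reserved_delta(
#       lambda: prepare_model_for_kbit_training(model)
#   )
#   print(f"reserved grew by {delta / 2**30:.2f} GiB in {secs * 1000:.0f} ms")
```

The memory reader is passed in as a parameter so the helper can be exercised without a GPU. The number it reports depends on the model, dtype, and allocator state, so it may differ from the ~1 GB in the issue.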
What
Adds a `<Tip warning={true}>` block to the `prepare_model_for_kbit_training` docstring noting the ~0.5–1 GB CUDA reserved overhead.

No behavior change.
Test plan
`make docs` / nbsphinx renders the `<Tip warning>` block correctly.

🤖 Generated with Claude Code