Fix: dedupe model/quantization types, buffer downloads, GGUF offsets and device scanner isolation #4

Open
NightVibes33 wants to merge 8 commits into main from codex/fix-critical-issues-in-modelquantizer

@NightVibes33
Owner

Motivation

  • Resolve duplicate type declarations and conflicting QuantizationError enums that caused build errors and runtime mismatches.
  • Prevent out-of-memory failures and slow downloads by replacing per-byte writes with buffered writes, and ensure destination files exist before a FileHandle is opened on them.
  • Make GGUF building and quantization safer and more correct (stable tensor offsets, correct block-size math, correct Q4 nibble encoding), and fail explicitly instead of silently producing incorrect 5-bit fallbacks.
  • Harden device detection and scanning concurrency to avoid unsafe @unchecked Sendable usage and potential reentrancy issues.

Description

  • Centralized model/domain types and errors into ModelTypes.swift and removed duplicate declarations from ModelQuantizer.swift, keeping ModelQuantizer focused on orchestration.
  • Buffered streamed downloads in ModelQuantizer.downloadModel and HuggingFaceAPI.downloadModelFile with a 64KB buffer and ensured destination files are created before opening FileHandle.
  • Fixed the Hugging Face fallback to use the API file-tree endpoint (/api/models/{id}/tree/main) and added a private getAuthToken() accessor to the HF API extension.
  • Stabilized GGUFBuilder offsets by writing tensor-info into a temporary buffer before emitting tensor data and removed duplicate integer-to-data extensions.
  • Improved QuantizationEngine behavior by inferring the architecture from the model_type field in config.json, clamping Q4 quantized values to [-8, 7] and then adding 8 so each is stored as an unsigned nibble in [0, 15], computing block sizes with a ceiling division (i.e. (numElements+31)/32 * elementSize), and changing q5_0/q5_1 to throw an explicit error until they are correctly implemented.
  • Hardened DeviceScanner by making it @MainActor final (removed @unchecked Sendable), replacing Timer with a cancellable Task loop, adding deviceIdentifier to the profile, and adding simulator passthrough for model identifier parsing.
  • Replaced iphone.gen3 SF symbol usage with the safer iphone fallback in views for broader compatibility.
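
The Q4 clamping and block-size arithmetic described in the list above can be illustrated with a minimal sketch. The function names, the rounding mode, the zero-level fallback, and the low-nibble-first packing order are all assumptions made for illustration; they are not taken from the PR's QuantizationEngine.

```swift
// Hypothetical helpers; names and packing order are illustrative only.

/// Number of blocks needed to hold `numElements`, rounded up
/// (equivalent to (numElements + 31) / 32 for the default block size).
func blockCount(numElements: Int, blockSize: Int = 32) -> Int {
    (numElements + blockSize - 1) / blockSize
}

/// Total bytes for a quantized tensor, given the bytes used per block.
func quantizedByteCount(numElements: Int, bytesPerBlock: Int) -> Int {
    blockCount(numElements: numElements) * bytesPerBlock
}

/// Quantizes one value to a signed 4-bit level, clamps it to [-8, 7],
/// then adds 8 so it is stored as an unsigned nibble in [0, 15].
func q4Nibble(_ value: Float, scale: Float) -> UInt8 {
    let ratio = value / scale
    // A non-finite ratio (for example from a zero scale) falls back to level 0.
    guard ratio.isFinite else { return 8 }
    let clamped = min(max(ratio.rounded(), -8), 7)
    return UInt8(clamped + 8)
}

/// Packs two nibbles into one byte, low nibble first (an assumed layout).
func packNibbles(low: UInt8, high: UInt8) -> UInt8 {
    (low & 0x0F) | ((high & 0x0F) << 4)
}
```

Without the clamp, a value that rounds to +8 or beyond would overflow the nibble after the offset is added; without the ceiling division, a tensor whose element count is not a multiple of 32 would be allocated one block too few.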

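The buffered-write change in the description can be sketched roughly as follows. This is a simplified, synchronous stand-in: writeBuffered is a hypothetical helper, and a plain byte sequence takes the place of the streamed download, so it does not reproduce the actual code in ModelQuantizer.downloadModel or HuggingFaceAPI.downloadModelFile.

```swift
import Foundation

/// Writes `bytes` to `url` in chunks of `bufferSize` instead of one byte at a time.
/// Hypothetical helper for illustration; not the PR's implementation.
func writeBuffered<S: Sequence>(
    _ bytes: S,
    to url: URL,
    bufferSize: Int = 64 * 1024
) throws where S.Element == UInt8 {
    // FileHandle(forWritingTo:) fails if the file is missing, so create it first.
    // createFile(atPath:contents:) also replaces any existing file at that path.
    guard FileManager.default.createFile(atPath: url.path, contents: nil) else {
        throw CocoaError(.fileWriteUnknown)
    }
    let handle = try FileHandle(forWritingTo: url)
    defer { try? handle.close() }

    var buffer = Data()
    buffer.reserveCapacity(bufferSize)
    for byte in bytes {
        buffer.append(byte)
        // Flush only when the buffer is full, so writes happen per 64 KB chunk.
        if buffer.count >= bufferSize {
            try handle.write(contentsOf: buffer)
            buffer.removeAll(keepingCapacity: true)
        }
    }
    // Flush whatever remains after the last full chunk.
    if !buffer.isEmpty {
        try handle.write(contentsOf: buffer)
    }
}
```

Memory use stays bounded by the buffer size regardless of how large the file is, and the number of write calls drops from one per byte to one per chunk.
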
Testing

  • Ran git diff --check to look for whitespace errors; it reported none.
  • Confirmed swift --version is available in this environment but an iOS build could not be executed because xcodebuild is not present (xcodebuild: command not found).
  • No automated unit/integration tests were present or run in this environment, so functional build and runtime verification should be performed in an Xcode environment (iOS device/simulator) before release.
