MLX-powered Qwen3 embedding server for Apple Silicon Macs. Features 0.6B/4B/8B models, 44K tokens/sec throughput, a REST API, batch processing, model hot-swapping, and more.
Updated Aug 9, 2025 - Python
MCP server that gives AI agents a semantic search tool. Builds an AST, call graphs, type graphs, and hybrid semantic search, so your agent queries structured indexes instead of dumping files into context. 7 languages. Zero config. One-command setup.
Semantic duplicate bug-report detection running fully in the browser — Qwen3 embeddings (recall) + Qwen3 reranker (precision) on WebGPU via Transformers.js
AI-Powered Ethereum Intelligence API
Lightweight, fast, secure, and free document chat system powered by Qwen AI and TurboVec search.
Local RAG over a pizza reviews file, implemented with Llama 3.2 and a Qwen3 embedding model.
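High-performance text vectorization service built on the Qwen3-Embedding model, exposing a RESTful API through FastAPI and supporting ONNX model conversion and inference acceleration.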
Advanced n8n node (tool or embedding) for Qwen3 embeddings. Fits all model sizes (tested with 0.6B). Install it as a community node into n8n from npm (n8n-nodes-qwen-embedding).