syalia-srl/embed-server

embed-server

Tiny OpenAI-compatible /v1/embeddings HTTP API, backed by fastembed. Drop-in for any client that already speaks the OpenAI embeddings API; runs the model locally with no per-call cost.

Install + run

uv sync
uv run embed-server   # listens on 127.0.0.1:8001

Environment variables: EMBED_HOST (default 127.0.0.1), EMBED_PORT (default 8001).
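A client can derive its base URL from the same two variables, so the server and its callers stay in sync. The sketch below is a minimal illustration, assuming the client process sees the same EMBED_HOST and EMBED_PORT values as the server; the helper name `embed_base_url` is hypothetical and not part of embed-server.

```python
import os
from typing import Mapping, Optional


def embed_base_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build the /v1 base URL from EMBED_HOST / EMBED_PORT.

    Hypothetical helper, not part of embed-server. Falls back to the
    documented defaults (127.0.0.1 and 8001) when a variable is unset.
    """
    env = os.environ if env is None else env
    host = env.get("EMBED_HOST", "127.0.0.1")
    port = int(env.get("EMBED_PORT", "8001"))  # raises ValueError on a non-numeric port
    return f"http://{host}:{port}/v1"


if __name__ == "__main__":
    print(embed_base_url())
```

The returned string can be passed as `base_url` to any OpenAI-style client.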

Use it

import openai
client = openai.OpenAI(base_url="http://127.0.0.1:8001/v1", api_key="local")
r = client.embeddings.create(model="jinaai/jina-embeddings-v2-base-es", input="hola")
print(len(r.data[0].embedding))  # 768

Or with curl:

curl -s http://127.0.0.1:8001/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"model":"jinaai/jina-embeddings-v2-base-es","input":"hola"}' | jq '.data[0].embedding | length'

Models are loaded on first request and cached in memory. The first call to a model downloads it into ~/.cache/fastembed, so expect that request to be slow; later calls reuse the in-memory model and are fast.
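The load-once behaviour can be pictured as a small in-memory cache keyed by model name. This is an illustrative sketch only, not the server's actual code: `ModelCache` and the `loader` callable are hypothetical names, and the loader stands in for whatever constructs the fastembed model.

```python
from threading import Lock
from typing import Any, Callable, Dict, List


class ModelCache:
    """Load each model at most once and keep it in memory (hypothetical sketch)."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader  # stand-in for the real model constructor
        self._models: Dict[str, Any] = {}
        self._lock = Lock()  # prevents two threads from loading the same model twice

    def get(self, name: str) -> Any:
        with self._lock:
            if name not in self._models:
                # First request for this model: the slow path (download + load).
                self._models[name] = self._loader(name)
            return self._models[name]

    def loaded(self) -> List[str]:
        """Names of the models currently held in memory, sorted."""
        with self._lock:
            return sorted(self._models)


if __name__ == "__main__":
    calls: List[str] = []
    cache = ModelCache(lambda name: calls.append(name) or f"<model {name}>")
    cache.get("jinaai/jina-embeddings-v2-base-es")
    cache.get("jinaai/jina-embeddings-v2-base-es")
    print(calls)  # the loader ran only once for the repeated name
```

A single lock keeps the sketch simple, at the cost of serialising loads of different models; a per-model lock would be one way to relax that.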

See the fastembed supported models list for valid model values.

Endpoints

| Method | Path | Notes |
|--------|------|-------|
| POST | /v1/embeddings | OpenAI-compatible request/response |
| GET | /health | {"status":"ok","models_loaded":[...],"ts":...} |
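For monitoring or a readiness probe, the /health response can be checked with the standard library alone. The sketch below assumes the payload has the `status` and `models_loaded` fields shown above; the function names are hypothetical, and since the type of `ts` is not documented here, it is left unchecked.

```python
import json
import urllib.request
from typing import Any, Dict


def parse_health(body: str) -> Dict[str, Any]:
    """Validate a /health payload and return it as a dict."""
    data = json.loads(body)
    if not isinstance(data, dict) or data.get("status") != "ok":
        raise RuntimeError(f"unhealthy response: {data!r}")
    if not isinstance(data.get("models_loaded"), list):
        raise RuntimeError("models_loaded is missing or not a list")
    return data


def check_health(base: str = "http://127.0.0.1:8001", timeout: float = 2.0) -> Dict[str, Any]:
    """GET /health from a running server; raises on connection errors."""
    with urllib.request.urlopen(f"{base}/health", timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

With the server running, `check_health()["models_loaded"]` lists the models that have already been requested at least once.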

Deploy as a systemd service

See deploy/embed-server.service. Install:

sudo cp deploy/embed-server.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now embed-server

Edit WorkingDirectory, User, and Environment=PATH in the unit to match the host. The unit assumes that uv is on PATH and that the repo is checked out at WorkingDirectory.

License

MIT
