ivanvmoreno/open-translate

🌍 open-translate


A high-performance, self-hostable translation API compatible with Google Cloud Translate. Built on Meta's NLLB-200 and optimized with DeepSpeed for efficient GPU inference.


✨ Why?

This project provides a robust, private, and cost-effective alternative to commercial translation APIs.

  • 💰 Cost Efficiency: Run on your own GPU infrastructure. Ideal for high-volume translation tasks.
  • 🔒 Data Privacy: No external API calls mean your content never leaves your control.
  • 🔄 Drop-in Compatibility: Implements the standard POST /language/translate/v2 API surface. Switch existing applications simply by changing the base URL.
  • 🌍 Advanced Models: Leverages Meta's NLLB-200 (No Language Left Behind), supporting 200+ languages.
  • 🚀 High Performance: Optimized for throughput with DeepSpeed and Tensor Parallelism, capable of handling heavy concurrent loads.

⚑ Drop-in Replacement

Designed to work with existing Google Cloud Translate client libraries and integrations.

Before: https://translation.googleapis.com/language/translate/v2

After: http://localhost:8000/language/translate/v2
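In client code, the switch usually comes down to making the base URL configurable. The sketch below illustrates the idea; the helper `translate_endpoint` and the environment variable name `TRANSLATE_BASE_URL` are hypothetical names chosen for this example, not part of this project or of Google's client libraries.

```python
import os

GOOGLE_BASE_URL = "https://translation.googleapis.com"
V2_PATH = "/language/translate/v2"


def translate_endpoint(base_url: str = GOOGLE_BASE_URL) -> str:
    """Join a base URL with the v2 translate path, tolerating a trailing slash."""
    return base_url.rstrip("/") + V2_PATH


# TRANSLATE_BASE_URL is a hypothetical variable name used only in this example.
endpoint = translate_endpoint(
    os.environ.get("TRANSLATE_BASE_URL", "http://localhost:8000")
)
```

With this shape, moving between the hosted API and a self-hosted instance is a deployment setting rather than a code change.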


🚀 Quick Start

🐳 Run with Docker

This command launches the API on port 8000 using the 600M distilled model.

docker pull ivanvmoreno/open-translate:latest
docker run --gpus all -p 8000:8000 \
  -e NLLB_MODEL_SIZE=600M \
  -e DTYPE=fp16 \
  ivanvmoreno/open-translate:latest

Note: The first run downloads the model weights, which may take some time depending on your internet speed.


🛠️ API Reference

Compatible with Google Cloud Translation API v2.

Translate Text

POST /language/translate/v2

Single Translation:

curl -X POST "http://localhost:8000/language/translate/v2" \
  -H "Content-Type: application/json" \
  -d '{
    "q": "Hello world!",
    "target": "es"
  }'
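The same request can be issued from Python using only the standard library. This is a minimal sketch: `build_translate_request` is a hypothetical helper name, and the example only builds the request so it can be inspected without a running server.

```python
import json
import urllib.request


def build_translate_request(base_url, q, target, source=None):
    """Build (but do not send) a POST request for /language/translate/v2."""
    payload = {"q": q, "target": target}
    if source is not None:
        payload["source"] = source
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/language/translate/v2",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_translate_request("http://localhost:8000", "Hello world!", "es")

# To send it against a running server:
#   with urllib.request.urlopen(request) as response:
#       body = json.load(response)
```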

Batch Translation: Send arrays of strings to maximize GPU throughput.

curl -X POST "http://localhost:8000/language/translate/v2" \
  -H "Content-Type: application/json" \
  -d '{
    "q": ["Hello world!", "Self hosting rulez"],
    "target": "fr",
    "source": "en",
    "max_new_tokens": 128
  }'
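For very large inputs, a client may want to split the list into several requests. The sketch below assumes a chunk size of 32, chosen to match the documented default of MAX_BATCH_SIZE; whether the server splits oversized requests on its own is not stated here, so treat the client-side chunking as a precaution. The helper name `chunked` is hypothetical.

```python
from typing import Iterator, List


def chunked(texts: List[str], size: int = 32) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(texts), size):
        yield texts[start:start + size]


sentences = [f"Sentence {i}" for i in range(70)]
# Each batch would become the "q" array of one translate request.
batch_sizes = [len(batch) for batch in chunked(sentences)]
```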

Language Detection

POST /language/translate/v2/detect

curl -X POST "http://localhost:8000/language/translate/v2/detect" \
  -H "Content-Type: application/json" \
  -d '{"q": "Hola mundo"}'
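Reading the result might look like the sketch below. It assumes the server mirrors the response shape of Google's v2 detect endpoint (a `data.detections` list holding one inner list of candidates per input string); that is an assumption worth checking against a live response. The sample body and the helper `top_languages` are made up for illustration.

```python
import json

# Sample body in the shape Google's v2 detect endpoint uses; the values are invented.
sample_body = '{"data": {"detections": [[{"language": "es", "confidence": 0.98}]]}}'


def top_languages(body: str):
    """Return the first candidate language code for each input string."""
    detections = json.loads(body)["data"]["detections"]
    return [candidates[0]["language"] for candidates in detections]


languages = top_languages(sample_body)
```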

List Supported Languages

GET /language/translate/v2/languages

curl "http://localhost:8000/language/translate/v2/languages"

⚙️ Configuration

| Variable | Default | Description |
| --- | --- | --- |
| NLLB_MODEL_SIZE | 1.3B-distilled | Model size: 600M, 600M-distilled, 1.3B, 1.3B-distilled, or 3.3B |
| NLLB_MODEL_ID | (None) | HF model override |
| TP_SIZE | auto | Tensor Parallel size |
| DTYPE | fp16 | fp16, bf16, or fp32 |
| MAX_BATCH_SIZE | 32 | Max sentences processed in parallel |
| HOST | 0.0.0.0 | Bind host |
| PORT | 8000 | Bind port |
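To illustrate how these variables and defaults combine, here is a rough sketch of an environment loader. It is not the project's actual startup code: `load_settings` is a hypothetical helper, and NLLB_MODEL_ID is left out because its default is unset.

```python
import os

# Defaults copied from the configuration table above.
DEFAULTS = {
    "NLLB_MODEL_SIZE": "1.3B-distilled",
    "TP_SIZE": "auto",
    "DTYPE": "fp16",
    "MAX_BATCH_SIZE": "32",
    "HOST": "0.0.0.0",
    "PORT": "8000",
}
VALID_DTYPES = {"fp16", "bf16", "fp32"}


def load_settings(env=None):
    """Merge environment values over the documented defaults and validate them."""
    env = os.environ if env is None else env
    settings = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    if settings["DTYPE"] not in VALID_DTYPES:
        raise ValueError(f"DTYPE must be one of {sorted(VALID_DTYPES)}")
    # Numeric settings arrive as strings from the environment.
    settings["MAX_BATCH_SIZE"] = int(settings["MAX_BATCH_SIZE"])
    settings["PORT"] = int(settings["PORT"])
    return settings


# Equivalent to passing -e NLLB_MODEL_SIZE=600M -e DTYPE=bf16 to docker run.
settings = load_settings({"NLLB_MODEL_SIZE": "600M", "DTYPE": "bf16"})
```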

🌐 Language Codes

The API accepts standard ISO 639-1 codes (e.g., es, en) and BCP-47 tags (e.g., zh-TW, pt-BR), and maps them automatically to NLLB's internal language codes.

For a full list of over 200 supported languages and their codes, see LANGUAGES.md.
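As a rough illustration of what such a mapping involves, the sketch below resolves a handful of tags to NLLB's FLORES-200 style codes. The table is a small illustrative subset and the fallback rule (exact match, then primary subtag) is an assumption; the project's real mapping may differ. `to_nllb` is a hypothetical helper name.

```python
# Illustrative subset only; see LANGUAGES.md for the real list.
NLLB_CODES = {
    "en": "eng_Latn",
    "es": "spa_Latn",
    "fr": "fra_Latn",
    "pt": "por_Latn",
    "zh": "zho_Hans",
    "zh-tw": "zho_Hant",
}


def to_nllb(code: str) -> str:
    """Resolve a tag: exact match first, then fall back to the primary subtag."""
    normalized = code.strip().replace("_", "-").lower()
    if normalized in NLLB_CODES:
        return NLLB_CODES[normalized]
    primary = normalized.split("-")[0]
    if primary in NLLB_CODES:
        return NLLB_CODES[primary]
    raise KeyError(f"unsupported language code: {code}")
```

Here `zh-TW` hits the exact entry for Traditional Chinese, while `pt-BR` has no entry of its own and falls back to `pt`.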


💾 VRAM Requirement Guide

| Model Size | FP16 / BF16 | FP32 |
| --- | --- | --- |
| 600M / 600M-distilled | ~3 GB | ~5 GB |
| 1.3B / 1.3B-distilled | ~5 GB | ~9 GB |
| 3.3B | ~9 GB | ~15 GB |
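The table can be turned into a quick sizing check. The sketch below picks the largest model whose estimate fits a given amount of VRAM; the figures are the approximate values from the table, the distilled variants share a row with their base size, and no extra headroom is reserved. `largest_model_that_fits` is a hypothetical helper.

```python
# Approximate VRAM needs in GB, copied from the table above (smallest to largest).
VRAM_GB = {
    "600M": {"fp16": 3, "bf16": 3, "fp32": 5},
    "1.3B": {"fp16": 5, "bf16": 5, "fp32": 9},
    "3.3B": {"fp16": 9, "bf16": 9, "fp32": 15},
}


def largest_model_that_fits(available_gb: float, dtype: str = "fp16"):
    """Return the largest model size whose estimate fits, or None if none do."""
    fitting = [size for size, need in VRAM_GB.items() if need[dtype] <= available_gb]
    # Dict order runs from smallest to largest model, so the last match is the biggest.
    return fitting[-1] if fitting else None
```

For example, an 8 GB card fits the 1.3B model in fp16 but only the 600M model in fp32.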
