The easiest way to create a server for running inference on MLX models.
For documentation and guides, visit mlxserver.com.
View all of the supported models here.
Install mlxserver via pip to get started. This also installs mlx.
pip install mlxserver
To install from PyPI you must meet the following requirements:
- Using an M series chip (Apple silicon)
- Using a native Python >= 3.8
- macOS >= 13.5
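The requirements above can be checked before installing. The following is a minimal sketch using only the Python standard library. The helper names (`parse_version`, `check_requirements`, `check_this_machine`) are hypothetical and not part of mlxserver, and treating an `arm64` machine type on Darwin as "Apple silicon with a native Python" is an assumption.

```python
import platform
import sys


def parse_version(text):
    """Turn a dotted version such as '13.5.1' into (13, 5, 1); '' becomes ()."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_requirements(system, machine, python_version, macos_version):
    """Return a list of unmet requirements (an empty list means all are met)."""
    problems = []
    # Assumption: a native Python on Apple silicon reports Darwin/arm64,
    # while an x86_64 Python running under Rosetta reports x86_64.
    if system != "Darwin" or machine != "arm64":
        problems.append("needs Apple silicon with a native (arm64) Python")
    if tuple(python_version[:2]) < (3, 8):
        problems.append("needs Python >= 3.8")
    if parse_version(macos_version) < (13, 5):
        problems.append("needs macOS >= 13.5")
    return problems


def check_this_machine():
    """Run the checks against the interpreter and OS executing this code."""
    return check_requirements(
        platform.system(),
        platform.machine(),
        sys.version_info,
        platform.mac_ver()[0],  # empty string on non-macOS systems
    )
```

Calling `check_this_machine()` returns an empty list when all three checks pass, and otherwise a short description of each unmet requirement.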
The following is an example of using Nous Hermes 2 Mistral 7B DPO (4-bit) for generating text:
Python
from mlxserver import MLXServer
server = MLXServer(model="mlx-community/Nous-Hermes-2-Mistral-7B-DPO-4bit-MLX")
Curl
curl -X GET 'http://127.0.0.1:5000/generate?prompt=write%20me%20a%20poem%20about%20the%20ocean&stream=true'
This library only runs on Apple Metal, because the underlying MLX library focuses on Apple Metal acceleration.
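The curl request above can also be issued from Python. The sketch below uses only the standard library and assumes the server is running locally on the host and port shown in the curl example. The helper names (`build_generate_url`, `generate`) are hypothetical and not part of mlxserver. The response format is not described here, so the body is returned as raw text.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Host and port taken from the curl example; adjust if your server differs.
BASE_URL = "http://127.0.0.1:5000"


def build_generate_url(prompt, stream=True, base_url=BASE_URL):
    """Return a /generate URL with the prompt percent-encoded (spaces as %20)."""
    query = urlencode(
        {"prompt": prompt, "stream": str(stream).lower()},
        quote_via=quote,
    )
    return f"{base_url}/generate?{query}"


def generate(prompt, stream=True):
    """Send the GET request and return the raw response body as text.

    Assumes an MLXServer instance is already running locally.
    """
    with urlopen(build_generate_url(prompt, stream)) as response:
        return response.read().decode("utf-8")
```

With a server running, `generate("write me a poem about the ocean")` sends the same request as the curl command. Building the query string with `urlencode` avoids hand-written escapes such as a stray `%` without its two hex digits.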
This library was made by Mustafa Aljadery & Siddharth Sharma.