Models
Apple Silicon
IMPORTANT: On Apple Silicon devices, run LLMs with MLX.
If you run regular (non-MLX) models from Hugging Face, you'll get worse performance (slower response generation, fewer tokens per second), because they run on the PyTorch backend, which is not optimized for Apple Silicon.
For example, 120+ tokens/sec has been reported on an M2 Ultra with the latest MLX and 4-bit Mistral 7B.
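As a concrete illustration, below is a minimal sketch of local generation with the mlx-lm package (mlx-lm isn't named in this section, so take it as one option; install it with pip install mlx-lm, and the 4-bit model repo is an example community conversion):

```python
# Minimal sketch: local text generation with mlx-lm (pip install mlx-lm).
# The model repo below is an example 4-bit community conversion.
from mlx_lm import load, generate

# Downloads the weights from Hugging Face on first use, then loads them.
model, tokenizer = load("mlx-community/Mistral-7B-Instruct-v0.2-4bit")

# Generate a completion; verbose=True also prints tokens/sec stats,
# which makes the MLX vs. PyTorch speed difference easy to see.
text = generate(
    model,
    tokenizer,
    prompt="Explain why MLX is fast on Apple Silicon.",
    max_tokens=256,
    verbose=True,
)
print(text)
```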
MLX community models
The MLX Community organization on Hugging Face hosts ready-to-use models already converted to the MLX format.
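If a model you want isn't in the community org yet, mlx-lm can also convert (and optionally quantize) a Hugging Face checkpoint locally. A sketch, assuming mlx-lm is installed; the source repo and output path are examples:

```python
# Sketch: convert a PyTorch checkpoint from the Hub into a quantized MLX model.
# Assumes pip install mlx-lm; source repo and output directory are examples.
from mlx_lm import convert

convert(
    "mistralai/Mistral-7B-Instruct-v0.2",  # source Hugging Face repo
    mlx_path="mlx_mistral_4bit",           # local directory for converted weights
    quantize=True,                         # quantize the weights (4-bit by default)
)
```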
mlx-vlm
Blaizzy/mlx-vlm
MLX-VLM is a package for running Vision LLMs locally on your Mac using MLX.
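A short sketch of how mlx-vlm is used, following the pattern in the project's README; the model repo and image file are example assumptions, and the exact generate() signature has shifted between releases, so check the repo for the version you install:

```python
# Sketch of image-to-text with mlx-vlm (pip install mlx-vlm), following the
# project README's pattern; model repo and image path are examples, and the
# generate() signature may differ between mlx-vlm releases.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2-VL-2B-Instruct-4bit"
model, processor = load(model_path)  # downloads the model on first use
config = load_config(model_path)

images = ["cats.jpg"]  # example local image file
prompt = apply_chat_template(
    processor, config, "Describe this image.", num_images=len(images)
)

output = generate(model, processor, prompt, images)
print(output)
```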