A community org for MLX model weights that run on Apple Silicon. This organization hosts ready-to-use models compatible with:
These are pre-converted weights, ready to use in the example scripts or integrate in your apps.
Install mlx-lm:
pip install mlx-lm
You can use mlx-lm from the command line. For example:
mlx_lm.generate --model mlx-community/Qwen3-4B-Instruct-2507-4bit --prompt "hello"
This will download a Qwen3 4B model from the Hugging Face Hub and generate text using the given prompt.
To chat with an LLM use:
mlx_lm.chat
This will give you a chat REPL that you can use to interact with the LLM. The chat context is preserved during the lifetime of the REPL.
For a full list of options run --help on the command of your interest, for example:
mlx_lm.chat --help
To quantize a model from the command line run:
mlx_lm.convert --model Qwen/Qwen3-4B-Instruct-2507 -q
For more options run:
mlx_lm.convert --help
You can upload new models to Hugging Face by specifying --upload-repo to
convert. For example, to upload a quantized Qwen3 4B model to the
MLX Hugging Face community you can do:
mlx_lm.convert \
--model Qwen/Qwen3-4B-Instruct-2507 \
-q \
--upload-repo mlx-community/Qwen3-4B-Instruct-2507-4bit
Models can also be converted and quantized directly in the mlx-my-repo Hugging Face Space.
For more details on the API checkout the full README
For more examples, visit the MLX Examples repo. The repo includes examples of: