2026 AI & Machine Learning
WWDC26 · 14 min · AI & Machine Learning
Run local agentic AI on the Mac using MLX
Run AI agents locally with privacy, low latency, and offline access. Dive into how MLX advancements and Mac hardware make powerful agentic workflows possible entirely on-device. You’ll explore code agents such as OpenCode, see how they integrate into Xcode, learn techniques for multi-Mac scaling, and discover how to integrate tools seamlessly — without ever leaving your machine.
Watch at developer.apple.com ↗Chapters
Code shown on screen · 3 snippets
Set up MLX-LM and start the local server
# Step 1: Install MLX-LM
pip install mlx-lm
# Step 2: Start the server
mlx_lm.server --model mlx-community/Qwen-3.5-4B-8bit
# Step 3: Point your agent to the server
curl -X POST \
http://127.0.0.1:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"default_model","messages":[{"role":"user","content":"Hello!"}]}' Configure an agent to use your local MLX server
{
"$schema": "https://opencode.ai/config.json",
"model": "mlx/default_model",
"small_model": "mlx/default_model",
"provider": {
"mlx": {
"npm": "@ai-sdk/openai-compatible",
"name": "MLX (local)",
"options": {
"baseURL": "http://127.0.0.1:8080/v1"
},
"models": {
"default_model": {
"name": "Default MLX Model"
}
}
}
}
} Launch distributed inference with MLX
mlx.launch --hostfile hosts.json \
--backend jaccl \
/remote/path/to/mlx_lm.server \
--model mlx-community/Qwen-3.5-122B-A3B-8bit Resources
Related sessions
-
22 min -
15 min -
19 min -
20 min