LM Studio vs oMLX on a MacBook Pro: The 35B Local AI Test That Actually Changed My Default
A practical benchmark of LM Studio and oMLX on a MacBook Pro running a 35B local model, including sustained generation and prefix-cache behavior.
Setup
The benchmark compared LM Studio and oMLX on a MacBook Pro (model identifier Mac17,8) with an Apple M5 Pro, 18-core CPU, 20-core GPU, and 48 GB of unified memory. Both runtimes used the Qwen3.6 35B A3B model family with identical sampling settings: temperature 0.2, top-p 0.9, min-p 0.05, repeat penalty 1.05, and top-k 0. Each run used one active session and request at a time, with controlled benchmark prompts.
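One way to keep the two runs comparable is to hold the sampling settings in a single object and hand the same values to both runtimes. The sketch below is only an illustration of that idea: the values come from the setup above, but the field names, the `SamplingSettings` class, and the `settings_for` helper are hypothetical and do not correspond to either runtime's actual configuration keys or API.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SamplingSettings:
    """Sampling values reported in the setup; field names are illustrative."""
    temperature: float = 0.2
    top_p: float = 0.9
    min_p: float = 0.05
    repeat_penalty: float = 1.05
    top_k: int = 0
    concurrent_requests: int = 1  # one active session/request


def settings_for(runtime: str, base: SamplingSettings = SamplingSettings()) -> dict:
    """Return a labelled copy of the shared settings for one runtime.

    This is a hypothetical helper: it does not talk to LM Studio or oMLX,
    it only makes sure both runs start from the same values.
    """
    return {"runtime": runtime, **asdict(base)}


lm_studio = settings_for("LM Studio")
omlx = settings_for("oMLX")

# Any key (other than the label) whose value differs between the two runs.
mismatched = {k for k in lm_studio if k != "runtime" and lm_studio[k] != omlx[k]}
print("mismatched settings:", mismatched or "none")
```

Because the dataclass is frozen, the shared values cannot be changed by accident between the two runs.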
Findings
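In the sustained generation run, LM Studio produced 5382 completion tokens in 60.78s at 88.55 tok/s, with a 0.23s time to first token (TTFT). oMLX produced 6144 completion tokens in 70.47s at 87.19 tok/s, with a 4.17s TTFT. Throughput was therefore nearly identical (about 1.5% apart), while first-token latency differed by roughly 4 seconds. In the shared-prefix test, LM Studio's TTFT improved from 3.37s to 0.27s (about a 92% reduction), while oMLX with cache enabled improved from 6.95s to 3.78s (about a 46% reduction) and reported 6144 cached prompt tokens.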
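The reported rates can be cross-checked from the raw figures. The short sketch below recomputes throughput as completion tokens divided by elapsed seconds, and the shared-prefix TTFT reduction as a percentage. The numbers are taken from the findings above; the dictionary layout and the function names are illustrative choices.

```python
# Figures copied from the findings; keys are illustrative names.
runs = {
    "LM Studio": {"tokens": 5382, "seconds": 60.78, "ttft_before": 3.37, "ttft_after": 0.27},
    "oMLX": {"tokens": 6144, "seconds": 70.47, "ttft_before": 6.95, "ttft_after": 3.78},
}


def throughput(tokens: int, seconds: float) -> float:
    """Completion tokens per second over the whole run."""
    return tokens / seconds


def ttft_reduction_pct(before: float, after: float) -> float:
    """Percentage drop in time to first token between two requests."""
    return 100.0 * (before - after) / before


summary = {}
for name, r in runs.items():
    summary[name] = {
        "tok_per_s": round(throughput(r["tokens"], r["seconds"]), 2),
        "ttft_reduction_pct": round(ttft_reduction_pct(r["ttft_before"], r["ttft_after"]), 1),
    }

for name, s in summary.items():
    print(f"{name}: {s['tok_per_s']} tok/s, TTFT reduced by {s['ttft_reduction_pct']}%")
```

The recomputed throughput agrees with the reported 88.55 and 87.19 tok/s, which suggests the reported rates are simply completion tokens over total elapsed time.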
Verification Proof Path
Claim (Hype Audit): Deconstruct the marketing claims and check them for verification risks.
Setup (Local Assembly): Rebuild the workflow in a local, private container environment.
Benchmark (Runtime Testing): Measure execution speed, resource usage, and token response latency.
Workflow (Efficiency Compression): Streamline the process into reusable, repeatable scripts.
Verdict (Tool Rating): Determine the final rating and practicality score.
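The runtime-testing step, measuring speed and token response latency, can be sketched as a small timing harness around any streaming response. This is a minimal illustration and not the script used for the numbers in this article: `measure_stream` and `fake_stream` are hypothetical names, the stream is simulated with `time.sleep` instead of a real runtime, and each yielded chunk is assumed to count as one token.

```python
import time
from typing import Iterable, Iterator


def measure_stream(stream: Iterable[str]) -> dict:
    """Time a token stream: time to first token, total time, tokens/second."""
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _ in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        count += 1  # assumption: one yielded chunk equals one token
    total = time.perf_counter() - start
    return {
        "tokens": count,
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "total_s": total,
        "tok_per_s": count / total if total > 0 else 0.0,
    }


def fake_stream(n: int, first_delay: float, per_token_delay: float) -> Iterator[str]:
    """Stand-in for a runtime's streaming output (simulated, not a real API)."""
    time.sleep(first_delay)  # simulated prompt processing before the first token
    for i in range(n):
        if i:
            time.sleep(per_token_delay)  # simulated decode time per token
        yield f"tok{i}"


result = measure_stream(fake_stream(n=20, first_delay=0.05, per_token_delay=0.001))
print(result)
```

To test a real runtime, the simulated generator would be replaced by an iterator over that runtime's streamed chunks. A shared-prefix comparison would then call `measure_stream` twice with prompts that begin with the same long prefix and compare the two `ttft_s` values.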
Sources
Final LM Studio vs oMLX 35B Hermes Run · AI Efficiency Toolbox · Jun 7, 2026
Consolidated LM Studio vs oMLX Findings · AI Efficiency Toolbox · Jun 7, 2026