Local Lab · Tested
Rapid-MLX on the Same Qwen3.6 35B A3B Model: Fast First Token, Slower Sustained Run
A Rapid-MLX benchmark using the same Qwen3.6 35B A3B model path normally loaded in LM Studio, compared with prior LM Studio and oMLX results.
Setup
Rapid-MLX was cloned into /Users/jason/Developer/projects/tool-rapid-mlx and installed from source in editable mode inside a virtualenv. All LM Studio models were unloaded with lms unload --all before the run. Rapid-MLX then served ~/.lmstudio/models/mlx-community/Qwen3.6-35B-A3B-4bit on localhost:8000 with --served-model-name qwen3.6-35b-a3b, --max-concurrent-requests 1, --max-num-seqs 1, --no-thinking, and --no-mllm. The --no-mllm flag was needed because the LM Studio model directory includes processor files, while the text-only Rapid-MLX install does not include mlx-vlm.
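To make the run easier to repeat, the flags above can be collected in one place. The following is a minimal sketch, not the script used for this run. The flag names and values come from the setup described above, but the launcher command is not stated in this write-up, so it is passed in as a placeholder. Passing the model directory as a positional argument is also an assumption.

```python
from pathlib import Path

# Model directory from the setup above (the same path LM Studio normally loads).
MODEL_DIR = Path("~/.lmstudio/models/mlx-community/Qwen3.6-35B-A3B-4bit").expanduser()

# Flags exactly as listed in the setup paragraph.
SERVER_FLAGS = [
    "--served-model-name", "qwen3.6-35b-a3b",
    "--max-concurrent-requests", "1",
    "--max-num-seqs", "1",
    "--no-thinking",
    "--no-mllm",  # model dir ships processor files; text-only install lacks mlx-vlm
]


def build_argv(launcher: list[str], model_dir: Path = MODEL_DIR) -> list[str]:
    """Assemble a full argument vector for the server process.

    `launcher` is hypothetical: substitute whatever command starts the
    Rapid-MLX server in your install. Placing the model path as a positional
    argument right after the launcher is an assumption, not a documented fact.
    """
    return [*launcher, str(model_dir), *SERVER_FLAGS]


if __name__ == "__main__":
    # "LAUNCHER" is a stand-in token, not a real command.
    print(" ".join(build_argv(["LAUNCHER"])))
```

Keeping the flags in a single list means a later run against a different model path only changes one value, which helps keep comparisons with the LM Studio and oMLX results like-for-like.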
Findings
Rapid-MLX produced 6144 completion tokens in 87.21s, or 70.45 tok/s, with a 0.18s time to first token (TTFT). The first token arrived very quickly, but sustained throughput trailed the prior LM Studio result (88.55 tok/s) by about 20% and the prior oMLX result (87.19 tok/s) by about 19%. In the shared-prefix benchmark, Request A TTFT was 3.21s and Request B TTFT was 3.23s. A prefix-cache hit should have lowered Request B's TTFT, so Request B was effectively flat rather than faster. Rapid-MLX did not report cached token counts through the API in this run, and the server logs showed cache misses for both long prefix requests.
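The headline numbers can be cross-checked with a few lines of arithmetic. This sketch uses only the figures reported above. It assumes the 70.45 tok/s value was computed over the full 87.21s wall-clock time, which the numbers are consistent with (excluding the 0.18s TTFT would give roughly 70.6 tok/s instead).

```python
# Figures copied from the findings above.
COMPLETION_TOKENS = 6144
WALL_SECONDS = 87.21
PRIOR_TOK_S = {"LM Studio": 88.55, "oMLX": 87.19}
TTFT_A, TTFT_B = 3.21, 3.23  # shared-prefix benchmark, seconds


def throughput(tokens: int, seconds: float) -> float:
    """Tokens per second over the full wall-clock duration."""
    return tokens / seconds


def percent_slower(measured: float, baseline: float) -> float:
    """How far `measured` falls below `baseline`, as a percentage of baseline."""
    return (baseline - measured) / baseline * 100


rapid_tok_s = throughput(COMPLETION_TOKENS, WALL_SECONDS)
gaps = {name: percent_slower(rapid_tok_s, tok_s) for name, tok_s in PRIOR_TOK_S.items()}

# A positive delta means the second request did not benefit from a warm prefix.
ttft_delta = TTFT_B - TTFT_A

if __name__ == "__main__":
    print(f"Rapid-MLX: {rapid_tok_s:.2f} tok/s")
    for name, gap in gaps.items():
        print(f"  {gap:.1f}% slower than {name}")
    print(f"Request B TTFT minus Request A TTFT: {ttft_delta:+.2f}s")
```

The 0.02s difference between the two shared-prefix requests is small enough to be noise, which lines up with the cache misses seen in the server logs.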
Verification Proof Path
Claim (Hype Audit): Deconstruct the marketing claims, checking for verification risks.
Setup (Local Assembly): Rebuild the workflow in a local, private container environment.
Benchmark (Runtime Testing): Measure execution speeds, resource usage, and token response latency.
Workflow (Efficiency Compression): Streamline the processes into reusable, repeatable scripts.
Verdict (Tool Rating): Final rating and practicality score determination.
Sources
Rapid-MLX Qwen3.6 35B A3B Benchmark Run · AI Efficiency Toolbox · Jun 7, 2026
Final LM Studio vs oMLX 35B Hermes Run · AI Efficiency Toolbox · Jun 7, 2026