Qwen 3.6 27B is the sweet spot for local development (quesma.com)
1117 points by stared 1 day ago | 699 comments
tl;dr: Qwen 3.6 27B (dense) is praised as the first genuinely useful local general-purpose model. It outperforms its faster MoE sibling (35B A3B) in quality while still fitting within 48GB of Apple Silicon RAM or, when quantized, on an RTX 5090. On a MacBook M5 Max, llama.cpp with multi-token prediction reaches ~32 tok/s, and benchmarks place the model roughly at the level of mid-2025 GPT-5/Claude Sonnet 4.5. The author provides llama.cpp and OpenCode setup instructions and argues that local models are increasingly viable alternatives to subsidized frontier APIs.
HN Discussion: