Qwen 3.6 27B is the sweet spot for local development

	Qwen 3.6 27B is the sweet spot for local development (quesma.com)
	1117 points by stared 1 day ago | 699 comments
	tl;dr: Qwen 3.6 27B (dense) is praised as the first genuinely useful local general-purpose model. It outperforms its faster MoE sibling (35B A3B) in quality while still fitting within 48GB of Apple Silicon RAM or, when quantized, on an RTX 5090. On a MacBook M5 Max, llama.cpp with multi-token prediction reaches ~32 tok/s, and benchmarks place the model at roughly the level of mid-2025 GPT-5 and Claude Sonnet 4.5. The author provides llama.cpp and OpenCode setup instructions and argues that local models are increasingly viable alternatives to subsidized frontier APIs.
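The memory claims above can be sanity-checked with back-of-the-envelope arithmetic: weight size is roughly parameter count times bits per weight. The sketch below is only an estimate under stated assumptions. The function names, the 4 GB allowance for KV cache and runtime overhead, the nominal bit widths, and the 32 GB budget used for the RTX 5090 are illustrative choices, not figures from the article. Real quantization formats also carry per-block metadata, so actual files are somewhat larger than these numbers.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * bits / 8 bytes, divided by 1e9,
    simplifies to params_billion * bits_per_weight / 8.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         budget_gb: float, overhead_gb: float = 4.0) -> bool:
    """Rough check: weights plus an assumed fixed overhead within budget.

    overhead_gb is a hypothetical allowance for KV cache and runtime
    buffers; the real value depends on context length and backend.
    """
    return weight_memory_gb(params_billion, bits_per_weight) + overhead_gb <= budget_gb


if __name__ == "__main__":
    # Nominal bit widths only; 27 is the dense parameter count in billions.
    for bits in (16, 8, 4):
        size = weight_memory_gb(27, bits)
        print(f"{bits:>2}-bit: ~{size:.1f} GB weights, "
              f"fits 48 GB: {fits(27, bits, 48)}, "
              f"fits 32 GB: {fits(27, bits, 32)}")
```

Under these assumptions, unquantized 16-bit weights alone exceed a 48 GB budget, which is consistent with the summary's point that some quantization is needed on both targets.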
	HN Discussion:
	~ A MacBook is impractical for local LLM work due to heat and noise; use a Mac Mini instead
	↓ The hardware cost ($6.7K-$10K) is prohibitive, and API credits are far more economical
	↓ Benchmarks don't reflect real work; local models struggle with existing codebases
	~ Cheaper alternatives, such as Intel Arc Pro or MoE models on lesser hardware, work well
	~ Other models, such as Gemma4 31B, are comparably good and underrated