Local Qwen isn't a worse Opus, it's a different tool (blog.alexellis.io)
321 points by alphabettsy 11 hours ago | 166 comments
tl;dr: A founder running OpenFaaS spent ~$12K on an RTX 6000 Pro to run Qwen 27B locally. He reports that it is not "near-Opus" but is valuable for specific tasks: airgapped customer support diagnostics, telemetry analysis (which uncovered a 4-5x license under-reporting and paid for the card), and bounded maintenance work where data privacy matters. Its main weaknesses are infinite loops and hallucinations on long-horizon unsupervised tasks, which make it unsuitable as a Claude/Codex replacement for general coding. Key takeaways: match local models to scoped tasks, respect tuning parameters, use AGENTS.md, and don't trust them unattended.
HN Discussion:
  • Local models are different tools requiring different prompting techniques, like instruments
  • Article underestimates rapid improvement; too early to lock in conclusions about local model limitations
  • vLLM was wrongly dismissed; it actually fixes looping and outperforms llama.cpp in many cases
  • Article confirms local models are limited, expensive, and unsuitable for complex agentic work
  • Local models could serve as intermediaries (tool calling, anonymizing) feeding frontier models
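
The looping failure mode and the "don't trust them unattended" takeaway can be made concrete with a small guard around generation. The sketch below is not from the article: `is_looping`, `generate_with_guard`, and the thresholds are hypothetical names and values chosen for illustration, and `token_stream` stands in for whatever iterator a local runtime exposes. It stops consuming output when a token budget is hit or when the tail of the output is the same short sequence repeated several times.

```python
def is_looping(tokens, max_period=32, repeats=3):
    """Return True if the output ends with one sequence of length
    1..max_period repeated `repeats` times back to back."""
    for period in range(1, max_period + 1):
        needed = period * repeats
        if len(tokens) < needed:
            break  # longer periods need even more tokens
        tail = tokens[-needed:]
        # Every position must match the corresponding position in the first period.
        if all(tail[i] == tail[i % period] for i in range(needed)):
            return True
    return False


def generate_with_guard(token_stream, max_tokens=2048, max_period=32, repeats=3):
    """Consume a token iterator; stop on a token budget or a detected loop.

    Returns (tokens, reason) where reason is "done", "budget", or "loop".
    The thresholds are illustrative assumptions, not tuned values.
    """
    out = []
    for tok in token_stream:
        out.append(tok)
        if len(out) >= max_tokens:
            return out, "budget"
        if is_looping(out, max_period, repeats):
            return out, "loop"
    return out, "done"
```

A caller would treat a "loop" or "budget" result as a signal to hand the task back to a human rather than retry blindly. Low `repeats` values will flag legitimate repetition (for example runs of indentation tokens), so the thresholds would need adjusting per model and task.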