
LLM Testing
June 12, 2026
I Made Fable 5 and Qwen3.6 27B Build the Same Web App
Today I put Qwen3.6 27B and Claude Fable 5 head-to-head on the same challenge: build a real LLM benchmark dashboard on a fresh local VPS. The goal wasn’t a fake demo or a pretty mockup. I wanted a working product that could connect to my llama-swap endpoints, load models, run benchmark prompts, save results, and compare historical runs with real charts, stats, and benchmark data.

Both models had to:

- work from a fresh VPS
- install whatever they needed
- expose the dashboard on port 80
- build something that actually works
- turn it into a tool I could keep using later

If you’re into local AI, llama.cpp, llama-swap, coding agents, and real-world model battles, this is exactly the kind of chaos you’re here for.


