I Made Fable 5 and Qwen3.6 27B Build the same Web App
Today I put Qwen3.6 27B and Claude Fable 5 head-to-head with the same challenge: build a real LLM benchmark dashboard on a fresh local VPS. The goal wasn’t to make a fake demo or a pretty mockup. I wanted a working product that could connect to my llama-swap endpoints, load models, run benchmark prompts, save results, and compare historical runs with real charts, stats, and benchmark data. Both models had to: - work from a fresh VPS - install whatever they needed - expose the dashboard on port 80 - build something that actually works - turn it into a tool I could keep using later If you’re into local AI, llama.cpp, llama-swap, coding agents, and real-world model battles, this is exactly the kind of chaos you’re here for.
Models2
Prompts3
Live HTML0
Files0
Video
Models Tested
Qwen3.6 27B
Claude Fable 5
Prompts Used
1Phase 1 - The Setup
Build a working LLM benchmark dashboard on this fresh local VPS and expose it on port 80. Install whatever is needed. The app should connect to OpenAI-compatible llama-swap endpoints, let me save endpoint configs, load available models, select one model, choose a preset benchmark prompt, run it, and display the response and benchmark results in a clean modern dashboard.
2Phase - Expansion
Expand the dashboard into a real benchmarking product. For each run, capture and save the server, model, prompt, timestamp, response text, latency, output length, and tokens or throughput if available. Add a benchmark history view, summary cards, charts, and tables so I can review performance clearly.
3Phase 3 - Finishing Touches
Finish the product by adding persistence and comparisons. Save llama-swap configs, prompt presets, and benchmark history locally. Let me filter past runs by model, endpoint, prompt, and date, compare the current run against previous benchmarks, and view trends, averages, and best/worst runs in a polished dashboard on port 80.