I Made Fable 5 and Qwen3.6 27B Build the same Web App

Today I put Qwen3.6 27B and Claude Fable 5 head-to-head with the same challenge: build a real LLM benchmark dashboard on a fresh local VPS. The goal wasn’t to make a fake demo or a pretty mockup. I wanted a working product that could connect to my llama-swap endpoints, load models, run benchmark prompts, save results, and compare historical runs with real charts, stats, and benchmark data. Both models had to: - work from a fresh VPS - install whatever they needed - expose the dashboard on port 80 - build something that actually works - turn it into a tool I could keep using later If you’re into local AI, llama.cpp, llama-swap, coding agents, and real-world model battles, this is exactly the kind of chaos you’re here for.

Models2

Prompts3

Live HTML0

Files0

Video

Models Tested

Qwen3.6 27B

Q6_K_XL Unsloth

RTX 3090ti & RTX 3090

Claude Fable 5

Default/High

Anthropic

Prompts Used

1Phase 1 - The Setup

Build a working LLM benchmark dashboard on this fresh local VPS and expose it on port 80. Install whatever is needed. The app should connect to OpenAI-compatible llama-swap endpoints, let me save endpoint configs, load available models, select one model, choose a preset benchmark prompt, run it, and display the response and benchmark results in a clean modern dashboard.

2Phase - Expansion

Expand the dashboard into a real benchmarking product. For each run, capture and save the server, model, prompt, timestamp, response text, latency, output length, and tokens or throughput if available. Add a benchmark history view, summary cards, charts, and tables so I can review performance clearly.

3Phase 3 - Finishing Touches

Finish the product by adding persistence and comparisons. Save llama-swap configs, prompt presets, and benchmark history locally. Let me filter past runs by model, endpoint, prompt, and date, compare the current run against previous benchmarks, and view trends, averages, and best/worst runs in a polished dashboard on port 80.