Total Tests
47
Across 8 model matchups
Avg Tokens/sec
42.6
Qwen3.6: 58.3 | Claude: 26.9
Best Model
Qwen3.6 27B
Overall Winner — 6/10 categories
Total API Cost
$84.32
Claude Opus 4.8 — 28,450 prompts
Local Runtime
14h 22m
RTX 3090 — 47 test runs
⚔️ HEAD-TO-HEAD MATCHUP #47 — 2025.05.28
Qwen3.6 27B
Local • GGUF Q4_K_M • 16 GB VRAM • llama.cpp
Speed: 58.3 tok/s
Quality Score: 87.4 / 100
Coding Score: 84.1 / 100
Creativity: 76.8 / 100
Instruction Follow: 91.2 / 100
Final Grade: A (87.4)
VS
Claude Opus 4.8
Frontier • API • Anthropic • Latest
Speed: 26.9 tok/s
Quality Score: 92.1 / 100
Coding Score: 93.7 / 100
Creativity: 94.5 / 100
Instruction Follow: 95.8 / 100
Final Grade: S (92.1)
🏆 Winner Breakdown
📊 Quality Radar
📝 Prompt History
10 tested
📈 Token Throughput
📋 Previous Model Tests
Date Model Type Quality Coding Speed Cost Winner
📌 Testing Notes & Observations
Qwen3.6 Speed
58.3
tokens / second
Claude Opus Speed
26.9
tokens / second
Speed Ratio
2.17x
Qwen3.6 is faster
Avg Completion
4.2s
Qwen3.6 — per prompt
Avg Completion
8.9s
Claude Opus — per prompt
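The speed ratio and the average throughput on the summary card follow directly from the two measured rates. A minimal Python sketch of that arithmetic, using only figures shown on this dashboard (the variable names are illustrative):

```python
# Throughput and latency figures as reported on the dashboard.
qwen_tps = 58.3          # tokens/second, Qwen3.6 27B (local)
claude_tps = 26.9        # tokens/second, Claude Opus 4.8 (API)
qwen_latency_s = 4.2     # average completion time per prompt, seconds
claude_latency_s = 8.9   # average completion time per prompt, seconds

# How many times faster the local model generates tokens.
speed_ratio = qwen_tps / claude_tps

# Simple mean of the two rates, as shown on the "Avg Tokens/sec" card.
avg_tps = (qwen_tps + claude_tps) / 2

# How many times longer the API model takes per prompt.
latency_ratio = claude_latency_s / qwen_latency_s

print(f"Speed ratio:   {speed_ratio:.2f}x")
print(f"Avg tokens/s:  {avg_tps:.1f}")
print(f"Latency ratio: {latency_ratio:.2f}x")
```

The latency ratio comes out slightly lower than the throughput ratio. One possible reason is that the two models produce responses of different lengths, but the dashboard does not report that.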
⚡ Tokens Per Second — By Prompt
⏱️ Completion Time — By Prompt
📈 Speed Trend — All Models
🎯 Quality Comparison
🎯 Radar Comparison
📊 Category Scores — Qwen3.6 27B
📊 Category Scores — Claude Opus 4.8
Local Cost (Qwen3.6)
$0.00
Already own the GPU
Claude API Cost
$84.32
This test session
Input Tokens
184,200
@ $15.00 / 1M tokens
Output Tokens
72,800
@ $75.00 / 1M tokens
Projected Yearly
$1,012
At current usage rate
💰 Cost Comparison — All Tests
📊 API Usage Breakdown
Model: Claude Opus 4.8
Total Prompts: 28,450
Input Tokens: 184,200
Output Tokens: 72,800
Input Cost: $2.76
Output Cost: $5.46
Per-Prompt Avg: $0.00296
Total Spend (Session): $84.32
Provider: Anthropic API
Rate Limit Hit: No
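The input and output cost lines can be reproduced from the token counts and the per-million-token rates shown above. The sketch below does that, and also derives the per-prompt average from the reported session total. The yearly projection assumes one session of this size per month; the source only says "at current usage rate", so that cadence is an assumption.

```python
# Token counts and per-million-token rates as shown on the dashboard.
INPUT_TOKENS = 184_200
OUTPUT_TOKENS = 72_800
INPUT_RATE_PER_M = 15.00    # USD per 1M input tokens
OUTPUT_RATE_PER_M = 75.00   # USD per 1M output tokens

# Session totals, taken as reported rather than derived here.
SESSION_SPEND = 84.32       # USD
TOTAL_PROMPTS = 28_450


def token_cost(tokens: int, rate_per_million: float) -> float:
    """Return the USD cost of `tokens` at a per-million-token rate."""
    return tokens * rate_per_million / 1_000_000


input_cost = token_cost(INPUT_TOKENS, INPUT_RATE_PER_M)
output_cost = token_cost(OUTPUT_TOKENS, OUTPUT_RATE_PER_M)
per_prompt_avg = SESSION_SPEND / TOTAL_PROMPTS

# Assumption: twelve sessions of this cost per year (not stated in the source).
projected_yearly = SESSION_SPEND * 12

print(f"Input cost:       ${input_cost:.2f}")
print(f"Output cost:      ${output_cost:.2f}")
print(f"Per-prompt avg:   ${per_prompt_avg:.5f}")
print(f"Projected yearly: ${projected_yearly:,.0f}")
```

The two line items sum to about $8.22, so the $84.32 session total is used here as reported rather than recomputed from them.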
🖥️ RTX 3090 Monitor
Online
GPU Utilization
94%
VRAM Usage
14.2 / 24 GB
Temperature
72°C
Power Draw
342W
Fan Speed
78%
Clock Speed
1710 MHz
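The dashboard does not say how these readings are collected. One common approach is to poll `nvidia-smi` in CSV mode. The sketch below assumes that approach; the function names and the sample line are illustrative, not taken from the source.

```python
import subprocess

# Fields passed to `nvidia-smi --query-gpu`, in the order of the tiles above.
QUERY_FIELDS = [
    "utilization.gpu",   # percent
    "memory.used",       # MiB
    "memory.total",      # MiB
    "temperature.gpu",   # degrees Celsius
    "power.draw",        # watts
    "fan.speed",         # percent
    "clocks.sm",         # MHz
]


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line (noheader, nounits) into a field -> float mapping.

    Raises ValueError if a field is missing or non-numeric (some GPUs
    report "[N/A]" for sensors they do not expose).
    """
    values = [float(v.strip()) for v in line.split(",")]
    if len(values) != len(QUERY_FIELDS):
        raise ValueError(f"expected {len(QUERY_FIELDS)} fields, got {len(values)}")
    return dict(zip(QUERY_FIELDS, values))


def read_gpu_stats() -> dict:
    """Query GPU 0 through nvidia-smi; requires the NVIDIA driver tools."""
    result = subprocess.run(
        [
            "nvidia-smi",
            f"--query-gpu={','.join(QUERY_FIELDS)}",
            "--format=csv,noheader,nounits",
            "--id=0",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_line(result.stdout.strip().splitlines()[0])


# Illustrative line roughly matching the readings shown on the monitor.
sample = parse_gpu_line("94, 14541, 24576, 72, 342.00, 78, 1710")
vram_used_gb = sample["memory.used"] / 1024
print(f"VRAM: {vram_used_gb:.1f} / {sample['memory.total'] / 1024:.0f} GB")
```

Keeping the parser separate from the subprocess call lets it be checked on machines without a GPU.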
🌐 Local Server Status
Server: llama.cpp v3.2
Model: Qwen3.6 27B Q4_K_M
Context Length: 8192 tokens
Batch Size: 512
Threads: 12
Flash Attention: Enabled
Quantization: Q4_K_M (4.2 bit)
Model Size: 16.1 GB
Load Time: 12.4s
Uptime: 14h 22m 08s
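For reference, the settings in this panel map onto `llama-server` command-line options. The sketch below only assembles the argument list and does not launch anything. The model path and the GPU layer count are assumptions, since the dashboard gives neither.

```python
import shlex

# Settings as listed in the server status panel.
SERVER_CONFIG = {
    "ctx_size": 8192,    # Context Length
    "batch_size": 512,   # Batch Size
    "threads": 12,       # Threads
}

# Hypothetical path; the dashboard does not give the model file location.
MODEL_PATH = "models/qwen3.6-27b-q4_k_m.gguf"


def build_server_command(model_path: str, cfg: dict) -> list:
    """Build a llama-server argument list from the panel settings."""
    return [
        "llama-server",
        "--model", model_path,
        "--ctx-size", str(cfg["ctx_size"]),
        "--batch-size", str(cfg["batch_size"]),
        "--threads", str(cfg["threads"]),
        # Assumption: offload all layers to the GPU (not shown in the panel).
        "--n-gpu-layers", "99",
        # Flash attention is controlled by --flash-attn, but its exact form
        # differs between llama.cpp builds, so it is left out of this sketch.
    ]


command = build_server_command(MODEL_PATH, SERVER_CONFIG)
print(shlex.join(command))
```

Returning a list rather than a single string makes it safe to pass straight to `subprocess.run` without shell quoting concerns.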
📈 GPU Utilization Over Time