Qwopus 27B vs Claude Opus 4.8 | VPS Sabotage Challenge
In this video, I put Qwopus 27B up against Claude Opus 4.8 in a different kind of head-to-head test. Instead of just having both models build a single browser app, I gave each one a clean Ubuntu VPS with root access and had them deploy a full web project from scratch. They had to SSH in, install Nginx, set up a site on port 80, build a homepage with system info, create a server dashboard, and make a playable browser game.

Then things got a little more interesting. After both models finished their builds, I had them connect to each other’s VPS and sabotage the opponent’s dashboard in a controlled way. After that, each model had to troubleshoot and repair its own broken site without using backups, hints, or sabotage notes.

This test is meant to see how well each model can handle real-world-ish server setup, coding, deployment, debugging, and fixing something it didn’t originally break. As always, this is not a perfect scientific benchmark. It’s just a practical head-to-head to see which model handles the challenge better.
Video
Models Tested
Prompts Used
You are being given SSH access to a clean Ubuntu VPS.

SSH connection details:
IP address: 192.168.0.41 (local network IP for test VPS)
Username: root
Password: tokenchaser

Your task is to fully set up and deploy a polished web project from scratch. Use SSH to connect to the server, then complete the full deployment.

Requirements:
1. Update the server packages.
2. Install and configure Nginx.
3. Configure the web server to serve the site on port 80.
4. Create a static website using only HTML, CSS, and JavaScript. No external libraries.
5. The site must have three main pages:
   - Home page at /
   - Server dashboard at /dashboard
   - Browser game at /game
6. The entire project should look polished, modern, and visually impressive enough for a YouTube head-to-head demo.

Home page requirements:
- The home page should feel like a polished “AI Mission Control” landing page.
- Clearly display your model family name near the top.
- Show system/server information in a clean, good-looking section, including:
  - Hostname
  - Operating system
  - Kernel version
  - CPU information
  - Total RAM
  - Disk usage
  - Server uptime
  - Current date/time generated at setup
- Include two large navigation cards/buttons:
  - Open Dashboard
  - Play Game
- Include a short project summary explaining what was installed and what pages were created.
- Include a polished completion summary section showing:
  - Commands ran
  - Files created
  - Final site paths
  - Any issues encountered and how they were fixed
- Use a modern dark UI with clean spacing, cards, gradients, subtle animations, and responsive layout.

Dashboard page requirements:
- The dashboard should be available at /dashboard.
- It should have a polished dark server-monitoring UI.
- Show server-style cards for:
  - CPU
  - RAM
  - Disk
  - Network
  - Uptime
  - Nginx status
  - Recent activity
- The dashboard can use real system information where practical and simulated live values where needed.
- Include a polished completion summary section showing:
  - Commands ran
  - Files created
  - Final site paths
  - Any issues encountered and how they were fixed
- Include a clear link back to the home page.
- Display your model family name somewhere on the page.
- Make it visually impressive, organized, and easy to read on video.

Game page requirements:
- The game should be available at /game.
- Create a browser mini game called Server Defender.
- The player defends a server from incoming bugs, bots, or packets.
- Include score, health, restart button, increasing difficulty, and basic animations.
- Use keyboard or mouse controls.
- Include a clear link back to the home page.
- Display your model family name somewhere on the page.
- Make the game page visually polished and easy to understand on video.

Deployment requirements:
- Configure Nginx so /, /dashboard, and /game all load properly on port 80.
- Use a clean project folder, such as /var/www/ai-challenge.
- Set permissions correctly.
- Create a README file in the project folder explaining:
  - What you installed
  - Where the files are located
  - How to restart Nginx
  - The available site paths
  - A completion summary of the deployment
- Do not ask me questions unless you are completely blocked. Make reasonable decisions and complete the deployment.

At the end, provide a short completion summary with:
- The commands you ran
- The files you created
- The final site paths
- Any issues you ran into and how you fixed them
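Because the first prompt asks for a static site, the system information on the home page has to be captured once at setup time and baked into the page. As a rough illustration of what that collection step could look like (not what either model actually ran), here is a minimal Python sketch. It assumes a Linux host with the usual `/proc` and `/etc/os-release` files; the function names `read_field`, `format_uptime`, `collect_system_info`, and `snapshot_json` are hypothetical and chosen only for this example.

```python
import json
import platform
import shutil
import socket
from datetime import datetime, timezone


def read_field(path, key, sep):
    """Return the value of the first `key<sep>value` line in a file, or None."""
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                name, found, value = line.partition(sep)
                if found and name.strip() == key:
                    return value.strip().strip('"')
    except OSError:
        pass  # file missing, e.g. when not running on Linux
    return None


def format_uptime(seconds):
    """Turn a number of seconds into a short 'Xd Yh Zm' string."""
    days, rest = divmod(int(seconds), 86400)
    hours, rest = divmod(rest, 3600)
    return f"{days}d {hours}h {rest // 60}m"


def collect_system_info(disk_path="/"):
    """Gather the fields the home page prompt lists, with None for anything unavailable."""
    disk = shutil.disk_usage(disk_path)
    mem_kb = read_field("/proc/meminfo", "MemTotal", ":")  # looks like "2014572 kB"
    try:
        with open("/proc/uptime", encoding="utf-8") as handle:
            uptime = format_uptime(float(handle.read().split()[0]))
    except OSError:
        uptime = None
    return {
        "hostname": socket.gethostname(),
        "os": read_field("/etc/os-release", "PRETTY_NAME", "=") or platform.system(),
        "kernel": platform.release(),
        "cpu": read_field("/proc/cpuinfo", "model name", ":") or platform.machine(),
        "ram_total_mb": int(mem_kb.split()[0]) // 1024 if mem_kb else None,
        "disk_used_percent": round(disk.used / disk.total * 100, 1),
        "uptime": uptime,
        "generated_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }


def snapshot_json():
    """Serialize the snapshot so it can be embedded in or loaded by a static page."""
    return json.dumps(collect_system_info(), indent=2)
```

The output of `snapshot_json()` could be written next to the HTML files and read by the page's JavaScript, which keeps the site static while still showing real values from the moment of setup.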
You have permission to connect to your opponent’s local test VPS on this private lab network. Your goal is to intentionally break only the opponent’s dashboard page in a controlled, reversible way, then report exactly what you changed.

Rules:
* Only modify the opponent’s test web project.
* Do not damage the operating system.
* Do not delete the project.
* Do not change the root password.
* Do not remove SSH access.
* Do not install malware, persistence, backdoors, miners, or anything unrelated to this test.
* Do not attack the network.
* Do not touch anything outside the web project or Nginx configuration unless absolutely necessary.
* The home page and game should still work if possible.
* The dashboard should fail, display an error, or become obviously broken.
* The break should be reversible.

Opponent SSH details:
IP address: [OPPONENT_IP]
Username: root
Password: [OPPONENT_PASSWORD]

Your task:
1. SSH into the opponent VPS.
2. Inspect how their site is deployed.
3. Break only the dashboard route or dashboard functionality.
4. Leave a clear sabotage note somewhere in the project folder explaining:
   * What you changed
   * Which file/config was modified
   * How to undo it
   * How to verify the dashboard is broken
5. Do not announce the exact change until after it is complete.
6. When finished, report back with:
   * Whether the sabotage succeeded
   * What route is broken
   * What file or config was changed
   * How to reverse the change
   * Whether the home page and game still work
Your opponent has intentionally broken part of your deployed web project. Your task is to inspect your own VPS, diagnose the problem, and repair it.

Important rules:
* Do not use backups.
* Do not restore from a previous version.
* Do not read or rely on any sabotage notes, hints, comments, or clues left by the opponent.
* Do not wipe the project and start over.
* Do not rebuild the entire site from scratch.
* Find the actual problem and fix only what needs to be fixed.
* Keep the home page and game working.
* The goal is real troubleshooting, not replacing everything.

Your task:
1. Inspect the running website and server configuration.
2. Determine why the dashboard is broken.
3. Identify the specific file, config, route, script, permission, or service causing the issue.
4. Fix the dashboard while preserving the existing project.
5. Verify that these routes work:
   * /
   * /dashboard
   * /game
6. Update the README with a repair report, but do not include or reference any opponent clues.

Repair report should include:
* What was broken
* How you diagnosed it
* What files or settings you changed
* Commands used
* How you verified the fix
* Any remaining issues

At the end, report back with:
* Root cause
* Fix applied
* Files/configs changed
* Verification results
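The repair prompt ends with a verification step for /, /dashboard, and /game. One way to picture that check is a small Python sketch like the one below. It uses only the standard library, and the helper names `fetch_status`, `check_routes`, and `broken_routes` are made up for this illustration; it is not the script either model used.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# The three routes named in the prompts.
ROUTES = ["/", "/dashboard", "/game"]


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a URL, or None if the server cannot be reached."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # the server answered, but with an error such as 403, 404 or 502
    except OSError:
        return None  # connection refused, DNS failure, timeout, and similar


def check_routes(base_url, routes=ROUTES, fetch=fetch_status):
    """Map each route to the status code returned for base_url + route."""
    base = base_url.rstrip("/")
    return {route: fetch(base + route) for route in routes}


def broken_routes(results):
    """List the routes that did not come back with HTTP 200."""
    return [route for route, status in results.items() if status != 200]
```

Running `broken_routes(check_routes("http://192.168.0.41"))` against the test VPS from the first prompt would return an empty list once all three pages load. A 200 status only shows that Nginx served something, so a dashboard broken by sabotaged JavaScript or CSS would still need a look in the browser.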