Dense Fine-tune · 27B

Qwopus3.6-27B v2 (dense)

by Kyle Hessling · v2 update to the dense 27B preview

Same suite as the 35B-A3B eval (5 agentic + 1 nothink rerun, 5 web-design, 4 canvas). 2 canvas outputs excluded for visual quality and parked in excluded-canvas/. Thinking is on for every run. Q5_K_M on a single RTX 5090 via llama.cpp.

Read the full report Compare · 35B-A3B eval Follow @KyleHessling1

43.9avg tok/s

15 / 17published · 2 canvas parked

75.25%SWE-bench Verified (202)

119,036completion tokens

~31 GBVRAM · 160K fp16 KV

SWE-bench Verified · controlled-202 slice

Run	Sampling	Resolved	Empty	Resolve %
Qwopus 3.6 27B v2 (dense)	temp 1.0, step 275, single-slot	152 / 202	1	75.25%

19h29m wall-clock on a single RTX 5090, 160K fp16 context. Every instance exited Submitted, 0 step-limit hits, 0 context overflows. Median trajectory length 67 / 275.

⚡

Run agentic harnesses hot. Counter-intuitive but consistent across our runs: for agentic harnesses with thinking-on, temp=1.0 outperforms temp=0.1 by a wide margin. Greedy decoding hands the finetune its strongest single-path reasoning chain back to itself every step, which is the recipe for over-deliberating, looping inside <think>, and the empty-patch failure mode. Raising temperature lets the finetune use the breadth of reasoning paths the training installed instead of refining one. The 78 to 1 collapse in empty patches between our 35B-A3B temp-0.1 and this 27B dense temp-1 run is the cleanest case we have. For one-shot creative HTML, drop back to 0.6 to 0.8 and try the slider for your workload.

Web design · open to preview

SaaS landing pageAI observability product page

60.3 KB · 23,801 tok · 552 s · thinking-on

Analytics dashboardLight-theme dashboard layout

42.1 KB · 15,390 tok · 354 s · thinking-on

Designer portfolioKinetic-typography portfolio

32.5 KB · 11,612 tok · 265 s · thinking-on

Pricing page3 tiers + animated toggle + FAQ

26.6 KB · 9,360 tok · 213 s · thinking-on

Mobile app marketingApp landing with device mock

42.3 KB · 16,590 tok · 382 s · thinking-on

Canvas / WebGL · creative coding

Three of four run at temp=1.0 (thinking on); physics_sandbox stays at temp=0.75 since the first run shipped clean. The Mandelbulb shader and Three.js crystal scene rendered but weren't strong enough to publish; both outputs are parked in excluded-canvas/ for inspection.

Particle attractor3000-particle fluid swarm

9.4 KB · 4,308 tok · 97 s · temp 1.0 · 1,513 chars reasoning

Generative flowfieldInk-line agents on noise

13.9 KB · 7,237 tok · 163 s · temp 1.0 · 6,269 chars reasoning

Soft-body physics sandboxVerlet integration playground

18.0 KB · 6,827 tok · 154 s · temp 0.75 · 1,665 chars reasoning

Audio-reactive visualizerFFT bars + bloom on mic input

10.7 KB · 5,731 tok · 129 s · temp 1.0 · 7,645 chars reasoning

↗

Lineage: Qwen 3.5 27B → Qwopus 3.5 27B → Qwen 3.6 base → Qwopus 3.6 27B Each step has been a real jump. The Qwen 3.5 → Qwopus 3.5 finetune was a big lift in front-end execution; the next big jump came from Alibaba raising the floor with the Qwen 3.6 base. The gap between base and finetune is narrowing, but Qwopus is still meaningfully better at executing creative briefs. The designer portfolio in this run is the best one-shot pass I've seen anywhere in this size class; the base produces something competent on the same prompt, the finetune turns it into something with a point of view.

Agentic reasoning · text output

Multi-step planningURL shortener deploy plan

thinking: 2,238 tok · 50 s · 7,067 chars reasoning

Tool-use planningWeather + flights + hotel

thinking: 1,262 tok · 28 s · 2,807 chars reasoning

Code debug4-bug k-th smallest element

thinking: 1,753 tok · 39 s · 5,225 chars reasoning

Structured JSON extractionCalendar + roster from prose

thinking: 1,721 tok · 39 s · clean pass

Self-critique loopPalindrome · iterate to O(n²)

thinking: 1,255 tok · 28 s · 3,309 chars reasoning

JSON extraction · no-thinkSame prompt, thinking off

351 tok · 8 s

Model: Qwopus3.6-27B v2 (dense) · Q5_K_M GGUF · llama.cpp CUDA-12.8 on a single RTX 5090 · 160K ctx, fp16 KV, single slot