A/B compare

Same site. Five models. Pick the best brief.

One brand-analysis prompt runs through Claude Sonnet 4.6, Opus 4.7, GPT-5.5, GPT-5.5 Pro and Gemini 3.1 Pro in parallel. We tally tokens, cost and latency, and you judge which output best captures the brand.

A model whose provider API key (ANTHROPIC_API_KEY, OPENAI_API_KEY or GOOGLE_AI_API_KEY) is not set shows a clear "not configured" state.
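
As a rough illustration of the key check described above, here is a minimal Python sketch. The three environment variable names come from the text; the provider labels, the `provider_status` function and the "ready" label are assumptions made for this example, not the Studio's actual code.

```python
import os

# Provider label -> environment variable. The variable names come from the
# text above; the provider labels are hypothetical.
REQUIRED_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_AI_API_KEY",
}


def provider_status(env=None):
    """Return 'ready' or 'not configured' for each provider."""
    env = os.environ if env is None else env
    return {
        provider: "ready" if env.get(var) else "not configured"
        for provider, var in REQUIRED_KEYS.items()
    }
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching real credentials.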

How comparison works

Step 1

Same prompt, every model

Each model gets the identical brand-analysis prompt with the same site HTML excerpt and layout catalog. No model-specific tuning.
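
The fan-out in this step could look roughly like the sketch below. It assumes a caller-supplied `call_model(model, prompt)` function that wraps each provider's SDK; that function, `build_prompt`, `run_comparison` and the prompt wording are all hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def build_prompt(html_excerpt, layout_catalog):
    """Build the single prompt string that every model receives (wording is illustrative)."""
    catalog = ", ".join(layout_catalog)
    return (
        "Analyse the brand of this site and reply with JSON only.\n"
        f"Layouts: {catalog}\n"
        f"HTML excerpt:\n{html_excerpt}"
    )


def run_comparison(prompt, models, call_model):
    """Send the identical prompt to every model in parallel and time each call.

    `call_model(model, prompt)` is a hypothetical adapter supplied by the caller.
    """

    def timed(model):
        start = time.perf_counter()
        try:
            output, error = call_model(model, prompt), None
        except Exception as exc:  # one failing model must not sink the rest
            output, error = None, str(exc)
        return {
            "model": model,
            "output": output,
            "error": error,
            "latency_s": time.perf_counter() - start,
        }

    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        # pool.map preserves the order of `models` in the results.
        return list(pool.map(timed, models))
```

Building the prompt once and passing the same string to every worker is what keeps the comparison free of model-specific tuning.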

Step 2

JSON-only output

Every model returns JSON that follows the same schema (brand personality, pitch, palette, top template, etc.), so the outputs can be compared field by field.
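
One way to enforce a shared shape is to parse each reply and check it against a list of required fields. The sketch below is an assumption-heavy example: the key names and types in `REQUIRED_FIELDS` are guesses based on the fields named above, and the real schema has more fields than these four.

```python
import json

# Hypothetical key names and types; the real schema is not shown in the text.
REQUIRED_FIELDS = {
    "brand_personality": str,
    "pitch": str,
    "palette": list,
    "top_template": str,
}


def parse_brief(raw):
    """Parse a model reply; return (brief, problems), where brief is None on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            problems.append(f"wrong type for {field}")
    return (None if problems else data), problems
```

Returning the list of problems rather than raising lets a comparison view show why a given model's output was rejected.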

Step 3

Real cost tracking

We read the usage object (token counts) from each provider's response and compute the USD cost at that provider's current published rates.
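
The cost arithmetic can be sketched as tokens divided by one million, multiplied by a per-million-token rate, summed over input and output. This example assumes each provider's usage object has already been normalised to `input_tokens` and `output_tokens` (providers name these fields differently), and the rates in the usage note are placeholders, not real prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per one million tokens. Values passed in are the caller's responsibility."""
    input_per_mtok: float
    output_per_mtok: float


def cost_usd(usage, rates):
    """Compute the USD cost of one call from normalised token counts.

    `usage` is assumed to be a dict with 'input_tokens' and 'output_tokens';
    mapping each provider's own field names onto these is left to the caller.
    """
    input_cost = usage["input_tokens"] / 1_000_000 * rates.input_per_mtok
    output_cost = usage["output_tokens"] / 1_000_000 * rates.output_per_mtok
    return input_cost + output_cost
```

For instance, with placeholder rates of 2.00 USD per million input tokens and 10.00 USD per million output tokens, a call using 1,000,000 input and 500,000 output tokens costs 2.00 + 5.00 = 7.00 USD.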

Step 4

Side-by-side judgement

You read the briefs. The metrics tell you what each one costs. You decide which model captures your brand best.