Updated 2026-05-06: corrected SDK deprecation timeline, removed false claim about AI Studio/Vertex AI merger, updated model names to current lineup, clarified GPT-4.1 mini API-only status vs GPT-5.x generation.
When we built AIRoundtable, we had one job: find the best AI models for live debate and wire them together. Claude was an obvious choice. GPT-4.1 mini made the shortlist. Gemini? We evaluated it. We passed.
This isn’t a hit piece. It’s an honest builder’s log about a real decision we made.
The SDK Problem
The first thing you run into with Google’s AI ecosystem is the SDK churn.
@google/generative-ai — the original JavaScript SDK — went through a cycle of breaking changes throughout 2025 and was then deprecated entirely in November 2025. The repository was archived on December 16, 2025. The replacement is a completely different package: @google/genai, the new Google GenAI SDK, which reached General Availability in May 2025.
So by the time we shipped, the SDK we had been evaluating was end-of-life. Any integration we had built against @google/generative-ai would have needed a full rewrite to work with @google/genai.
For a project where reliability is the product — you can’t have a live debate freeze mid-argument because a dependency changed — this is a dealbreaker. The REST API is stable and @google/genai is GA now. That’s what we’d use if we ever add Gemini. But at launch, we wanted to move fast without betting on a freshly-GA SDK.
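One way to limit that kind of risk is a thin provider-agnostic layer, so a deprecated SDK means rewriting one adapter instead of the whole app. Here's a minimal sketch of the pattern — the DebateModel shape and provider names are hypothetical, not our production code:

```typescript
// A thin provider-agnostic interface: the app only sees DebateModel,
// so an SDK deprecation means rewriting one adapter, not the platform.
export interface DebateModel {
  name: string;
  generate(prompt: string): Promise<string>;
}

// Try providers in order; fall back to the next one if a call fails mid-debate.
export async function generateWithFallback(
  models: DebateModel[],
  prompt: string,
): Promise<{ model: string; text: string }> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return { model: model.name, text: await model.generate(prompt) };
    } catch (err) {
      lastError = err; // record the failure and try the next provider
    }
  }
  throw new Error(`all providers failed: ${String(lastError)}`);
}
```

The point isn't the fallback itself — it's that each SDK touches exactly one adapter, so churn like the @google/generative-ai deprecation stays contained.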
Context vs. Speed
Gemini 2.5 Flash has a 1M token context window. That’s genuinely impressive for long-form tasks. For a debate platform with 6 rounds of ~150-word arguments each, that’s roughly 1,000 tokens of context at peak. We’re not using the headline feature.
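The back-of-envelope math is simple enough to sketch. The ~1.3 tokens-per-word ratio below is a common heuristic for English text, not something we measured:

```typescript
// Rough context budget for a debate: all prior arguments fit in context.
// tokensPerWord ≈ 1.3 is a common English-text heuristic (an assumption).
export function estimateContextTokens(
  rounds: number,
  wordsPerArgument: number,
  tokensPerWord = 1.3,
): number {
  return Math.round(rounds * wordsPerArgument * tokensPerWord);
}

// 6 rounds of ~150-word arguments lands on the order of a thousand tokens —
// a rounding error against a 1M-token window.
const peak = estimateContextTokens(6, 150);
```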
What matters for debates is latency to first token. Claude Haiku 4.5 (Anthropic’s current fast tier, released October 2025) and GPT-4.1 mini both start streaming quickly on short-form generation. Gemini Flash is fast on throughput — but our tests during the evaluation period showed unpredictable first-token jitter through the SDK layer.
In a live-streamed debate where the audience is watching a cursor blink, a 300ms delay feels like a pause. A 1.2s delay feels like a crash.
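What "latency to first token" means in practice can be shown with a small harness — a hypothetical measurement helper, not our actual benchmark code. It works against any async iterable of text chunks, so it can wrap a real streaming response or a mock:

```typescript
// Measure time-to-first-token (TTFT) over any async stream of text chunks.
export async function measureTtft(
  stream: AsyncIterable<string>,
): Promise<{ ttftMs: number; totalMs: number; text: string }> {
  const start = Date.now();
  let firstTokenAt: number | null = null;
  let text = "";
  for await (const chunk of stream) {
    if (firstTokenAt === null && chunk.length > 0) {
      firstTokenAt = Date.now(); // the moment the audience sees movement
    }
    text += chunk;
  }
  return {
    ttftMs: (firstTokenAt ?? Date.now()) - start,
    totalMs: Date.now() - start,
    text,
  };
}

// Mock stream: simulates a model that "thinks" before its first chunk.
export async function* mockStream(
  firstDelayMs: number,
  chunks: string[],
): AsyncGenerator<string> {
  await new Promise((r) => setTimeout(r, firstDelayMs));
  for (const c of chunks) yield c;
}
```

A harness like this is provider-neutral, which is what made it possible to compare Claude, GPT-4.1 mini, and Gemini Flash on the metric that actually matters for a live debate.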
The Google Dependency Problem
There’s a more philosophical reason too.
AIRoundtable is a Wender Media experiment. We build things we control. Anthropic has a stable API, clean documentation, and a predictable versioning strategy. OpenAI is expensive but boring in the good way — you know what you’re getting.
Google is a different kind of company. They kill products. They restructure teams. They rename APIs. The @google/generative-ai SDK was deprecated less than a year after launch. Google AI Studio and Vertex AI remain separate products with separate documentation, authentication flows, and SDK surfaces — which creates its own integration overhead.
Depending on Google for a core component of a live AI platform means inheriting their pace of change.
What We’d Need to Change Our Mind
We’re not closed to Gemini. If we see:
- 12 months of stable SDK (@google/genai with no breaking changes in GA)
- Benchmark data showing consistent latency parity with Claude Haiku 4.5 on short-form generation
- A clear long-term commitment from Google to the Gemini API surface — not just the underlying models
…then Gemini earns a seat at the table. Literally.
Until then, our CONTRA side runs GPT-4.1 mini — still available via the OpenAI API and stable for our workload, even though OpenAI has since moved on to the GPT-5.x family (GPT-5.4 mini is their current recommended small model as of March 2026). We’ll evaluate a GPT-5.x upgrade once our latency benchmarks for the new generation are complete. The PRO side stays with Claude.
The Irony
The funniest part of this whole decision? We’re writing about it on a site built with Astro, deployed on Netlify, tracked by (minimal, privacy-first) analytics, and served through Hetzner.
No Google in the stack anywhere. It wasn’t a political decision. It’s just that for every piece of this project, something else worked better.
That’s the actual evaluation framework: does it work? Is it stable? Can I rely on it in production?
Gemini isn’t there yet — for our use case. Maybe it will be.
AIRoundtable is an open experiment by Wender Media. The debate platform is live at app.airoundtable.de. No account needed.