Gemini 3.5 Pro Misses June GA Window: Six Weeks After Google I/O, Developers Are Still Waiting
TL;DR
At I/O on May 19, Google promised general availability for Gemini 3.5 Pro "next month." As of June 24, the model remains in limited enterprise preview only. Prediction markets put the odds of a launch by June 30 at 50-55%. Here is what the 2M-token context window and Deep Think mean in concrete cost and deployment terms.
My read: Gemini 3.5 Pro's June slip puts Google at a disadvantage for Q3 enterprise procurement cycles. Claude Fable 5 moved behind a paywall on June 23, OpenAI is adjusting o3 pricing at the same time, and the window for a neutral evaluation environment is closing. If you are running flagship LLM evaluations for an enterprise right now, have you already dropped Pro from the comparison, or are you still holding the decision for it? And if you have found a way to run a credible comparative evaluation without a GA release, I would genuinely like to know how.
On May 19, 2026, at the end of Google I/O, Sundar Pichai told the audience that Gemini 3.5 Flash was available that day, and Gemini 3.5 Pro would reach general availability next month. Six weeks have now passed. As of June 24, Gemini 3.5 Pro remains in limited enterprise preview on Vertex AI, with no official pricing published.
Polymarket, a prediction market, currently puts the probability of a release before June 30 at 50-55%.
What Was Promised and Where Things Stand
Gemini 3.5 Flash launched on schedule at I/O. It is now deployed across the Gemini API, Google Search AI Mode, Google Antigravity, and several other channels. According to Google’s official release post, Flash scores 76.2% on Terminal-Bench 2.1 and 84.2% on CharXiv multimodal reasoning. These are Google’s own numbers without independent verification, but at least Flash has been running in real environments for six weeks.
Pro is a different situation. TechTimes reported on June 6 that the model was nearing launch, but as of June 24, only a limited enterprise preview through Vertex AI is available. June 30 is the last day the “next month” promise technically holds. Six days remain.
Three Reasons the Pro Spec Sheet Generated Genuine Interest
The 2-million-token context window is the most-discussed spec in the announced feature set. Among production models, Claude Opus 4.8 tops out at 200K tokens and GPT-4.5 at 128K. If Gemini 3.5 Pro actually delivers 2M tokens without quality degradation at long context lengths, that is more than a headline number: it is a structural advantage for enterprise workflows that process complete codebases, full legal documents, or long-form financial packages.
Deep Think reasoning mode is Google's implementation of the slow-thinking model category: plan before generating, reflect mid-response, then output. The design intent is comparable to OpenAI o3's chain-of-thought and Claude's extended thinking. Whether its reasoning quality is actually differentiated can only be established by independent benchmarking after launch. Until then, Deep Think is a feature description, not a measured result.
Frontier multimodal understanding is the positioning core of the 3.5 series. Flash already scores 84.2% on CharXiv, showing measurable progress on understanding structured scientific visuals. Pro is expected to push that further, but independent capability verification requires a live model.
What the Numbers Actually Mean
With no official pricing published, current estimates put Gemini 3.5 Pro at roughly 10x Gemini 3.5 Flash on input tokens, which works out to approximately $15 per million input tokens and $60 per million output tokens.
Translate that to a real workload: filling a single 2M-token context window costs $30 in input tokens alone (2M tokens at $15 per million). An enterprise pipeline making 100 such full-window calls per day spends about $3,000 per day, or roughly $90,000 over a 30-day month, before any output tokens are counted. At that price point, Pro is positioned as a high-stakes decision model rather than a conversational workhorse.
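The arithmetic above can be reproduced with a short sketch. The two rates are the estimated figures quoted in this article, not published Google pricing, and the function names and the 4,000-token output length in the last line are illustrative assumptions.

```python
# Estimated rates from this article; Google has published no official pricing.
INPUT_USD_PER_MTOK = 15.0   # assumed USD per million input tokens
OUTPUT_USD_PER_MTOK = 60.0  # assumed USD per million output tokens


def call_cost_usd(input_tokens: int, output_tokens: int = 0) -> float:
    """Cost of a single call in USD at the assumed per-million-token rates."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def monthly_cost_usd(calls_per_day: int, input_tokens: int,
                     output_tokens: int = 0, days: int = 30) -> float:
    """Cost of a steady daily workload over a month of `days` days."""
    return calls_per_day * days * call_cost_usd(input_tokens, output_tokens)


# One full 2M-token window, input only.
print(f"per call: ${call_cost_usd(2_000_000):,.2f}")

# 100 full-window calls per day over a 30-day month, input only.
print(f"per month: ${monthly_cost_usd(100, 2_000_000):,.2f}")

# Same workload with a hypothetical 4,000 output tokens per call.
print(f"per month with output: ${monthly_cost_usd(100, 2_000_000, 4_000):,.2f}")
```

At these assumed rates the input side dominates: a few thousand output tokens per call moves the monthly total by less than one percent, so the budget question is mostly about how often the full window is actually filled.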
Flash's benchmark numbers are worth unpacking. Terminal-Bench 2.1 at 76.2%, GDPval-AA Elo at 1656, and MCP Atlas at 83.6% are all Google-measured. The LMSYS Chatbot Arena Elo ranking is typically the figure that better reflects real-world user experience. For Pro, that number will only become meaningful one to two weeks after GA launch, once the user community has had actual hands-on time.
There is also a concrete engineering question that matters independent of benchmarks: does quality degrade as context length approaches 2M tokens? Existing models in the 128K-200K range show measurable drops in retrieval precision beyond 80% context fill. If Google has not addressed this at the architecture level, the effective usable range could be significantly lower than the headline number.
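One common way to probe this question is a needle-in-a-haystack test: bury a single fact at varying depths in filler text, vary how full the window is, and check whether the model returns the fact. The following is a minimal sketch of such a harness, not a description of how Google or any lab tests. It counts words as a crude proxy for tokens, `ask_model` is a hypothetical callable standing in for a real API client, and `stub_model` is a toy that only "sees" the last 600 words, so its scores say nothing about any real model.

```python
import random
from typing import Callable, Dict, List

FILLER_WORD = "lorem"
NEEDLE_TEMPLATE = "The access code is {code}."


def build_prompt(total_words: int, depth: float, code: str) -> str:
    """Insert one needle sentence at a relative depth (0.0 = start, 1.0 = end)."""
    words = [FILLER_WORD] * total_words
    words.insert(int(depth * total_words), NEEDLE_TEMPLATE.format(code=code))
    return " ".join(words)


def retrieval_accuracy(ask_model: Callable[[str], str], window_words: int,
                       fills: List[float], depths: List[float],
                       seed: int = 0) -> Dict[float, float]:
    """Fraction of needles recovered at each context-fill level."""
    rng = random.Random(seed)
    scores: Dict[float, float] = {}
    for fill in fills:
        hits = 0
        for depth in depths:
            code = str(rng.randint(100_000, 999_999))
            prompt = build_prompt(int(fill * window_words), depth, code)
            if code in ask_model(prompt):
                hits += 1
        scores[fill] = hits / len(depths)
    return scores


def stub_model(prompt: str) -> str:
    """Toy stand-in for a real model: echoes only the last 600 words it was given."""
    return " ".join(prompt.split()[-600:])


results = retrieval_accuracy(
    stub_model,
    window_words=1_000,  # pretend window size, in words
    fills=[0.25, 0.5, 0.8, 1.0],
    depths=[0.0, 0.25, 0.5, 0.75, 1.0],
)
for fill, score in results.items():
    print(f"fill {fill:.0%}: retrieval accuracy {score:.0%}")
```

A real evaluation would replace `stub_model` with a call to the provider's API, measure fill with the provider's tokenizer rather than word counts, and use natural text with many needles per cell, since a single repeated filler word is far easier than real documents.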
The delay also has a competitive cost. Claude Fable 5 moved behind a paywall on June 23. Enterprise buyers reassessing their vendor mix are making decisions now, not in July. Every week Pro is unavailable is a week where alternatives get evaluated first.
Signals Worth Watching
June 30 is a verifiable checkpoint: does Gemini 3.5 Pro reach general API availability before the end of the month? If it slips past June, Google will need a new narrative heading into July.
After GA, three specific numbers are worth tracking in the first month. First, LMSYS Chatbot Arena overall Elo: does Gemini 3.5 Pro rank above Claude Opus 4.8 and GPT-4.5? That community-generated figure carries more practical weight than any internal benchmark. Second, independent long-context quality testing: does retrieval precision hold above 100K tokens or does it drop systematically? This requires external research validation, not Google’s own results. Third, enterprise procurement signals: where Fortune 500 AI spend concentrates in the Q3 reporting cycle is a proxy for Pro’s real-world adoption curve.
If June 30 passes without a GA release, the delay compounds against an already difficult month for Google DeepMind. Two prominent researchers left for competitors inside a week. Rival frontier models shipped on schedule. The longer Gemini 3.5 Pro is absent from the GA market, the harder the explanation becomes.