Inside the making of Gemini 3 - how Google's slow and steady approach won the AI race (for now)
Publish Time: 26 Nov, 2025
NurPhoto / Contributor / NurPhoto via Getty



When I walked into a conference room in Google's San Francisco building last week, I expected to find the typical tech briefing setup with rows of chairs facing a wall of screens and a corporate voice managing a slide deck. 

Instead, I found myself in what looked more like group therapy with a large circle of cozy chairs arranged around the center of the room. About a dozen carefully selected testers and creators, including myself, sat down with the team behind Gemini 3, which had just gone public, and Nano Banana Pro, which would debut the next day.

Also: Google's Gemini 3 is finally here and it's smarter, faster, and free to access

That rapid release schedule couldn't have been more telling. The AI industry is in the midst of an unprecedented race, with OpenAI, Anthropic, Google, and others locked in a constant scramble to capture user attention and prove their models deliver more value than the rest.

With Tulsee Doshi (senior director and head of product for Gemini Models), Logan Kilpatrick (group PM lead for Gemini API), and Nicole Brichtova (product lead for image & video) sitting across from me, I got a fascinating look at the decisions, tradeoffs, and challenges behind these high-profile launches.

Here are three details that stood out during our 75-minute conversation.

Why Gemini 3 took longer than expected

The gap between Gemini 2.5 Pro's debut at Google I/O in May and Gemini 3's arrival in November felt significant, especially given the rapid pace of AI development across the industry. When the topic of timeline came up, Doshi explained that the extra time came down to a two-pronged approach.

On the pre-training side, the team set ambitious goals around reasoning performance and multimodality. They wanted "state-of-the-art reasoning" with real "nuance and depth." But the bigger factor was post-training work focused on usability improvements like better tool use and refining the model's persona based on extensive feedback they gathered from the 2.5 release.

The team had learned a hard lesson from their previous experimental model release strategy. 

Also: Want to ditch ChatGPT? Gemini 3 shows early signs of winning the AI race

"We had done this sort of experimental model release train multiple times," Doshi said, "and a lot of the feedback from the developer audience was [that] this caused a lot of churn for people." Developers were waking up every morning to find things drastically different. That required them to test new experimental Gemini models, which carried with them a "true cognitive and time cost."

This time, they took a different approach. "We spent a much longer iteration cycle of giving the model to folks, getting feedback, using that feedback to iterate on the models more, doing that round a few times," Doshi explained. The last few weeks became an intense sprint of triaging issues, identifying whether problems were in serving or the model itself, and fixing what they could.

Also: Google's Nano Banana image generator goes Pro - how it beats the original

Kilpatrick added that coordinating launches across multiple Google services created an extra layer of complexity. "It's really difficult to get all of Google on the same page, and spin up the infrastructure to support this model for hundreds of millions of customers," he said. The goal was to simultaneously ship across the Gemini app, Google Search, and AI Studio, which required far more coordination than previous launches.

The philosophy driving these decisions was clear: "We try not to be as date-driven as we try to be quality-driven," Doshi noted. The team wanted to avoid shipping an unpolished product and essentially testing and iterating in public. Instead, they opted to do it behind closed doors.

Gemini 3 is helping build Gemini 4

Doshi went on to say, "The volume of feedback that came in was almost actually more than we even could properly manage." 

As I sat there, I had a sense of what might help in that case, so I asked, "How much, if at all, do you use the Gemini model to analyze and understand the success of the Gemini model?" To my surprise, Doshi's response was immediate: "A lot, actually. It's been actually pretty awesome."

The team uses Gemini extensively to cluster feedback and identify patterns from the massive influx of user reports. But Doshi was careful to note an important balance. "I want a lot of our teams to build empathy, and some of that empathy goes away if you abstract too high." If Gemini fully abstracts the feedback, teams might lose touch with the actual pain points users are experiencing. So they use Gemini to surface the patterns but keep the team reading real user feedback, staying close to users' frustrations.
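
Google didn't share its internal tooling, but the general shape of that workflow is easy to picture. Here's a minimal sketch of LLM-assisted feedback clustering using the public google-genai Python SDK; the model ID, category list, and prompt are my own illustrative assumptions, not Google's pipeline:

```python
# Minimal sketch of LLM-assisted feedback clustering, assuming the public
# google-genai SDK (pip install google-genai). Model ID, categories, and
# prompt are illustrative assumptions, not Google's internal tooling.
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

CATEGORIES = ["tool use", "persona/tone", "latency", "factuality", "other"]

def cluster_feedback(reports: list[str]) -> str:
    """Ask the model to group raw user reports into recurring themes."""
    numbered = "\n".join(f"{i}. {r}" for i, r in enumerate(reports, 1))
    prompt = (
        "Group the following user feedback reports into themes. "
        f"Prefer these categories where they fit: {', '.join(CATEGORIES)}. "
        "For each theme, list the report numbers and quote one "
        "representative report verbatim.\n\n" + numbered
    )
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # assumption: any text-capable Gemini model
        contents=prompt,
    )
    return response.text

print(cluster_feedback([
    "It keeps picking the wrong tool for file search.",
    "The new persona feels colder than 2.5 Pro's did.",
]))
```

Asking for representative reports quoted verbatim, rather than only summaries, is one way to preserve the empathy Doshi describes: reviewers still read real user language alongside the clusters.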

Also: Google just rolled out Gemini 3 to Search - here's what it can do and how to try it

Beyond analyzing feedback, they're also using Gemini to write tools that speed up their testing process. Kilpatrick's team has taken this even further on the product side. 

"We're more so continuously coding using Gemini 3, which has been a huge accelerant of making the UI better," he said. Taking it one big step further, Kilpatrick added, "Gemini 4 is going to be created by Gemini 3. Maybe some of the product experiences of how you interact with Gemini 4 are being created right now by Gemini 3."

Doshi was quick to add, "I don't know that I would go as far as to say Gemini built Gemini, but I think it's very close to how we take all of these various pieces and have Gemini accelerate."

Text rendering finally works (mostly)

One of Nano Banana Pro's most impressive improvements is something that has taken AI a long time to master: the text in AI-generated images finally looks accurate.

Nicole Brichtova walked us through examples of infographics created with remarkably simple prompts. 

When looking at these examples on the big screen in the room, I found myself scrutinizing every word, searching for the telltale signs of AI-generated text: misspellings, made-up words, and the seemingly alien nonsense characters that have plagued image generation models to date. To my surprise, this incredibly complex infographic was spotless.

The improvement in what Brichtova called the "cherry-pick rate" has been dramatic compared to the previous version of Nano Banana. "It used to be that you had to generate 10 of these and then maybe one of them actually had perfect text," she said. "And now you'll make 10 and maybe one or two of them you can't actually use." In other words, usable output has jumped from roughly one in 10 generations to eight or nine in 10.

Also: I tried NotebookLM's new visual aids - it said I went to 'Borkeley'

What made the progress even more striking was how sophisticated the failures had become. Doshi mentioned looking at examples from a couple of months earlier, where errors were obvious; in more recent outputs, she found herself questioning whether plausible-looking words were real at all. "It looked legitimate, like it wasn't funny or anything -- but no, it was not a real word." The model had gotten so good that it could create convincing fake words that looked like they belonged in the English language.

One tester in the room shared their experience using Nano Banana Pro to generate an infographic from a research paper. The first attempt worked beautifully, and the first couple of iterations to refine it went well. But by the fifth round of edits, things fell apart, and the model started making up words and even dropping in fragments of other languages.

Brichtova acknowledged this as a known limitation. "Multi-turn is something that we continue to get better at," she said. "After you get into turn three, you basically have to reset your conversation. The longer you have a conversation with this model, the more it can fall apart." She emphasized it's an area they're actively working on, though for single-shot generation, the quality has reached an impressive level.
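
The article doesn't describe Nano Banana Pro's API, but the "reset around turn three" advice maps onto any chat-style interface. Below is a hedged sketch of the pattern using the google-genai SDK's text chat; the model ID and turn budget are assumptions, and for images you would carry the latest image forward into the fresh session rather than text:

```python
# Sketch of the "reset after a few turns" pattern Brichtova describes,
# assuming the google-genai SDK's chat interface. The model ID and the
# turn budget are illustrative assumptions, not documented limits.
from google import genai

client = genai.Client()
MODEL = "gemini-2.5-flash"  # assumed model ID
MAX_TURNS = 3               # reset budget, per the advice above

def iterate_with_resets(initial_prompt: str, edits: list[str]) -> str:
    """Apply a series of edits, starting a fresh chat every few turns."""
    chat = client.chats.create(model=MODEL)
    latest = chat.send_message(initial_prompt).text
    turns = 1
    for edit in edits:
        if turns >= MAX_TURNS:
            # Long conversations drift (made-up words, stray languages),
            # so reseed a new session with the latest draft instead.
            chat = client.chats.create(model=MODEL)
            latest = chat.send_message(
                f"Current draft:\n{latest}\n\nApply this edit: {edit}"
            ).text
            turns = 1
        else:
            latest = chat.send_message(edit).text
            turns += 1
    return latest
```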

Little time to celebrate

An AI Jason Howell in a Nano Banana Pro sweater.

Jason Howell

After 75 minutes of candid conversation, I joined the group for a few hands-on demos of both Gemini 3 and Nano Banana Pro. One moment that stood out for me was seeing Nano Banana Pro generate images of my face with notable accuracy. I've tested plenty of image generators, but this was the first time I had trouble distinguishing the AI-generated version from a real photo. The adherence to my actual facial features was spot on, and the holiday sweater was a nice bonus, too.

What struck me most, though, wasn't just the technology being shown off but the mood in the room. Despite Gemini 3's successful launch the day before and the obvious excitement around Nano Banana Pro's release to follow, there was a notable hesitation among the team to celebrate too early.

Given how positively people had responded to both Gemini 3 and the viral success of the original Nano Banana, I thought Nano Banana Pro was a slam dunk. However, the team wasn't ready to give themselves high-fives just yet. They wanted to see the launch land successfully first. And even then, the celebration would be brief, because the breakneck pace of AI development meant they'd need to get right back on that treadmill to prepare for the next release.

In an industry where companies race to ship the next big model, Google's approach stood out for its willingness to delay for quality, iterate based on concrete feedback, and use its own AI to build better AI. Perhaps most telling, however, was witnessing a team that, even after a major win, understood there's little time to rest.
