Google replaces Google Assistant, Siri to use Gemini, and more
Check out the platform that's making it easier for builders to ensure their AI agents never fail a customer by connecting them to real-time human help
Welcome to Voice AI Weekly!
If you are new here: every week we share the biggest news from the voice AI space that has hit our radar. We surf the web so you don’t have to.
In today’s weekly roundup we have:
Google replaces Google Assistant with Gemini
Apple to integrate Gemini model into Siri
Anthropic’s new funding
Tesla’s new voice AI integrations in China
♊Google replaces Google Assistant with Gemini
What is it?
Google is replacing Google Assistant with Gemini for Home, a new LLM-based assistant, and shipping a full refresh of its Nest hardware line to support it.
The details:
The new assistant is built on Google's Gemini models, designed to handle complex, multi-step commands and natural language queries.
It enables complex, natural language commands and continuous conversation without repeated wake words through Gemini Live.
The launch includes new hardware: a Google Home speaker and upgraded Nest cameras built to support the new agent.
It will roll out in early access on October 1, 2025, alongside a rebranded "Google Home Premium" subscription.
What it means for the voice AI industry:
Consumer expectations for voice interaction are shifting permanently. Static, command-based agents are now legacy technology.
The new baseline is a fluid, interruptible, and conversational experience. Google's Gemini Live, balancing on-device and cloud processing, is a critical case study for building these real-time agents.
This move sets a new standard, raising user expectations everywhere and compelling the entire industry to build more responsive and capable voice systems.
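Gemini Live's internals aren't public, but the core pattern behind an interruptible assistant is "barge-in": cancel speech playback the instant the user starts talking. Here's a minimal sketch in Python, with stand-in stubs for TTS playback and voice activity detection (all names here are hypothetical illustrations, not Google's API):

```python
# Illustrative barge-in sketch. speak() and detect_user_speech() are
# hypothetical stubs standing in for real TTS and VAD pipelines.
import asyncio

async def speak(text: str) -> None:
    """Stand-in for streaming TTS playback: one word per tick."""
    for word in text.split():
        print(word, end=" ", flush=True)
        await asyncio.sleep(0.2)
    print()

async def detect_user_speech(delay: float) -> None:
    """Stand-in for a VAD signal that fires when the user starts talking."""
    await asyncio.sleep(delay)

async def respond(text: str, user_speaks_after: float) -> None:
    playback = asyncio.create_task(speak(text))
    barge_in = asyncio.create_task(detect_user_speech(user_speaks_after))
    done, pending = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )
    if barge_in in done:
        # Stop talking the moment the user does, then hand over the floor.
        print("\n[interrupted: yield the floor and start listening]")
    for task in pending:
        task.cancel()

asyncio.run(respond("Here is a long answer about your thermostat schedule", 0.5))
```

The design point is that listening and speaking run concurrently, so the agent can yield mid-sentence instead of finishing a canned reply.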
🍎Apple to integrate Gemini model into Siri
What is it?
Apple will reportedly integrate Google's Gemini into Siri to handle complex web queries and summarization. The model will run on Apple's private cloud, not Google's, with a planned release in spring 2026.
The details:
The integration is part of a new feature called "World Knowledge Answers," launching with iOS 26.4, due in spring 2026.
Apple's proprietary models will process sensitive on-device data like messages and calendar events, while the Gemini model handles web search, summarization, and complex reasoning.
Gemini will operate on Apple's Private Cloud Compute, ensuring no user data is stored by Google and that all processing remains within Apple's ecosystem.
This move addresses Siri's well-documented technical limitations and its inability to compete with the capabilities of modern LLMs.
What it means for the voice AI industry:
This is a major validation for a modular, hybrid approach to building agents. Even resource-rich Apple has concluded that a stack of best-in-class components is superior to a single monolithic model.
Their use of Private Cloud Compute to run a third-party model sets a new privacy standard, proving external models can be leveraged without compromising user data.
For developers, the key question is integration: can a system of distinct models, each with unique data and responses, provide a coherent user experience, or will the seams between the on-device and cloud brains always show?
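Apple hasn't published an API for this split, but the routing idea is simple enough to sketch: queries touching personal context stay on-device, everything else goes to the larger cloud model. The keyword classifier below is a deliberately crude, hypothetical stand-in, not Apple's implementation:

```python
# Hypothetical hybrid router: personal-data queries stay on-device,
# general world-knowledge queries go to the cloud model.
from dataclasses import dataclass

PERSONAL_TOPICS = {"message", "calendar", "email", "contact", "photo"}

@dataclass
class Route:
    target: str   # "on_device" or "private_cloud"
    reason: str

def route(query: str) -> Route:
    words = set(query.lower().split())
    if words & PERSONAL_TOPICS:
        # Personal context never leaves the device.
        return Route("on_device", "touches personal data")
    # Everything else can use the larger cloud model for search and reasoning.
    return Route("private_cloud", "general world knowledge")

print(route("summarize my calendar for tomorrow"))   # -> on_device
print(route("who won the 2022 world cup final"))     # -> private_cloud
```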
⚡Tesla integrates with DeepSeek and ByteDance for in-car voice assistant
What is it?
Tesla is deploying a dual LLM architecture from DeepSeek and ByteDance for its in-car voice assistant in China.
The details:
This integration uses a dual-model architecture: DeepSeek's LLM for conversational chat and ByteDance's Doubao for low-latency vehicle commands.
This split enables complex reasoning alongside reliable, fast execution of core functions like navigation and climate control.
Because Tesla cannot deploy its US-trained models in the region, this partnership was essential to counter its declining market share against local competitors offering advanced voice assistants as standard.
What it means for the voice AI industry:
A global voice product requires a modular architecture. Data sovereignty forces companies to plug in different regional models and infrastructure.
A dual AI approach is also key. Tesla separated high-reasoning tasks from low-latency control. This split-brain architecture proves a single general-purpose model is often not optimal for complex applications.
For builders, high-performance voice interaction is now table stakes. The key question is whether to build with a single provider or assemble a best-in-class stack for specific tasks.
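Neither partner has published the stack, but the split-brain idea reduces to a routing decision: match known vehicle-command intents to a fast path, and send everything else to the reasoning model. A hedged sketch, with an illustrative intent table and fake model calls (none of this is Tesla's actual code):

```python
# Hypothetical split-brain dispatcher: a small low-latency model handles
# deterministic vehicle commands, a larger model handles open conversation.
COMMAND_INTENTS = {
    "navigate": "set_navigation",
    "temperature": "set_climate",
    "windows": "control_windows",
}

def handle_utterance(text: str) -> str:
    lowered = text.lower()
    for keyword, action in COMMAND_INTENTS.items():
        if keyword in lowered:
            # Fast path: a low-latency model (Doubao's role in this split)
            # maps speech to a vehicle action within a tight latency budget.
            return f"fast_model -> {action}({text!r})"
    # Slow path: open-ended chat goes to the reasoning model
    # (DeepSeek's role), where latency matters less than quality.
    return f"reasoning_model -> chat({text!r})"

print(handle_utterance("set the temperature to 21 degrees"))
print(handle_utterance("tell me about the history of Shanghai"))
```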
Poku Labs: A platform providing human-in-the-loop tools that enable AI agents to seamlessly request real-time help from humans during customer interactions.
The Problem:
Fully autonomous agents are a black box. They fail silently on edge cases or when they need human approval, and by the time you see the error in a log, the customer is already gone.
The Build:
Poku Labs built human-in-the-loop tools for AI agents, enabling them to escalate to a human for input or approval in real time. This unlocks high-stakes use cases that AI agents can’t reliably handle alone.
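Poku Labs' actual API isn't shown here, but the escalation pattern itself looks something like the sketch below: gate high-stakes actions behind a real-time human decision instead of letting the agent guess. `request_human_approval` is a hypothetical stand-in:

```python
# Generic human-in-the-loop escalation pattern. request_human_approval()
# is a hypothetical stand-in for a real-time channel to a human operator.
HIGH_STAKES_ACTIONS = {"issue_refund", "cancel_subscription"}

def request_human_approval(action: str, details: str) -> bool:
    """Block until a human approves or declines the pending action."""
    answer = input(f"Approve {action} ({details})? [y/n] ")
    return answer.strip().lower() == "y"

def execute(action: str, details: str) -> str:
    if action in HIGH_STAKES_ACTIONS:
        # Pause the agent and wait for a human decision instead of
        # failing silently on an edge case.
        if not request_human_approval(action, details):
            return "escalated: human declined, handing off to support"
    return f"done: {action}({details})"

print(execute("issue_refund", "an illustrative refund request"))
```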
If you're attending Vapicon, be sure to check out their workshop on human-in-the-loop tools.
Google avoids Chrome divestiture in antitrust ruling, now faces new rules on search distribution and data sharing [Read more]
Anthropic raises $13B at a $183B valuation to scale enterprise adoption and fund safety research [Read more]
xAI ships Grok Code Fast 1 for high-frequency agentic coding, trading peak reasoning for speed and lower cost [Read more]
OpenAI ships updates turning Codex into a full software agent for autonomous development in IDEs, the CLI, and GitHub [Read more]
Want to get your company featured in the Built On Vapi section?
Fill out this form and we will pick one new startup from the submissions to feature here every week.
Also feel free to just reply to this email with suggestions (we read everything you send us)!