Google has unveiled Veo 3, the latest version of its AI video generation model, which can generate audio to accompany its video, including background sounds, sound effects, and spoken dialogue.
The model is currently only available in the United States within the Gemini app and for enterprise users on Vertex AI. It’s also available in Flow, Google’s new AI filmmaking tool.
In a blog post, Eli Collins, a vice president at Google DeepMind, said users can tell a short story in their prompt, and the model will create a clip that brings the story to life.
Content creators have taken to social media to showcase the model's capabilities.
WE CAN TALK! I spent 2 hours playing with Veo 3 @googledeepmind and it blew my mind now that it can do sound! It can talk, and this is all out of the box… pic.twitter.com/ufplpcZWbq
— Ari K (@arikuschnir) May 20, 2025
Google also announced updates to Veo 2, including a feature that lets users give the model images of characters, scenes, objects, and styles for better consistency.
Veo 2 can understand camera movements such as rotations, dollies, and zooms, and it lets users add objects to videos or erase them, the company added.