
Google DeepMind unveils Genie 3, a real-time interactive world model
On August 5, 2025, Google DeepMind unveiled Genie 3, a foundation world model capable of generating interactive 3D environments from text prompts at 720p resolution and 24 frames per second, sustaining a few minutes of continuous user interaction per episode, according to DeepMind's research blog and TechCrunch. The model is positioned as a general-purpose successor to Genie 2 and as training infrastructure for embodied AI agents.
Genie 3 follows the original Genie, announced in 2024, and Genie 2, which followed in late 2024. Both generated short, lower-resolution interactive scenes. The 2025 release is the first version DeepMind frames as broad-domain rather than tied to any single environment. It also handles promptable mid-episode world events (weather changes, new objects, additional characters) in real time, per the company's release notes and TechCrunch.
Shlomi Fruchter, a research director at DeepMind, told TechCrunch that "Genie 3 is the first real-time interactive general-purpose world model. It goes beyond narrow world models that existed before. It's not specific to any particular environment. It can generate both photo-realistic and imaginary worlds, and everything in between." Research scientist Jack Parker-Holder added that "we think world models are key on the path to AGI, specifically for embodied agents, where simulating real world scenarios is particularly challenging," in remarks reported by TechCrunch and Yahoo Finance.
Technically, Genie 3 produces frames auto-regressively, conditioning each new frame on the trajectory of prior frames to maintain spatial and temporal consistency, according to DeepMind's blog. Episodes run at 24 fps and 720p; episode length is described as "a few minutes" — markedly longer than the seconds-scale outputs of Genie 2. The model is offered in research-preview form rather than as a public API, and access is gated behind DeepMind's research partnerships and the SIMA agent-training pipeline.
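To make the auto-regressive loop concrete, here is a minimal Python sketch. It is not DeepMind's implementation, which has not been published as code. The names toy_next_frame, rollout, and frames_for are hypothetical, a "frame" is reduced to a single integer so the control flow stays visible, and the 48-frame context window is an assumed value rather than a reported one.

```python
from collections import deque

FPS = 24             # frame rate reported for Genie 3
CONTEXT_FRAMES = 48  # hypothetical context window; not a published figure


def toy_next_frame(history, action):
    """Stand-in for a learned model: returns the next 'frame'.

    A real world model would predict pixels from the prior trajectory;
    here a frame is one integer and the 'prediction' is simple addition.
    """
    return history[-1] + action


def rollout(first_frame, actions, context_frames=CONTEXT_FRAMES):
    """Generate one frame per action, each conditioned on preceding frames."""
    history = deque([first_frame], maxlen=context_frames)
    frames = [first_frame]
    for action in actions:
        frame = toy_next_frame(history, action)
        history.append(frame)  # the new frame becomes context for the next step
        frames.append(frame)
    return frames


def frames_for(seconds, fps=FPS):
    """Number of frames needed to cover an episode of the given length."""
    return seconds * fps


if __name__ == "__main__":
    print(rollout(0, [1, 1, -1]))  # [0, 1, 2, 1]
    print(frames_for(120))         # 2880
```

The sketch shows why long episodes are hard: every frame depends on frames the model itself produced, so small errors can compound. At 24 fps, an illustrative two-minute episode already means 2,880 sequential predictions.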

Press reaction emphasized two threads. TechCrunch and Wired framed the model as a credible step toward simulation-based training of embodied agents, citing DeepMind's framing of world models as a path to AGI. Independent analysts including Ben Dickson at BD Tech Talks cautioned that interactive episodes still drift over multi-minute spans and that the model is not yet a substitute for game engines or physics simulators in production training pipelines.
For us at Enpo Sekai, Genie 3 is not a tool we expect to put into a shipped product in the near term — our work is in characters and persona, not procedurally generated environments. But the same underlying capability shifts what a small character studio can prototype. A real-time world model that responds to a character's behavior gives us a cheaper way to test how a persona reads under different settings, lighting, and crowd density before we commit to a stage in a real game build.
We will be watching three things over the next twelve months: (1) whether DeepMind opens Genie 3 access beyond research partnerships, and on what licensing terms, since that determines who outside Big Tech can actually prototype with it; (2) how episode length and consistency improve, since "a few minutes" is still well short of game-session scale; and (3) whether competing labs, particularly NVIDIA, Runway, and Chinese players, release equivalent or stronger general-purpose world models, which would force convergence on common evaluation benchmarks for this category.


