DeepMind unveils Genie 2: AI model creating immersive 3D worlds from text prompts – how it works



DeepMind has introduced Genie 2, an innovative artificial intelligence model capable of generating playable and immersive 3D worlds. Building on its predecessor, Genie, which could transform single images into interactive environments, Genie 2 takes this concept further by crafting dynamic and realistic virtual worlds from text prompts or images.

In a recent blog post, Google DeepMind described Genie 2 as a large-scale foundation world model designed to create intricate 3D simulations. A simple prompt like “a warrior in snow” can produce an expansive interactive world in which users explore a snowy environment as a warrior character. The generated settings include physics-based interactions such as jumping, swimming, and object manipulation, along with realistic lighting effects.

Genie 2’s advanced capabilities stem from its training on a vast dataset of videos, enabling it to generate coherent and visually rich environments. According to DeepMind, the AI can create consistent worlds with varying perspectives — including first-person and isometric views — that last up to a minute, with most spanning 10 to 20 seconds.

The model operates through an auto-regressive process, generating video frame by frame based on prior frames and user actions. Genie 2 starts from a single image: when given a text prompt, Imagen 3, Google DeepMind’s text-to-image model, first produces an image that matches the description, and Genie 2 builds the world from there. Users can then navigate and interact with the virtual environment via keyboard inputs.
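To make the frame-by-frame idea concrete, here is a minimal Python sketch of an auto-regressive rollout. It is an illustration only, not DeepMind’s implementation: the names `toy_next_frame` and `rollout`, the `context` window, and the one-dimensional “frame” are all assumptions invented for this example, and a hand-written rule stands in for the learned model.

```python
from typing import Callable, List, Sequence

# Toy stand-in for an image: a 1-D strip where the single 1 marks the character.
Frame = List[int]


def toy_next_frame(history: Sequence[Frame], action: str) -> Frame:
    """Hypothetical predictor: builds the next frame from the most recent
    frame in the history and the latest user action. A real world model
    would be a learned network; this rule exists only for illustration."""
    last = history[-1]
    pos = last.index(1)
    step = {"left": -1, "right": 1}.get(action, 0)  # unknown keys do nothing
    new_pos = min(max(pos + step, 0), len(last) - 1)  # stay inside the strip
    frame = [0] * len(last)
    frame[new_pos] = 1
    return frame


def rollout(
    first_frame: Frame,
    actions: Sequence[str],
    predict: Callable[[Sequence[Frame], str], Frame] = toy_next_frame,
    context: int = 4,  # assumed limit on how many past frames the predictor sees
) -> List[Frame]:
    """Auto-regressive loop: each new frame is predicted from earlier
    frames plus one action, then appended so it conditions the next step."""
    frames = [first_frame]
    for action in actions:
        window = frames[-context:]
        frames.append(predict(window, action))
    return frames


# The starting frame plays the role of the prompt image.
frames = rollout([0, 0, 1, 0, 0], ["right", "right", "left", "noop"])
for f in frames:
    print(f)
```

Each pass through the loop feeds the model’s own earlier output back in as input, which is what “auto-regressive” means here. The actions affect only the character cell, a simplified picture of keyboard input steering the character rather than the rest of the scene.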

One standout feature is Genie 2’s action control. It interprets user commands intelligently, so that pressing directional keys moves a robot character rather than unrelated objects like clouds or trees. Its long-term memory allows it to remember parts of the world that have moved out of view and render them accurately when they come back into view, enhancing the continuity and realism of the experience.

While Genie 2 has significant implications for gaming, DeepMind positions it as a creative and research tool. The model’s ability to transform concept art or drawings into interactive environments opens new possibilities for digital art, design, and simulation.

DeepMind also highlights Genie 2’s potential for creating entirely novel video games where characters and worlds could be dynamically generated in real time.


