techspot.com

Robots leverage Google's Gemini AI to fold origami from simple instructions

**The big picture:** While companies continue to improve robotic hardware, developing AI software to truly bring these machines to life has remained an elusive goal. This is especially disappointing given the remarkable advancements in "smart" language models. Now, Google's AI research lab has come closer than ever to bridging this gap.

DeepMind has unveiled Gemini Robotics, an evolution of its Gemini 2.0 model that could unlock new capabilities for robots.

The goal of Gemini Robotics is to create a generalized AI system capable of directly controlling robots and helping them master the trifecta of flexibility, interaction, and dexterity. The result could be robots that adapt to novel situations, respond naturally to humans and their environment, and perform complex physical tasks.

And the lab is making steady progress. Just check out this video of ALOHA 2, a dual-armed robot from DeepMind, showcasing its skills. Not only can it precisely fold an origami figure, but it can also improvise when things don't go as planned – like when a researcher moved the container it was supposed to place fruit in.

![](https://www.techspot.com/images2/news/bigimage/2025/03/2025-03-13-image-20.jpg)

The best part is that it achieves this with simple instructions like "fold an origami fox." The researchers didn't have to manually program that ability – the robot simply leveraged its understanding of origami and how to fold paper to complete the task.

Of course, origami is just the beginning. DeepMind [claims](https://deepmind.google/technologies/gemini-robotics/) that Gemini Robotics represents a significant leap in all three key robotic abilities compared to its previous work, and that the model more than doubled performance on general task benchmarks relative to other state-of-the-art systems.

What does this mean? Gemini Robotics could usher in a new generation of robots capable of generalizing and adapting to unpredictable real-world situations without needing tailored training for every scenario. This versatility is essential for developing truly useful, general-purpose robots in the future.

To realize this potential, Google is also collaborating with Apptronik, which will handle the hardware side by building next-generation humanoid robots powered by Gemini.

Don't expect to hire a Gemini Robot butler anytime soon, though. For now, DeepMind is keeping the project in research mode, releasing a "Gemini Robotics-ER" system – the "ER" stands for embodied reasoning – that will let select "trusted testers" access the AI's reasoning capabilities for their own projects.

Trusted testers could include companies like Boston Dynamics, Agility Robotics, and Enchanted Tools.

Of course, real-world robots powered by advanced AI raise important safety concerns. DeepMind says it takes a "holistic" approach inspired by Asimov's laws of robotics and is developing evaluation standards through a new "ASIMOV" dataset. The goal is to test whether AI models understand the broader consequences of robotic actions, beyond just physical harm.
