Google DeepMind’s latest breakthrough, Gemini Robotics On-Device, marks a pivotal shift in how robots think, act, and adapt—without needing the internet. Unveiled on June 24, 2025, this AI model runs locally on robotic hardware, bringing cloud-level intelligence to edge devices. It’s optimized for bi-arm robots, enabling them to perform complex tasks like folding clothes, unzipping bags, and assembling belts—all with low-latency inference and minimal computational resources.
What makes this noteworthy is its general-purpose dexterity and its ability to generalize across tasks and robot embodiments, even in zero-connectivity environments. That removes the Achilles' heel of internet dependency that has long limited robotics deployments in remote locations, industrial settings, and other areas with unreliable connectivity.
Unlike most robotics foundation models, Gemini Robotics On-Device can be fine-tuned with just 50 to 100 demonstrations, making it practical for rapid deployment in real-world scenarios. It’s the first vision-language-action (VLA) model from DeepMind opened up for developer fine-tuning, supported by a new Gemini Robotics SDK and access to the MuJoCo physics simulator.
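The SDK’s fine-tuning API isn’t public in detail, so it isn’t shown here; as a taste of the simulation side of that workflow, though, here is a minimal, self-contained loop using only the open-source `mujoco` Python bindings (the toy box-on-a-plane scene below is our own, not a DeepMind asset):

```python
# Minimal MuJoCo simulation loop: a box dropped onto a plane.
# Requires the open-source bindings: pip install mujoco
import mujoco

# A tiny scene described inline in MJCF, MuJoCo's XML model format.
XML = """
<mujoco>
  <worldbody>
    <geom type="plane" size="1 1 0.1"/>
    <body name="box" pos="0 0 1">
      <freejoint/>
      <geom type="box" size="0.1 0.1 0.1" mass="1"/>
    </body>
  </worldbody>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Step the physics for one simulated second (default timestep is 2 ms).
while data.time < 1.0:
    mujoco.mj_step(model, data)

# qpos for a free joint is [x, y, z, qw, qx, qy, qz]; index 2 is height.
print(f"box height after {data.time:.2f}s: {data.qpos[2]:.3f} m")
```

The same `mj_step` loop is where a policy’s predicted actions would be written into `data.ctrl` before each step when evaluating a fine-tuned model in simulation.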
Why does this matter? The release addresses two fundamental robotics challenges: latency and connectivity. The model empowers robots to operate autonomously in latency-sensitive or privacy-critical settings such as disaster zones, healthcare, and remote manufacturing. It also outperforms DeepMind’s previous on-device models on multi-step instructions and out-of-distribution tasks, approaching the performance of its cloud-based counterpart.
What’s new and different is the cross-platform reach. Although trained on ALOHA robots, the model has been successfully adapted to other embodiments, from bi-arm Franka FR3 systems to Apptronik’s Apollo humanoid. That adaptability, combined with the newly released Gemini Robotics SDK, creates an ecosystem where developers can rapidly prototype and deploy robotics solutions.
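To make the 50-to-100-demonstration workflow concrete, here is an illustrative-only sketch of a behavior-cloning-style adaptation loop. Every name in it (`Demonstration`, `StubPolicy`, `finetune`) is hypothetical and merely stands in for whatever entry points the Gemini Robotics SDK actually exposes:

```python
# Hypothetical sketch only: none of these names come from the real SDK.
from dataclasses import dataclass, field

@dataclass
class Demonstration:
    instruction: str                                   # natural-language task
    observations: list = field(default_factory=list)   # camera frames per step
    actions: list = field(default_factory=list)        # bi-arm joint commands

class StubPolicy:
    """Stand-in for an on-device VLA policy; just counts update calls."""
    def __init__(self) -> None:
        self.updates = 0

    def update(self, demo: Demonstration) -> None:
        # A real fine-tune would take a gradient step mapping
        # (observations, instruction) -> actions; this stub only counts.
        self.updates += 1

def finetune(policy: StubPolicy, demos: list, epochs: int = 10) -> StubPolicy:
    # Behavior cloning on a small demo set: repeat the data a few times.
    for _ in range(epochs):
        for demo in demos:
            policy.update(demo)
    return policy

demos = [Demonstration(instruction="fold the towel") for _ in range(60)]
policy = finetune(StubPolicy(), demos)
print(policy.updates)  # 600: 60 demos x 10 epochs
```

The point of the sketch is the data budget: tens of demonstrations, not thousands, are what adapt the pretrained policy to a new task.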
The model’s generalization extends beyond its training distribution, handling previously unseen objects and complex multi-step instructions while maintaining safety through both semantic and physical safety checks.
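DeepMind hasn’t published the internals of those protocols, but a toy example conveys the shape of a semantic check: screen each instruction before any motion command is issued. The denylist below is purely illustrative:

```python
# Toy semantic safety gate; purely illustrative, not DeepMind's mechanism.
BLOCKED_TERMS = {"person", "face", "knife", "throw"}  # example denylist

def is_semantically_safe(instruction: str) -> bool:
    # Reject any instruction mentioning a blocked term; a real system would
    # use a learned classifier rather than keyword matching.
    words = set(instruction.lower().split())
    return words.isdisjoint(BLOCKED_TERMS)

assert is_semantically_safe("fold the towel")
assert not is_semantically_safe("throw the cup")
```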
In short, Gemini Robotics On-Device is Google’s bold step toward decentralizing AI, making robots smarter, faster, and truly independent.
🔑 Key Points
Released on June 24, 2025, by Google DeepMind.
Runs fully offline with low-latency, high-dexterity performance.
Requires only 50–100 demonstrations for task adaptation.
Adapted across embodiments, from bi-arm ALOHA and Franka FR3 robots to the Apollo humanoid.
Comes with a developer SDK and the MuJoCo physics simulator.
Outperforms previous on-device models in generalization and instruction-following.