Garp Independent AI & technology journalism
Tuesday, June 23, 2026

Robotics

Nvidia research shows robots that train themselves through AI coding agents


Researchers from Nvidia, Carnegie Mellon University, and UC Berkeley are using AI coding agents to teach robots dexterous grasping in the real world. A fleet of eight robots hits up to 99 percent success on tricky tasks.

Training robots on real hardware involves manual overhead that slows everything down. ENPIRE, a research project from Nvidia, Carnegie Mellon University, and UC Berkeley, aims to break through that bottleneck by handing the work to AI coding agents. The core idea is a feedback loop running on real hardware: reset the workspace, run a strategy, check the result, and improve the next attempt.

ENPIRE runs in two phases. In the first, the agent sets up a working environment with some human feedback. That includes safety boundaries, an automatic reset, and automated success checking. Instead of having a human evaluate every attempt, the agent writes its own reward function to tell success from failure. It only needs a few minutes of example video showing successful and failed attempts.

For pin insertion, for example, the agent developed a check combining visual alignment, gripper height, and estimated force. For closing a cable tie, it combined two camera angles to avoid false positives and pushed reaction time below 150 milliseconds. These tools get built once and reused without changes.

In the second phase, the agent works entirely on its own. It reads research papers, forms hypotheses, and edits the training code directly. It uses methods like behavior cloning, where the strategy mimics human demonstrations, or reinforcement learning, where the strategy improves through trial and error. The agent picks the method itself based on real-world success signals.
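
To make the loop more concrete, here is a minimal Python sketch of an automated success check and of choosing a method by measured success rate. It is an illustration under stated assumptions, not ENPIRE's code: every name (`Observation`, `pin_insertion_success`, `cable_tie_closed`, `pick_method`), every threshold, and the simulated robot are hypothetical and exist only so the example runs without hardware.

```python
import random
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical signals read after one attempt (not ENPIRE's schema)."""
    alignment_error_px: float  # visual offset between pin and hole
    gripper_height_mm: float   # gripper height above the fixture
    estimated_force_n: float   # estimated contact force


def pin_insertion_success(obs: Observation) -> bool:
    """Combine three signals; all thresholds are made-up placeholders."""
    aligned = obs.alignment_error_px <= 5.0
    seated = obs.gripper_height_mm <= 12.0
    in_contact = 0.5 <= obs.estimated_force_n <= 8.0
    return aligned and seated and in_contact


def cable_tie_closed(front_view_closed: bool, side_view_closed: bool) -> bool:
    """Require both camera views to agree, which cuts false positives."""
    return front_view_closed and side_view_closed


def simulated_attempt(true_rate: float, rng: random.Random) -> Observation:
    """Stand-in for real hardware: succeeds with probability true_rate."""
    if rng.random() < true_rate:
        return Observation(2.0, 10.0, 3.0)  # readings of a seated pin
    return Observation(20.0, 30.0, 0.0)     # readings of a miss


def evaluate(true_rate: float, attempts: int, rng: random.Random) -> float:
    """Run several attempts and return the fraction the check accepts."""
    successes = 0
    for _ in range(attempts):
        # On a robot this step would reset the workspace and run the strategy.
        obs = simulated_attempt(true_rate, rng)
        successes += pin_insertion_success(obs)
    return successes / attempts


def pick_method(
    candidates: dict[str, float], attempts: int, seed: int = 0
) -> tuple[str, dict[str, float]]:
    """Measure each candidate's success rate and return the best one."""
    rng = random.Random(seed)
    rates = {name: evaluate(p, attempts, rng) for name, p in candidates.items()}
    best = max(rates, key=rates.get)
    return best, rates
```

A call such as `pick_method({"behavior_cloning": 0.6, "reinforcement_learning": 0.8}, attempts=50)` returns the name with the highest measured rate along with all measured rates. The probabilities passed in are simulation inputs only. On a real robot the rate would come from the hardware attempts themselves.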