New Research Suggests Consistency, Not Complexity, Is the Key to Teaching Robots Dexterity
Robotic hands sit on display at the NYU Center for Robotics and Embodied Intelligence.
Teaching robots to manipulate objects with humanlike dexterity has long been one of robotics’ toughest challenges. Tasks such as rotating an object in-hand or coordinating two robot arms to maneuver a bulky item require constant changes in contact, grip and motion, skills that are difficult both to program and to demonstrate through human teleoperation.
Now researchers from NYU Tandon and the Robotics and AI Institute have shown that robots may be able to learn these behaviors from planning algorithms instead of human demonstrations. Their study, published in IEEE Robotics and Automation Letters (RA-L), suggests that the quality of synthetic training data matters more than researchers previously realized. The paper was recently awarded the IEEE RA-L Best Paper Award.
Modern robot-learning systems often rely on imitation learning, in which robots copy demonstrations collected from humans controlling robotic hardware remotely. But teleoperation systems are poorly suited for highly dexterous tasks involving many simultaneous contact points and finger movements. To bypass that limitation, the researchers used motion-planning algorithms to automatically generate demonstrations inside physics simulations. The idea was to let robots learn from virtual experience rather than from people. But the team discovered a problem: popular planning systems known as rapidly exploring random trees, or RRTs, produce demonstrations that are too inconsistent.
“These planners are very good at finding solutions,” says lead author Huaijiang Zhu. “But when every solution looks different, the learning system struggles to figure out what behavior it should imitate.”
The team found that the planners’ randomness created what researchers call “high-entropy” data: demonstrations that solved the same task through wildly different motions. Although this diversity helps planners explore possible solutions, it makes imitation learning less effective.
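A toy calculation can make the idea of "high-entropy" demonstrations concrete. The sketch below is only an illustration, not the measure used in the study: it assumes a single state in which each demonstration picks one of two discrete actions (rotate one way or the other), and it assumes a simple learner that fits one action per state by minimizing squared error. The demonstration lists and the function names `action_entropy` and `mse_optimal_action` are hypothetical.

```python
import math
from collections import Counter


def action_entropy(actions):
    """Shannon entropy, in bits, of a list of discrete actions."""
    counts = Counter(actions)
    total = len(actions)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def mse_optimal_action(actions):
    """The single action that minimizes mean squared error is the mean."""
    return sum(actions) / len(actions)


# Hypothetical demonstrations for ONE state: -1.0 and +1.0 stand for
# rotating the object in opposite directions.
random_demos = [-1.0, 1.0, -1.0, 1.0, 1.0, -1.0, 1.0, -1.0]  # planner "flips a coin"
consistent_demos = [1.0] * 8                                  # planner always agrees

for name, demos in [("random", random_demos), ("consistent", consistent_demos)]:
    print(name, action_entropy(demos), mse_optimal_action(demos))
```

Under these assumptions, the evenly split demonstrations carry one bit of entropy and the averaged action is zero, meaning the learner would not rotate the object at all. The consistent demonstrations carry zero entropy and the learner recovers the demonstrated action exactly.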
To address the issue, the researchers developed alternative planning approaches that generated more consistent demonstrations. One method emphasized steady progress toward a goal rather than random exploration, while another reused a library of predefined motions to reduce variability.
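The contrast between random exploration and steady progress toward a goal can be sketched on a deliberately tiny problem. The code below is a loose illustration under stated assumptions, not the planners from the study: the "task" is walking along a line of integers from 0 to a hypothetical `GOAL`, `random_planner` is a stand-in for a randomized planner, and `greedy_planner` always takes the step that reduces the distance to the goal.

```python
import random

GOAL = 5  # hypothetical target state on a 1-D line; every plan starts at 0


def random_planner(rng, max_steps=1000):
    """Toy stand-in for a randomized planner: biased toward the goal, but noisy."""
    pos, path = 0, [0]
    while pos != GOAL and len(path) <= max_steps:
        step = rng.choice([-1, 1, 1])  # two chances in three to move toward the goal
        pos = max(0, pos + step)       # the line has a wall at 0
        path.append(pos)
    return tuple(path)


def greedy_planner():
    """Toy stand-in for a progress-driven planner: always step toward the goal."""
    pos, path = 0, [0]
    while pos != GOAL:
        pos += 1
        path.append(pos)
    return tuple(path)


rng = random.Random(0)  # fixed seed so the run is repeatable
random_demos = [random_planner(rng) for _ in range(20)]
greedy_demos = [greedy_planner() for _ in range(20)]

# Count how many distinct trajectories each planner produced for the same task.
print("distinct random demos:", len(set(random_demos)))
print("distinct greedy demos:", len(set(greedy_demos)))
```

In this toy setting both planners solve the task, but the greedy one produces a single repeated trajectory while the randomized one produces many different ones, which is the kind of variability the consistent planners were designed to reduce.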
The researchers tested the approach on two difficult manipulation problems. In one task, two robot arms had to rotate a large cylinder by 180 degrees while repeatedly changing their grips. In another, a dexterous robotic hand manipulated a cube in its palm to match target orientations. They found that robots trained on the more consistent demonstrations achieved much higher success rates than those trained on standard RRT-generated data, even when using relatively small datasets. In the dual-arm task, the improved system reached near-perfect performance with only 100 demonstrations.
The team also transferred the learned policies directly from simulation to real-world hardware without additional retraining. The dual-arm robot succeeded in 90 percent of physical trials, while the dexterous hand completed about 62 percent of its attempts.
The study highlights a growing shift in robotics research. Rather than treating classical motion planning and machine learning as separate approaches, scientists are increasingly combining them. In this case, planning algorithms effectively served as teachers for neural-network-based robot policies.
The findings also reinforce a broader lesson emerging across AI: more data is not always better. Carefully structured, consistent examples may teach machines more effectively than large quantities of noisy or highly variable demonstrations.
Challenges remain, particularly for tasks involving deformable objects or soft robotic hands that are difficult to simulate accurately. But the work suggests a future in which robots learn increasingly sophisticated physical skills from virtual environments designed not just to produce solutions, but to produce solutions machines can understand.
H. Zhu et al., "Should We Learn Contact-Rich Manipulation Policies From Sampling-Based Planners?," in IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 6248-6255, June 2025, doi: 10.1109/LRA.2025.3564701.