Teaching robots complex manipulation skills by having them observe human demonstrations has shown promising results. However, providing extensive manipulation demonstrations is time-consuming and labor-intensive, which makes it challenging to scale this paradigm to real-world, long-horizon tasks. Not all parts of a task are created equal, though.
A new study by NVIDIA and the Georgia Institute of Technology explores ways to enhance Task and Motion Planning (TAMP) systems, which have proven particularly effective on problems with a wide range of possible future outcomes. By searching over combinations of a limited set of primitive skills, TAMP approaches can plan behavior for a variety of multi-step manipulation tasks. Each skill is traditionally hand-engineered, but some skills, such as closing a spring-loaded lid or inserting a rod into a hole, are extremely challenging to model efficiently. Instead, the team uses human teleoperation and closed-loop learning to supply only these hard-to-model skills while leaving the rest to automation. These skills rely on human teleoperation during data collection and on a policy learned from the collected data during deployment. Integrating TAMP systems with human teleoperation poses significant technical hurdles, and special attention must be paid to ensuring a smooth handoff between the two.
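As a rough illustration of the planning idea described above, the sketch below searches over sequences of primitive skills until a goal is reached. It is not the planner used in the study: the skill names, the facts, and the `plan` function are all hypothetical, and the sketch covers only the symbolic task-planning half of TAMP, with no motion planning at all.

```python
from collections import deque

# Hypothetical primitive skills (not from the study):
# name -> (preconditions, facts added, facts removed)
SKILLS = {
    "pick_rod": ({"hand_empty", "rod_on_table"}, {"holding_rod"},
                 {"hand_empty", "rod_on_table"}),
    "move_to_hole": ({"holding_rod"}, {"rod_at_hole"}, set()),
    "insert_rod": ({"holding_rod", "rod_at_hole"},
                   {"rod_inserted", "hand_empty"}, {"holding_rod"}),
}


def plan(start, goal, skills=SKILLS):
    """Breadth-first search for a shortest skill sequence that reaches the goal.

    States are sets of symbolic facts. Returns a list of skill names,
    or None if no sequence of the given skills achieves the goal.
    """
    start = frozenset(start)
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, (pre, add, rem) in skills.items():
            if pre <= state:  # skill is applicable here
                nxt = frozenset((state - rem) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None


if __name__ == "__main__":
    print(plan({"hand_empty", "rod_on_table"}, {"rod_inserted"}))
```

In a setup like the one the article describes, a hard-to-model step such as the insertion would be the part handed to a human or a learned policy, while the planner sequences the remaining skills.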
To overcome these obstacles, the researchers present Human-in-the-Loop Task and Motion Planning (HITL-TAMP), a system that integrates TAMP and teleoperation in a complementary fashion. The system's TAMP-gated control mechanism collects demonstrations by switching control between a TAMP system and a human teleoperator. Importantly, the TAMP system prompts human operators to participate only at specific points in a task plan, so a single operator can manage a fleet of robots by asynchronously engaging with one demonstration session at a time. This dramatically improves data collection throughput and reduces the effort required to collect large datasets on long-horizon, contact-rich tasks, because human demonstrations are requested only when they are needed. To train a TAMP-gated policy from the human data, the researchers integrate their data collection system with an imitation learning framework. They show that this outperforms collecting human demonstrations of the complete task in terms of the amount of data needed to teach a task, the time needed to teach it, and the success rate of the learned policies.
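The gating and queueing idea can be sketched in a few lines. This is a toy simulation under stated assumptions, not the researchers' implementation: `Segment`, `Session`, and `run_fleet` are hypothetical names, each plan is a fixed list of segments pre-labeled as "tamp" or "human", and a single operator serves waiting sessions in first-in, first-out order.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Segment:
    name: str
    controller: str  # "tamp" (automated) or "human" (teleoperated)


@dataclass
class Session:
    robot_id: int
    plan: list
    step: int = 0
    waiting: bool = False  # True while queued for the operator
    demos: list = field(default_factory=list)

    def done(self):
        return self.step >= len(self.plan)


def run_fleet(sessions):
    """Run all sessions; one operator handles one human segment at a time."""
    queue = deque()  # sessions waiting for the human operator
    log = []
    while not all(s.done() for s in sessions):
        # Automated phase: every session that is not waiting runs its TAMP
        # segments until it finishes or reaches a human-gated segment.
        for s in sessions:
            if s.waiting or s.done():
                continue
            while not s.done() and s.plan[s.step].controller == "tamp":
                log.append((s.robot_id, s.plan[s.step].name, "tamp"))
                s.step += 1
            if not s.done():
                s.waiting = True
                queue.append(s)
        # Human phase: the operator takes exactly one queued session.
        if queue:
            s = queue.popleft()
            seg = s.plan[s.step]
            log.append((s.robot_id, seg.name, "human"))
            s.demos.append(seg.name)  # only human segments are recorded
            s.step += 1
            s.waiting = False
    return log


def make_plan():
    # Hypothetical three-segment task with one human-gated step.
    return [Segment("grasp", "tamp"), Segment("insert", "human"),
            Segment("place", "tamp")]


if __name__ == "__main__":
    fleet = [Session(0, make_plan()), Session(1, make_plan())]
    for entry in run_fleet(fleet):
        print(entry)
```

The point of the queue is that robots keep executing automated segments while the operator is busy elsewhere, so the operator's time goes only to the segments that need a demonstration.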
The researchers compared HITL-TAMP with a standard teleoperation system in a study with 15 participants. With their method, users could collect more than three times as many demonstrations in the same amount of time. Just 10 minutes of non-expert teleoperation data was enough to train agents with over 75% success. Using 2.1K demonstrations collected across 12 contact-rich and long-horizon tasks, including real-world coffee brewing, HITL-TAMP frequently produces near-perfect agents.
Compared with collecting human demonstrations of the full task, HITL-TAMP's combination of TAMP and teleoperation greatly improves the efficiency of both data collection and policy learning.
Dhanshree Shenwai is a computer science engineer with experience at FinTech companies in the financial, cards and payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.