EasyTeaching

Few Shot Imitation Learning: A Keyframe-Driven Framework for Robotic Manipulation Learning

Empowering Limited Human Demonstrations to Enable Versatile Robot Training

Overview

The EasyTeaching framework is designed to enable robots to learn complex manipulation tasks from a limited number of human-operated demonstrations. Developed in 2021, the approach aims to simplify robot teaching for non-experts while overcoming common challenges such as noisy data, inefficient exploration, and the scarcity of demonstration episodes.

An overview of the proposed method is shown in the following figure.

Motivation

The core motivations behind EasyTeaching include:

  • Making robot teaching accessible to non-experts.
  • Coping with noisy, biased human demonstrations.
  • Avoiding inefficient random exploration in high-dimensional spaces.
  • Learning effectively from only a small number of demonstration episodes.

Defining Trajectory Tasks

Trajectory tasks are those in which a robot must follow a specific path defined by a series of states or key points. For example, the pick-and-place task is a specialized form of trajectory task. Although human teleoperation provides a natural way to generate these paths, several challenges arise in practice. The following figures illustrate a few trajectory tasks.

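As a minimal, purely illustrative example of this view (the state names and fields below are hypothetical, not EasyTeaching's actual data format), pick-and-place can be written as an ordered list of key states the robot must pass through:

```python
# A trajectory task viewed as an ordered sequence of key states (keyframes).
# State names and fields are illustrative only, not EasyTeaching's actual schema.
from dataclasses import dataclass
from typing import List

@dataclass
class Keyframe:
    name: str                 # human-readable label for the key state
    ee_pose: List[float]      # end-effector pose [x, y, z, roll, pitch, yaw]
    gripper_closed: bool      # gripper state at this keyframe

# Pick-and-place expressed as a special case of a trajectory task:
pick_and_place: List[Keyframe] = [
    Keyframe("approach_object", [0.40, 0.10, 0.25, 0, 3.14, 0], gripper_closed=False),
    Keyframe("grasp_object",    [0.40, 0.10, 0.12, 0, 3.14, 0], gripper_closed=True),
    Keyframe("lift_object",     [0.40, 0.10, 0.30, 0, 3.14, 0], gripper_closed=True),
    Keyframe("place_object",    [0.10, 0.45, 0.12, 0, 3.14, 0], gripper_closed=False),
]
```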

Challenges and Proposed Solutions

Challenges

  1. Noisy Demonstration Data:
    • Human Factors: Operators have personal biases, leading to non-uniform movements.
    • Inherent Noise: The demonstrated trajectories often include extraneous or suboptimal actions.
  2. Inefficient Random Exploration:
    • High Dimensionality: Many constraints make brute-force exploration computationally expensive.
    • Low Success Rate: The probability of stumbling upon a feasible trajectory randomly is extremely low.
  3. Limited Demonstration Episodes:
    • Cost of Collection: Amassing a large dataset of demonstrations is both time-consuming and expensive.

Solutions

To tackle these challenges, EasyTeaching introduces a multi-faceted approach: keyframe extraction to filter noise and redundancy from demonstrations, a dual-policy reinforcement learning framework to guide exploration, and a latent-space representation that keeps learning tractable with few demonstration episodes. Each component is detailed below.

Methodology

1. Keyframe Extraction and Evaluation

The process begins by modeling the task as a shortest-path problem from the initial state to the goal state, solved with dynamic programming-based reinforcement learning. Three types of data points are considered:

Note: The misalignment between operator inputs and robot trajectories highlights the need for refining human demonstrations to better suit robotic control.
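To make the shortest-path view concrete, the sketch below runs a plain dynamic-programming (value-iteration) pass over a graph of candidate keyframes. The cost matrix and the greedy recovery loop are illustrative assumptions; they are not the exact DP-based reinforcement learning formulation used in EasyTeaching.

```python
import numpy as np

def shortest_path_values(cost: np.ndarray, goal: int, n_iters: int = 100) -> np.ndarray:
    """Value iteration for a deterministic shortest-path problem over keyframes.

    cost[i, j] is the transition cost between candidate keyframes i and j
    (np.inf where no transition is allowed). Returns V[i], the minimal
    cost-to-go from keyframe i to the goal keyframe.
    """
    n = cost.shape[0]
    V = np.full(n, np.inf)
    V[goal] = 0.0
    for _ in range(n_iters):
        # Bellman backup: V[i] = min_j (cost[i, j] + V[j])
        V_new = np.minimum(V, np.min(cost + V[None, :], axis=1))
        if np.allclose(V_new, V):
            break
        V = V_new
    return V

def greedy_keyframe_path(cost: np.ndarray, start: int, goal: int) -> list:
    """Recover a keyframe sequence by greedily following the value function.

    Assumes the goal is reachable from the start; intended as a sketch only.
    """
    V = shortest_path_values(cost, goal)
    path, current = [start], start
    for _ in range(cost.shape[0]):   # bound the loop by the number of keyframes
        if current == goal:
            break
        current = int(np.argmin(cost[current] + V))
        path.append(current)
    return path
```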

2. Reinforcement Learning Framework

The dual-policy framework comprises two complementary policies: one that reasons over subgoals (keyframes) and one that generates the actions needed to reach them.

Both policies benefit from a latent space module that fuses state, subgoal, and goal representations, thereby simplifying the decision-making process.
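A schematic sketch of how the two policies could share a fused latent input is shown below (PyTorch-style; the layer sizes, fusion scheme, and action dimensionality are placeholders, and the split into a subgoal head and an action head reflects our reading of the dual-policy design rather than the paper's exact architecture):

```python
import torch
import torch.nn as nn

class DualPolicy(nn.Module):
    """Schematic two-policy module operating on a fused latent representation.

    The exact EasyTeaching architecture is not reproduced here; layer sizes,
    the fusion scheme, and the action dimensionality are placeholders.
    """
    def __init__(self, latent_dim: int = 32, action_dim: int = 6):
        super().__init__()
        # Fuse state, subgoal, and goal latents into a single vector.
        self.fuse = nn.Sequential(nn.Linear(3 * latent_dim, 128), nn.ReLU())
        # High-level head: proposes the next subgoal in latent space.
        self.subgoal_head = nn.Linear(128, latent_dim)
        # Low-level head: outputs the action that moves toward the subgoal.
        self.action_head = nn.Linear(128, action_dim)

    def forward(self, z_state, z_subgoal, z_goal):
        h = self.fuse(torch.cat([z_state, z_subgoal, z_goal], dim=-1))
        return self.subgoal_head(h), self.action_head(h)
```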

3. Latent Space Generation

The latent space module is trained with a VAE on a dataset that includes both human demonstration and robot exploration data. This transformation compresses high-dimensional observations of the workspace into a compact representation, improving learning performance and reducing computational load.
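A minimal VAE sketch for learning such a latent space is given below (PyTorch-style; the dimensions are placeholders, and the actual encoder for visual observations is likely convolutional rather than fully connected):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ObservationVAE(nn.Module):
    """Minimal VAE mapping flattened observations to a compact latent code.

    Sizes are placeholders; EasyTeaching's actual encoder may differ.
    """
    def __init__(self, obs_dim: int = 1024, latent_dim: int = 32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, obs_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    recon_loss = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl
```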

Teleoperation System: The Foundation of Data Collection

Purpose and Importance

The teleoperation system is a critical component of the EasyTeaching framework, serving as the primary means of collecting demonstration data. Its design ensures that even non-experts can effectively guide robots through tasks, making the data collection process more accessible and efficient.

System Features

User-Friendly Interface: Designed for intuitive control, allowing operators to guide robots through tasks with minimal training.

Real-Time Feedback: Provides immediate visual and haptic feedback to operators, enhancing control precision.

Data Logging: Captures comprehensive data, including robot trajectories, sensor readings, and operator inputs, essential for training robust models.
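As an illustration of the data-logging side, a ROS-based system can record these streams to a bag file. The topic names below are hypothetical placeholders rather than the actual topics of our setup:

```python
import rosbag
import rospy
from geometry_msgs.msg import PoseStamped
from sensor_msgs.msg import JointState, Image

class DemoLogger:
    """Records operator inputs, robot state, and camera frames to a rosbag.

    Topic names are hypothetical; substitute the actual topics published by
    the VR controller driver, robot driver, and depth camera.
    """
    def __init__(self, path: str = "demo.bag"):
        self.bag = rosbag.Bag(path, "w")
        rospy.Subscriber("/vive/controller_pose", PoseStamped, self.log, "/vive/controller_pose")
        rospy.Subscriber("/robot/joint_states", JointState, self.log, "/robot/joint_states")
        rospy.Subscriber("/camera/color/image_raw", Image, self.log, "/camera/color/image_raw")

    def log(self, msg, topic):
        self.bag.write(topic, msg)

    def close(self):
        self.bag.close()
```

In practice the same effect is often achieved with the command-line tool rosbag record; the snippet simply shows which streams matter for training.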

System Build

The detailed design of our teleoperation system, built on the ROS framework, is shown in the following figure. The system consists of a VR system (HTC VIVE), a monitoring sensor (Intel RealSense D450i), a computing unit equipped with an RTX Titan GPU, and an AUBO i3 robot. The physical setup is shown in the picture below.

The control operation pipeline is shown in the following images.
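For reference, a skeletal version of such a ROS teleoperation node might look like the following; the topic names and the one-to-one pose mapping are assumptions for illustration, not the exact interfaces of the system described above:

```python
import rospy
from geometry_msgs.msg import PoseStamped

class VRTeleopNode:
    """Maps VR controller poses to end-effector pose commands for the robot.

    Topic names and the identity pose mapping are placeholders; a real system
    would add workspace scaling, safety limits, and gripper/trigger handling.
    """
    def __init__(self):
        self.cmd_pub = rospy.Publisher("/robot/ee_pose_command", PoseStamped, queue_size=1)
        rospy.Subscriber("/vive/controller_pose", PoseStamped, self.on_controller_pose)

    def on_controller_pose(self, msg: PoseStamped):
        cmd = PoseStamped()
        cmd.header.stamp = rospy.Time.now()
        cmd.header.frame_id = "base_link"
        cmd.pose = msg.pose  # 1:1 mapping for illustration only
        self.cmd_pub.publish(cmd)

if __name__ == "__main__":
    rospy.init_node("vr_teleop")
    VRTeleopNode()
    rospy.spin()
```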

Experimental Evaluation

Application to Excavation Tasks

The framework was validated through a series of experiments on an excavation task. Two key phases were tested:

  1. Human-Operated Demonstrations: Operators guided the robot to perform the task, providing the initial demonstration data.

  2. Autonomous Robot Operation: The trained policies were then deployed for autonomous operation.

Results:

Human-operated task demonstration

Trained operation by the robot

The success rate of our method is shown below:

We also conducted an ablation study on the hyperparameters (in particular, the dimensionality of the encoded latent space):

Future Directions

Further enhancements to EasyTeaching include:

Published Work

The work has been submitted to the Journal of Computing in Civil Engineering under the title:

Teleoperation-Driven and Keyframe-Based Generalizable Imitation Learning for Construction Robots

paper

Teleoperation-Driven and Keyframe-Based Generalizable Imitation Learning for Construction Robots

Abstract

The construction industry faces challenges with low productivity and high injury rates. Robots can improve these issues by automating processes. However, teaching robots to perform complex tasks is difficult. We present a framework that uses human teleoperation data to train robots for repetitive construction tasks. First, we developed a teleoperation method and interface to control robots on construction sites. Second, we propose a method to extract keyframes from human operation data, reducing noise and redundancy in the training data. Third, we model the robot’s visual observations of the working space to improve learning performance and reduce computational load. We validated our framework by teaching a robot to generate trajectories for excavation tasks using human operators’ teleoperations. Results show that our method outperforms existing approaches, demonstrating its potential for application.