CLUE-AI: A Convolutional Three-stream Anomaly Identification Framework for Robot Manipulation
Format: Article
Language: English
Abstract: Robot safety has been a prominent research topic in recent years since robots
are more involved in daily tasks. It is crucial to devise the required safety
mechanisms to enable service robots to be aware of and react to anomalies
(i.e., unexpected deviations from intended outcomes) that arise during the
execution of these tasks. Detecting and identifying these anomalies is an
essential step towards fulfilling these requirements. Although several
architectures have been proposed for anomaly detection, identification has not
yet been thoroughly investigated. This task is challenging since indicators may appear
long before anomalies are detected. In this paper, we propose a ConvoLUtional
threE-stream Anomaly Identification (CLUE-AI) framework to address this
problem. The framework fuses visual, auditory and proprioceptive data streams
to identify everyday object manipulation anomalies. A stream of 2D images
gathered through an RGB-D camera placed on the head of the robot is processed
within a self-attention enabled visual stage to capture visual anomaly
indicators. The auditory modality provided by the microphone placed on the
robot's lower torso is processed within a designed convolutional neural network
(CNN) in the auditory stage. Finally, the force applied by the gripper and the
gripper state are processed within a CNN to obtain proprioceptive features.
These outputs are then combined with a late fusion scheme. Our novel
three-stream framework design is analyzed on everyday object manipulation tasks
with a Baxter humanoid robot in a semi-structured setting. The results indicate
that the framework achieves an F-score of 94%, outperforming the other baselines
in classifying anomalies that arise during runtime.
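
The abstract describes the architecture only at a high level. For a concrete picture of how three modality-specific streams can be combined with late fusion, the following is a minimal PyTorch sketch; the backbones, layer sizes, feature dimensions, input shapes, and five-way class count are illustrative assumptions, not the authors' exact CLUE-AI implementation.

```python
# Minimal three-stream late-fusion sketch (assumed layer sizes, not the paper's exact design).
import torch
import torch.nn as nn


class VisualStream(nn.Module):
    """2D CNN features from RGB frames, followed by self-attention over spatial tokens."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, x):                             # x: (B, 3, H, W)
        fmap = self.backbone(x)                       # (B, C, H', W')
        tokens = fmap.flatten(2).transpose(1, 2)      # (B, H'*W', C)
        attended, _ = self.attn(tokens, tokens, tokens)  # self-attention over tokens
        return attended.mean(dim=1)                   # (B, C) pooled visual feature


class Conv1dStream(nn.Module):
    """1D CNN over a time series (audio features or gripper force/state)."""
    def __init__(self, in_channels, feat_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, 5, padding=2), nn.ReLU(),
            nn.Conv1d(32, feat_dim, 5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )

    def forward(self, x):                             # x: (B, in_channels, T)
        return self.net(x).squeeze(-1)                # (B, feat_dim)


class ThreeStreamAnomalyClassifier(nn.Module):
    """Late fusion: concatenate per-stream features, then classify the anomaly type."""
    def __init__(self, num_classes=5):
        super().__init__()
        self.visual = VisualStream(feat_dim=128)
        self.audio = Conv1dStream(in_channels=40, feat_dim=64)   # e.g. 40 mel bands (assumed)
        self.proprio = Conv1dStream(in_channels=2, feat_dim=32)  # gripper force + state (assumed)
        self.head = nn.Sequential(
            nn.Linear(128 + 64 + 32, 128), nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, rgb, audio, proprio):
        fused = torch.cat(
            [self.visual(rgb), self.audio(audio), self.proprio(proprio)], dim=1)
        return self.head(fused)                       # anomaly class logits


if __name__ == "__main__":
    model = ThreeStreamAnomalyClassifier(num_classes=5)
    rgb = torch.randn(2, 3, 128, 128)        # batch of RGB frames
    audio = torch.randn(2, 40, 100)          # spectrogram-like audio features over time
    proprio = torch.randn(2, 2, 100)         # gripper force and state over time
    print(model(rgb, audio, proprio).shape)  # torch.Size([2, 5])
```

The design choice illustrated here is late fusion: each modality is encoded independently and only the pooled feature vectors are concatenated before the classification head, so a noisy or missing stream degrades the fused representation gracefully rather than corrupting the other encoders.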
DOI: 10.48550/arxiv.2203.08746