MS-TCRNet: Multi-Stage Temporal Convolutional Recurrent Networks for action segmentation using sensor-augmented kinematics

Action segmentation is a challenging task in high-level process analysis, typically performed on video or kinematic data obtained from various sensors. This work presents two contributions related to action segmentation on kinematic data. Firstly, we introduce two versions of Multi-Stage Temporal Co...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Pattern recognition 2024-12, Vol.156, p.110778, Article 110778
Hauptverfasser: Goldbraikh, Adam, Shubi, Omer, Rubin, Or, Pugh, Carla M., Laufer, Shlomi
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:Action segmentation is a challenging task in high-level process analysis, typically performed on video or kinematic data obtained from various sensors. This work presents two contributions related to action segmentation on kinematic data. Firstly, we introduce two versions of Multi-Stage Temporal Convolutional Recurrent Networks (MS-TCRNet), specifically designed for kinematic data. The architectures consist of a prediction generator with intra-stage regularization and Bidirectional LSTM or GRU-based refinement stages. Secondly, we propose two new data augmentation techniques, World Frame Rotation and Hand Inversion, which utilize the strong geometric structure of kinematic data to improve algorithm performance and robustness. We evaluate our models on three datasets of surgical suturing tasks: the Variable Tissue Simulation (VTS) Dataset and the newly introduced Bowel Repair Simulation (BRS) Dataset, both of which are open surgery simulation datasets collected by us, as well as the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a well-known benchmark in robotic surgery. Our methods achieved state-of-the-art performance. code: https://github.com/AdamGoldbraikh/MS-TCRNet. •Two new Multi-Stage Temporal Convolutional Recurrent Networks for kinematic data.•Added Intra-Stage Regularization and RNN-based refinement stages.•Proposed two new data augmentation methods specifically designed for kinematic data.•State-of-the-art performance on action segmentation benchmarks: JIGSAWS and VTS.•Established a strong baseline for the new Bowel Repair Simulation (BRS) dataset.
ISSN:0031-3203
DOI:10.1016/j.patcog.2024.110778