Capsule networks for computer vision applications: a comprehensive review

Bibliographic details
Published in: Applied intelligence (Dordrecht, Netherlands), 2023-10, Vol.53 (19), p.21799-21826
Main authors: Choudhary, Seema, Saurav, Sumeet, Saini, Ravi, Singh, Sanjay
Format: Article
Language: eng
Description
Summary: Convolutional neural networks (CNNs) have achieved human-level performance in various computer vision tasks, such as image classification, object detection, and segmentation. However, efficient CNN training requires a large amount of annotated data. Also, without explicit data augmentation, CNNs handle rotated and rescaled inputs poorly. Besides, these networks do not learn the important spatial correlations between simple and complex objects in images. Recently, researchers introduced the Capsule Network (CapsNet) to overcome these limitations of CNNs. CapsNet uses vector activation functions in which a vector’s length represents an entity’s existence and its orientation represents the entity’s properties. Recent advances in the routing algorithms of CapsNets have increased their usefulness in solving complex computer vision problems. One can gauge their importance from the numerous articles recently published in top-ranked conferences and journals. Researchers have also published a few review articles that discuss the structural and implementation details of CapsNets. This review focuses on the applications of CapsNet in computer vision. We first present a brief note on CNNs and their limitations, followed by the basic structural and implementation details of CapsNet, including routing algorithms. Subsequently, the study investigates CapsNet variants that have evolved in recent years and their applications in different computer vision tasks. Finally, the paper presents a short commentary on the advantages, disadvantages, and limitations of CapsNet and outlines future research directions in this area.
ISSN: 0924-669X, 1573-7497
DOI:10.1007/s10489-023-04620-6
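
The summary states that a capsule outputs a vector whose length signals an entity's existence and whose orientation encodes its properties. The sketch below illustrates one way this is commonly achieved, assuming the "squash" nonlinearity from the original CapsNet literature; the record itself does not name a specific function, and the names `squash`, `length`, and `eps` are illustrative choices, not taken from the article.

```python
import math


def length(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def squash(s, eps=1e-9):
    """Rescale s so its length falls in [0, 1) while its direction is kept.

    Assumed form: v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    eps (a hypothetical safeguard) avoids division by zero for a zero vector.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


if __name__ == "__main__":
    strong = squash([3.0, 4.0])   # long input: output length approaches 1
    weak = squash([0.1, 0.0])     # short input: output length stays near 0
    print(length(strong), length(weak))
```

Under this assumption, a long input vector maps to a length close to 1 (entity likely present) and a short one to a length close to 0, while the ratio between components, and hence the orientation, is unchanged.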