Involution fused convolution for classifying eye-tracking patterns of children with Autism Spectrum Disorder
Saved in:
Published in: Engineering Applications of Artificial Intelligence, 2025-01, Vol. 139, p. 109475, Article 109475
Main authors: , , , , , , ,
Format: Article
Language: English
Subjects:
Online access: Full text
Abstract: Autism Spectrum Disorder (ASD) is a neurological condition that is challenging to diagnose. Numerous studies demonstrate that children diagnosed with autism struggle to maintain attention and show less focused gaze. Eye-tracking technology has drawn special attention in the context of ASD, since gaze anomalies have long been acknowledged as a defining feature of autism. Deep Learning (DL) approaches coupled with eye-tracking sensors offer additional capabilities for advancing diagnosis and its applications. DL architectures built on convolution have dominated this domain. However, convolutions alone may be insufficient to capture the important spatial information in eye-tracking patterns, as these patterns tend to be localized. Involution, a dynamic kernel-based operation, can improve classification efficiency. In this study, we apply both image-processing operations to examine how they learn eye-tracking patterns. Since these patterns are primarily spatial, we employ a hybrid of involution and convolution. Our study shows that adding a few involution layers reduces model size and computational cost while enhancing location-specific capability, maintaining performance comparable to purely convolutional models; excessive involution layers, however, weaken performance. For comparison, we experiment with two datasets and, in ablation studies, a combined version of both. Our proposed model, featuring three involution layers and three convolution layers, achieves 99.43% accuracy on the first dataset and 96.78% on the second, at a size of only 1.36 megabytes. These results demonstrate the effectiveness of combining involution and convolution layers, outperforming previous literature.
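The abstract contrasts convolution (static kernels shared across all locations) with involution (kernels generated dynamically from the feature at each location and shared across channels). Below is a minimal NumPy sketch of a single-group involution, not the authors' implementation: the `kernel_gen_w` projection stands in for the small kernel-generation network used in practice, and all shapes are illustrative.

```python
import numpy as np

def involution(x, kernel_gen_w, kernel_size=3):
    """Single-group involution over a feature map (illustrative sketch).

    x: (H, W, C) input feature map.
    kernel_gen_w: (C, kernel_size**2) weights of a per-pixel kernel-generation
        projection (a stand-in for the bottleneck network used in practice).

    Unlike convolution, the K x K spatial kernel is produced per location
    from the feature vector at that location and shared across channels,
    so the parameter count does not grow with the channel dimension.
    """
    H, W, C = x.shape
    K = kernel_size
    pad = K // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))  # zero padding
    out = np.zeros_like(x)
    for i in range(H):
        for j in range(W):
            # dynamic kernel for this location, generated from x[i, j]
            kern = (x[i, j] @ kernel_gen_w).reshape(K, K)   # (K, K)
            patch = xp[i:i + K, j:j + K, :]                 # (K, K, C)
            # the same spatial kernel weights every channel of the patch
            out[i, j] = np.einsum('kl,klc->c', kern, patch)
    return out
```

Because the kernel varies per pixel, the operation can weight spatial neighborhoods differently at different locations, which is the location-specific capability the abstract credits for the hybrid model's gains at a small parameter budget.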
ISSN: 0952-1976
DOI: 10.1016/j.engappai.2024.109475