Guest Editorial: Special issue on media convergence and intelligent technology in the metaverse

Bibliographic Details
Published in:	CAAI Transactions on Intelligence Technology 2023-06, Vol.8 (2), p.285-287
Main authors:	Ma, Siwei; Gong, Maoguo; Qi, Guojun; Tie, Yun; Lee, Ivan; Li, Bo; Jin, Cong
Format:	Article
Language:	English
Subjects:	Accuracy; Algorithms; Amplitudes; Artificial intelligence; Clustering; Communication; Datasets; Decoding; Deep learning; Emotions; Feature extraction; Internet; Mass media industry; Methods; Modules; Multimedia communications; Neural networks; Performance evaluation; Semantic segmentation; Semantics; Technology; Virtual reality
Online access:	Full text

Description
Summary:	The model uses the lightweight MobileNetV2 as the backbone network for hierarchical feature extraction and proposes an Attentive Pyramid Spatial Attention (APSA) module compared to the Attenuated Spatial Pyramid module, which can increase the receptive field and enhance the information. Finally, it adds a context fusion prediction branch that fuses high-semantic and low-semantic prediction results, and the model effectively improves segmentation accuracy on small data sets. The experimental results on the CamVid data set show that, compared with some existing semantic segmentation networks, the algorithm achieves a better segmentation effect and higher segmentation accuracy, with an mIoU of 75.85%. [...] to verify the generality of the model and the effectiveness of the APSA module, experiments were conducted on the VOC 2012 data set, and the APSA module improved mIoU by about 12.2%. The content diversity and emotional accuracy of the generated responses are improved by learning emotion and semantic features separately. [...] the average attention mechanism is adopted to better extract semantic features at the sequence level, and the semi-supervised attention mechanism is used in the decoding step to strengthen the model's fusion of emotional features. [...] with the minimum phase model, the predicted amplitude spectrum and ITDs were used to obtain a set of individual head-related impulse responses. Besides the separate training of the HRTF amplitude and ITD generation models, their joint training was also considered and evaluated.
ISSN:	2468-2322
DOI:	10.1049/cit2.12250
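
Note on the mIoU figures quoted in the summary: mean intersection-over-union is commonly computed per class and then averaged. The sketch below is only an illustration of that general definition, not the evaluation code used in the summarized work. The function name `mean_iou`, the choice to skip classes absent from both prediction and ground truth, and the toy label lists are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over flattened label sequences.

    Assumption: classes that appear in neither `pred` nor `target`
    are skipped rather than counted as 0 or 1.
    """
    ious = []
    for c in range(num_classes):
        # Pixels labeled c in both prediction and ground truth
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labeled c in at least one of the two
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy flattened label maps (hypothetical data, not taken from any data set)
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

Benchmark protocols differ in details such as ignored labels and whether counts are accumulated over the whole data set before dividing, so reported figures should be compared only under the same protocol.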