Vision-knowledge fusion model for multi-domain medical report generation

Bibliographic Details
Published in:	Information fusion 2023-09, Vol.97, p.101817, Article 101817
Main Authors:	Xu, Dexuan; Zhu, Huashi; Huang, Yu; Jin, Zhi; Ding, Weiping; Li, Hang; Ran, Menglong
Format:	Article
Language:	eng
Subjects:	Graph neural network; Knowledge graph; Medical report generation; Multi-modal fusion
Online Access:	Full text

Description
Summary:	Medical report generation with knowledge graphs is an essential task in the medical field. Although existing knowledge graphs contain many entities, their semantics are insufficient because it is difficult to uniformly extract and fuse expert knowledge about different diseases. It is therefore necessary to construct domain-specific knowledge graphs automatically. In this paper, we propose a vision-knowledge fusion model based on medical images and knowledge graphs that makes full use of high-quality data from different diseases and languages. First, we give a general method for automatically constructing a knowledge graph for each domain based on medical standards. Second, we design a knowledge-based attention mechanism to fuse image and knowledge effectively. Third, we build a triples restoration module to obtain fine-grained knowledge, and we propose, for the first time, knowledge-based evaluation metrics that are more reasonable and measurable along different dimensions. Finally, we conduct experiments to verify the effectiveness of our model on datasets covering two different diseases: the public IU-Xray chest radiograph dataset and NCRC-DS, a dataset of Chinese dermoscopy reports that we compiled. Our model outperforms previous benchmark methods and achieves excellent evaluation scores on both datasets. In addition, the interpretability and clinical usefulness of the model are validated, and our method can be generalized to multiple domains and different diseases.
• An automatic method to construct domain knowledge graphs based on medical standards.
• A new knowledge-based attention mechanism for image and graph.
• A vision-knowledge fusion model with triples restoration.
• More reasonable knowledge-based metrics for report generation.
• Strong performance on two datasets from different domains.
ISSN:	1566-2535, 1872-6305
DOI:	10.1016/j.inffus.2023.101817