Work like a doctor: Unifying scan localizer and dynamic generator for automated computed tomography report generation
Published in: | Expert systems with applications 2024-03, Vol.237, p.121442, Article 121442 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Computed Tomography Report Generation (CTRG) aims to generate a medical report from a series of radiological images. It extends conventional X-ray report generation, which produces a single medical description from a single X-ray snapshot. Beyond the difficulties of the traditional task, CTRG requires the model to pick out the lesion regions from sequential scans and to produce a fine-grained report that conforms to medical logic and common sense. Because few datasets are available, few methods have tried to tackle this task. In addition, although densely aggregating sequential features may be beneficial, it introduces extra noise. Moreover, radiology reports are long narratives composed of abnormal descriptions and template sentences, but most studies ignore this hierarchical nature and generate the entire report uniformly. This paper aims to bridge the gap from three distinct perspectives. First, we develop two large-scale clinical datasets termed CTRG-Brain-263K and CTRG-Chest-548K, which contain 263,670 brain CT scans and 548,696 chest CT scans, respectively, with authoritative diagnosis reports. Second, we design a self-attention-based Scan Localizer (SL) that captures the representation most reflective of the lesion area, and we introduce a reconstruction loss to minimize the distance between focused and original scans. Finally, we propose a Dynamic Generator (DG) that decouples the decoder into abnormal and template branches, with the produced proposals dynamically aggregated for the final generation. Experimental results confirm that the proposed SL-DG outperforms existing methods, by about +5.2% and +0.4% CIDEr points on CTRG-Brain-263K and CTRG-Chest-548K, respectively.
• We present two large-scale clinical datasets for automated CT report generation. • We propose a Scan Localizer to highlight principal scans. • We introduce a Feature Discrepancy loss to minimize the feature distance. • We design a Dynamic Generator to adaptively generate the target word. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2023.121442 |
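
The abstract names two mechanisms: attention-based weighting of scans and dynamic aggregation of two decoder branches. The sketch below is only a rough illustration of those two ideas and is not the authors' implementation. It assumes a single hypothetical query vector in place of full self-attention, and a scalar `gate` in place of whatever learned aggregation the paper uses; the function names `localize_scans` and `dynamic_mix` are invented for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def localize_scans(scan_features, query):
    """Attention-pool per-scan feature vectors into one focused vector.

    Assumption: scaled dot-product scoring against one query vector,
    a simplification of the self-attention the abstract mentions.
    """
    dim = len(query)
    scores = [
        sum(q * f for q, f in zip(query, feat)) / math.sqrt(dim)
        for feat in scan_features
    ]
    weights = softmax(scores)
    focused = [
        sum(w * feat[i] for w, feat in zip(weights, scan_features))
        for i in range(dim)
    ]
    return weights, focused


def dynamic_mix(abnormal_probs, template_probs, gate):
    """Blend two branch word distributions with a gate in [0, 1].

    Assumption: a convex combination stands in for the paper's
    dynamic aggregation, whose exact form the abstract does not give.
    """
    return [
        gate * a + (1.0 - gate) * t
        for a, t in zip(abnormal_probs, template_probs)
    ]


# Three toy scans with 2-dimensional features; the third matches the query best.
weights, focused = localize_scans(
    [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]], [1.0, 0.0]
)
# Toy 3-word vocabulary; gate = 0.25 leans toward the template branch.
mixed = dynamic_mix([0.7, 0.2, 0.1], [0.1, 0.1, 0.8], 0.25)
```

In this toy setup the scan whose features align most with the query receives the largest weight, and the mixed word distribution still sums to one because it is a convex combination of two distributions.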