Incorporating multivariate semantic association graphs into multimodal networks for information extraction from documents
Saved in:
Published in: The Journal of Supercomputing, 2024, Vol. 80 (13), pp. 18705-18727
Main authors: , ,
Format: Article
Language: eng
Subjects:
Online access: Full text
Summary: Documents contain abundant information for managerial decision-making, but manual screening of document information is inaccurate because documents are heterogeneous. To address this, we propose MMIE, a multimodal network that incorporates multivariate semantic association graphs to extract information from documents accurately. First, multivariate semantic graphs over the multimodal data within each modality are constructed from the semantic associations of the text content; the semantic relationships in these graphs then guide the fusion and embedding of the extracted multimodal features, improving their representation capability. Next, the semantically linked multimodal information is fed into a newly constructed multimodal self-attention module to better establish inter-modal associations. Finally, a supervised contrastive learning loss function is employed to further reduce the information loss caused by sample imbalance. Experimental results on three real datasets show that the proposed model extracts feature information from different modalities more accurately, with F1 scores of 87.28%, 82.53%, and 81.17%, respectively.
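The paper's implementation is not reproduced here. As a rough illustration only, the sketch below assumes the supervised contrastive loss mentioned in the abstract follows the standard SupCon formulation (Khosla et al., 2020), in which each anchor is pulled toward same-class samples and pushed away from all others; the function name and NumPy implementation are this editor's, not the authors'.

```python
import numpy as np

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Standard supervised contrastive (SupCon) loss, illustrative only.

    features: (N, D) array of embeddings, one row per sample.
    labels:   (N,) integer class labels.
    """
    # L2-normalize so dot products are cosine similarities.
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T / temperature

    n = len(labels)
    logits_mask = 1.0 - np.eye(n)                       # exclude self-pairs
    # Positives: other samples sharing the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]).astype(float) * logits_mask

    # Numerically stable log-softmax over all non-self samples.
    sim = sim - sim.max(axis=1, keepdims=True)
    exp_sim = np.exp(sim) * logits_mask
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))

    # Mean log-probability of positives, for anchors with >= 1 positive.
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0
    mean_log_prob_pos = (pos_mask * log_prob).sum(axis=1)[valid] / pos_counts[valid]
    return -mean_log_prob_pos.mean()
```

The loss decreases when same-class embeddings cluster tightly and different-class embeddings separate, which is why such a term can mitigate the sample-imbalance effects the abstract mentions: minority-class samples still receive a direct pull toward their own class.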
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-024-06174-x