An adaptive and real-time based architecture for financial data integration

Bibliographic details
Published in: Journal of Big Data 2019-11, Vol. 6 (1), p. 1-25, Article 97
Main authors: Fikri, Noussair; Rida, Mohamed; Abghour, Noureddine; Moussaid, Khalid; El Omri, Amina
Format: Article
Language: English
Description
Summary: In this paper we propose an adaptive, real-time approach to two problems in real-time financial data integration: latency and semantic heterogeneity. Because of constraints we faced in projects that require real-time integration and analysis of massive financial data, we decided to follow a new approach that combines a hybrid financial ontology, resilient distributed datasets and real-time discretized streams. We create a real-time data integration pipeline that avoids the problems of classic Extract-Transform-Load (ETL) tools, namely data processing latency, functional miscomprehensions and metadata heterogeneity. The approach is intended to improve reporting quality and availability within short time frames, which is why we use Apache Spark. We studied ETL concepts, data warehousing fundamentals, big data processing techniques and container-oriented clustering architecture in order to replace the classic data integration and analysis process with our new concept, resilient distributed DataStream for online analytical processing (RDD4OLAP) cubes, which are consumed using Spark SQL or Spark Core.
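
The two ideas the summary combines, ontology-based resolution of heterogeneous field names and processing a stream as a sequence of small batches, can be illustrated with a minimal plain-Python sketch. This is not the authors' RDD4OLAP implementation and it does not use Apache Spark. The field names, the `ONTOLOGY` mapping, and the helpers `normalize`, `micro_batches` and `aggregate` are all hypothetical, and batches are cut by record count here, whereas Spark's discretized streams cut them by time interval.

```python
from collections import defaultdict
from itertools import islice

# Hypothetical mapping from source-specific field names to shared ontology terms.
ONTOLOGY = {
    "ticker": "instrument", "symbol": "instrument",
    "amt": "amount", "value": "amount",
    "ccy": "currency", "curr": "currency",
}

def normalize(record):
    """Rename source fields to ontology terms; unknown fields are kept unchanged."""
    return {ONTOLOGY.get(key, key): value for key, value in record.items()}

def micro_batches(stream, batch_size):
    """Cut an iterable of records into consecutive batches of at most batch_size."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch

def aggregate(batch):
    """Sum amounts per (instrument, currency), a tiny stand-in for one cube slice."""
    totals = defaultdict(float)
    for record in map(normalize, batch):
        totals[(record["instrument"], record["currency"])] += record["amount"]
    return dict(totals)

# Two sources describing the same kind of event with different field names.
feed = [
    {"ticker": "ABC", "amt": 10.0, "ccy": "USD"},
    {"symbol": "ABC", "value": 12.0, "curr": "USD"},
    {"symbol": "XYZ", "value": 5.0, "curr": "EUR"},
]
slices = [aggregate(batch) for batch in micro_batches(feed, batch_size=2)]
```

Each element of `slices` is the aggregate for one batch, so a report can be refreshed as soon as a batch closes instead of waiting for a full ETL run.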
ISSN:2196-1115
DOI:10.1186/s40537-019-0260-x