Analyzing data quality issues in research information systems via data profiling

Bibliographic details
Published in: International journal of information management 2018-08, Vol. 41, p. 50-56
Main authors: Azeroual, Otmane; Saake, Gunter; Schallehn, Eike
Format: Article
Language: English
Description
Summary:
• The paper presents methods of data profiling in order to gain an overview of the quality of the data in the data sources before their integration into the research information system.
• With the help of data profiling, institutions can evaluate their research information and report on its quality, and also examine data errors and correct them within their research information system.
• The methods of data profiling can reduce project costs and minimize the time spent in institutions, for example on tracing unknown data stocks and identifying causes of quality problems.
• Data profiling is considered an important component in improving data quality in research information systems.

The success or failure of a research information system (RIS) in a scientific institution depends largely on the quality of the data available as a basis for the RIS applications. Even the most polished business intelligence (BI) tools (reporting, etc.) are worthless when they display incorrect, incomplete, or inconsistent data. An integral part of every RIS is thus the integration of data from the operative systems. Before starting the integration process (ETL) for a source system, a thorough analysis of the source data is required. With the support of a data quality check, causes of quality problems can usually be detected. The corresponding analyses are performed with data profiling to provide a good picture of the state of the data. In this paper, methods of data profiling are presented in order to gain an overview of the quality of the data in the source systems before their integration into the RIS. With the help of data profiling, scientific institutions can not only evaluate their research information and report on its quality, but also examine the dependencies and redundancies between data fields and better correct them within their RIS.
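
As a rough illustration of the kind of checks the summary mentions (completeness, dependencies between data fields, and redundancies), the following is a minimal Python sketch. It is not the authors' method: the function names, the field names (`pub_id`, `author`, `institute`, `year`), and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter

def profile_columns(rows):
    """Per-column completeness (share of non-empty values) and distinct count."""
    columns = sorted({key for row in rows for key in row})
    total = len(rows)
    profile = {}
    for col in columns:
        # Treat None and the empty string as missing values.
        present = [row.get(col) for row in rows if row.get(col) not in (None, "")]
        profile[col] = {
            "completeness": len(present) / total if total else 0.0,
            "distinct": len(set(present)),
        }
    return profile

def holds_functional_dependency(rows, lhs, rhs):
    """True if every value of field lhs maps to exactly one value of field rhs."""
    seen = {}
    for row in rows:
        key, value = row.get(lhs), row.get(rhs)
        if key in seen and seen[key] != value:
            return False
        seen.setdefault(key, value)
    return True

def duplicate_records(rows):
    """Records that occur more than once with identical field values."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(items) for items, n in counts.items() if n > 1]

# Hypothetical source records, as they might look before integration (ETL).
rows = [
    {"pub_id": "P1", "author": "A. Example", "institute": "Inst X", "year": "2017"},
    {"pub_id": "P2", "author": "A. Example", "institute": "Inst Y", "year": ""},
    {"pub_id": "P3", "author": "B. Sample", "institute": "Inst X", "year": "2018"},
    {"pub_id": "P3", "author": "B. Sample", "institute": "Inst X", "year": "2018"},
]

profile = profile_columns(rows)
author_determines_institute = holds_functional_dependency(rows, "author", "institute")
pub_id_determines_author = holds_functional_dependency(rows, "pub_id", "author")
duplicates = duplicate_records(rows)
```

In this sample, the `year` field is incomplete, `author` does not determine `institute` (a possible inconsistency worth investigating), and one record is an exact duplicate. A real source system would need type-aware and fuzzy checks on top of exact comparisons like these.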
ISSN: 0268-4012, 1873-4707
DOI: 10.1016/j.ijinfomgt.2018.02.007