AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration

The widespread adoption of big data has ushered in a new era of data-driven decision-making, transforming numerous industries and sectors. However, the efficacy of these decisions hinges on the quality of the underlying data. Poor data quality can result in inaccurate analyses and deceptive conclusi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:arXiv.org 2024-05
1. Verfasser: Elouataoui, Widad
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
container_end_page
container_issue
container_start_page
container_title arXiv.org
container_volume
creator Elouataoui, Widad
description The widespread adoption of big data has ushered in a new era of data-driven decision-making, transforming numerous industries and sectors. However, the efficacy of these decisions hinges on the quality of the underlying data. Poor data quality can result in inaccurate analyses and deceptive conclusions. Managing the vast volume, velocity, and variety of data sources presents significant challenges, heightening the importance of addressing big data quality issues. While there has been increased attention from both academia and industry, current approaches often lack comprehensiveness and universality. They tend to focus on limited metrics, neglecting other dimensions of data quality. Moreover, existing methods are often context-specific, limiting their applicability across different domains. There is a clear need for intelligent, automated approaches leveraging artificial intelligence (AI) for advanced data quality corrections. To bridge these gaps, this Ph.D. thesis proposes a novel set of interconnected frameworks aimed at enhancing big data quality comprehensively. Firstly, we introduce new quality metrics and a weighted scoring system for precise data quality assessment. Secondly, we present a generic framework for detecting various quality anomalies using AI models. Thirdly, we propose an innovative framework for correcting detected anomalies through predictive modeling. Additionally, we address metadata quality enhancement within big data ecosystems. These frameworks are rigorously tested on diverse datasets, demonstrating their efficacy in improving big data quality. Finally, the thesis concludes with insights and suggestions for future research directions.
format Article
fullrecord <record><control><sourceid>proquest</sourceid><recordid>TN_cdi_proquest_journals_3052222229</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3052222229</sourcerecordid><originalsourceid>FETCH-proquest_journals_30522222293</originalsourceid><addsrcrecordid>eNqNi0FrwkAUhBdBUKr_4YFXA-mu0epNTYIePBR6D4_4tKvJW327afHQ_14t6d25zDDfTEf1tTGv0dtE654aen-K41hPZzpJTF_9LLdRKvaLGHLBmr6dnD0cnEDGn8il5SOkGBDeG6xsuIFlWNm2y0rnbz5Q7ReQiTgpUgpUBut4DGsn8p-R97CjgPvHa8uBjoIPMlDdA1aehq2_qFGefaw30UXctSEfipNrhO-oMHGi_zQ3z61-AV9qTo0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3052222229</pqid></control><display><type>article</type><title>AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration</title><source>Free E- Journals</source><creator>Elouataoui, Widad</creator><creatorcontrib>Elouataoui, Widad</creatorcontrib><description>The widespread adoption of big data has ushered in a new era of data-driven decision-making, transforming numerous industries and sectors. However, the efficacy of these decisions hinges on the quality of the underlying data. Poor data quality can result in inaccurate analyses and deceptive conclusions. Managing the vast volume, velocity, and variety of data sources presents significant challenges, heightening the importance of addressing big data quality issues. While there has been increased attention from both academia and industry, current approaches often lack comprehensiveness and universality. They tend to focus on limited metrics, neglecting other dimensions of data quality. Moreover, existing methods are often context-specific, limiting their applicability across different domains. There is a clear need for intelligent, automated approaches leveraging artificial intelligence (AI) for advanced data quality corrections. To bridge these gaps, this Ph.D. thesis proposes a novel set of interconnected frameworks aimed at enhancing big data quality comprehensively. Firstly, we introduce new quality metrics and a weighted scoring system for precise data quality assessment. Secondly, we present a generic framework for detecting various quality anomalies using AI models. Thirdly, we propose an innovative framework for correcting detected anomalies through predictive modeling. Additionally, we address metadata quality enhancement within big data ecosystems. These frameworks are rigorously tested on diverse datasets, demonstrating their efficacy in improving big data quality. Finally, the thesis concludes with insights and suggestions for future research directions.</description><identifier>EISSN: 2331-8422</identifier><language>eng</language><publisher>Ithaca: Cornell University Library, arXiv.org</publisher><subject>Anomalies ; Artificial intelligence ; Big Data ; Effectiveness ; Error correction ; Error detection ; Metadata ; Prediction models ; Quality assessment</subject><ispartof>arXiv.org, 2024-05</ispartof><rights>2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>780,784</link.rule.ids></links><search><creatorcontrib>Elouataoui, Widad</creatorcontrib><title>AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration</title><title>arXiv.org</title><description>The widespread adoption of big data has ushered in a new era of data-driven decision-making, transforming numerous industries and sectors. However, the efficacy of these decisions hinges on the quality of the underlying data. Poor data quality can result in inaccurate analyses and deceptive conclusions. Managing the vast volume, velocity, and variety of data sources presents significant challenges, heightening the importance of addressing big data quality issues. While there has been increased attention from both academia and industry, current approaches often lack comprehensiveness and universality. They tend to focus on limited metrics, neglecting other dimensions of data quality. Moreover, existing methods are often context-specific, limiting their applicability across different domains. There is a clear need for intelligent, automated approaches leveraging artificial intelligence (AI) for advanced data quality corrections. To bridge these gaps, this Ph.D. thesis proposes a novel set of interconnected frameworks aimed at enhancing big data quality comprehensively. Firstly, we introduce new quality metrics and a weighted scoring system for precise data quality assessment. Secondly, we present a generic framework for detecting various quality anomalies using AI models. Thirdly, we propose an innovative framework for correcting detected anomalies through predictive modeling. Additionally, we address metadata quality enhancement within big data ecosystems. These frameworks are rigorously tested on diverse datasets, demonstrating their efficacy in improving big data quality. Finally, the thesis concludes with insights and suggestions for future research directions.</description><subject>Anomalies</subject><subject>Artificial intelligence</subject><subject>Big Data</subject><subject>Effectiveness</subject><subject>Error correction</subject><subject>Error detection</subject><subject>Metadata</subject><subject>Prediction models</subject><subject>Quality assessment</subject><issn>2331-8422</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><recordid>eNqNi0FrwkAUhBdBUKr_4YFXA-mu0epNTYIePBR6D4_4tKvJW327afHQ_14t6d25zDDfTEf1tTGv0dtE654aen-K41hPZzpJTF_9LLdRKvaLGHLBmr6dnD0cnEDGn8il5SOkGBDeG6xsuIFlWNm2y0rnbz5Q7ReQiTgpUgpUBut4DGsn8p-R97CjgPvHa8uBjoIPMlDdA1aehq2_qFGefaw30UXctSEfipNrhO-oMHGi_zQ3z61-AV9qTo0</recordid><startdate>20240506</startdate><enddate>20240506</enddate><creator>Elouataoui, Widad</creator><general>Cornell University Library, arXiv.org</general><scope>8FE</scope><scope>8FG</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>L6V</scope><scope>M7S</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope></search><sort><creationdate>20240506</creationdate><title>AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration</title><author>Elouataoui, Widad</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-proquest_journals_30522222293</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Anomalies</topic><topic>Artificial intelligence</topic><topic>Big Data</topic><topic>Effectiveness</topic><topic>Error correction</topic><topic>Error detection</topic><topic>Metadata</topic><topic>Prediction models</topic><topic>Quality assessment</topic><toplevel>online_resources</toplevel><creatorcontrib>Elouataoui, Widad</creatorcontrib><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>Materials Science &amp; Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Engineering Collection</collection><collection>Engineering Database</collection><collection>Publicly Available Content Database</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Elouataoui, Widad</au><format>book</format><genre>document</genre><ristype>GEN</ristype><atitle>AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration</atitle><jtitle>arXiv.org</jtitle><date>2024-05-06</date><risdate>2024</risdate><eissn>2331-8422</eissn><abstract>The widespread adoption of big data has ushered in a new era of data-driven decision-making, transforming numerous industries and sectors. However, the efficacy of these decisions hinges on the quality of the underlying data. Poor data quality can result in inaccurate analyses and deceptive conclusions. Managing the vast volume, velocity, and variety of data sources presents significant challenges, heightening the importance of addressing big data quality issues. While there has been increased attention from both academia and industry, current approaches often lack comprehensiveness and universality. They tend to focus on limited metrics, neglecting other dimensions of data quality. Moreover, existing methods are often context-specific, limiting their applicability across different domains. There is a clear need for intelligent, automated approaches leveraging artificial intelligence (AI) for advanced data quality corrections. To bridge these gaps, this Ph.D. thesis proposes a novel set of interconnected frameworks aimed at enhancing big data quality comprehensively. Firstly, we introduce new quality metrics and a weighted scoring system for precise data quality assessment. Secondly, we present a generic framework for detecting various quality anomalies using AI models. Thirdly, we propose an innovative framework for correcting detected anomalies through predictive modeling. Additionally, we address metadata quality enhancement within big data ecosystems. These frameworks are rigorously tested on diverse datasets, demonstrating their efficacy in improving big data quality. Finally, the thesis concludes with insights and suggestions for future research directions.</abstract><cop>Ithaca</cop><pub>Cornell University Library, arXiv.org</pub><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-05
issn 2331-8422
language eng
recordid cdi_proquest_journals_3052222229
source Free E- Journals
subjects Anomalies
Artificial intelligence
Big Data
Effectiveness
Error correction
Error detection
Metadata
Prediction models
Quality assessment
title AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error_Detection, Correction, and Metadata Integration
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T22%3A19%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=AI-Driven%20Frameworks%20for%20Enhancing%20Data%20Quality%20in%20Big%20Data%20Ecosystems:%20Error_Detection,%20Correction,%20and%20Metadata%20Integration&rft.jtitle=arXiv.org&rft.au=Elouataoui,%20Widad&rft.date=2024-05-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3052222229%3C/proquest%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3052222229&rft_id=info:pmid/&rfr_iscdi=true