S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers

•The S3Mining framework for supporting novice data miners is proposed.
•Model-driven engineering and scientific workflow standards are used by the S3Mining framework.
•Know-how of expert data miners is used to recommend to novice data miners which algorithms to apply.
•Meta-data (meta-features) is used to better...

Bibliographic details
Published in: Computer standards and interfaces 2019-07, Vol.65, p.143-158
Main authors: Espinosa, Roberto; García-Saiz, Diego; Zorrilla, Marta; Zubcoff, José Jacobo; Mazón, Jose-Norberto
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 158
container_issue
container_start_page 143
container_title Computer standards and interfaces
container_volume 65
creator Espinosa, Roberto
García-Saiz, Diego
Zorrilla, Marta
Zubcoff, José Jacobo
Mazón, Jose-Norberto
description •The S3Mining framework for supporting novice data miners is proposed. •Model-driven engineering and scientific workflow standards are used by the S3Mining framework. •Know-how of expert data miners is used to recommend to novice data miners which algorithms to apply. •Meta-data (meta-features) is used to better understand the behavior of data mining algorithms. •The S3Mining framework is implemented and available online. •An experimental evaluation is conducted using data sources from the educational domain and from the UCI Machine Learning Repository. Data mining has proven very useful for extracting information from data in many different contexts. However, because data mining techniques are complex, selecting and using them requires the know-how of an expert in the field. In practice, applying data mining adequately is out of reach for novice users, who have expertise in their own area of work but lack the skills to employ these techniques. In this paper, we use both model-driven engineering and scientific workflow standards and tools to develop a framework named S3Mining, which supports novice users in selecting the data mining classification algorithm that best fits their data and goal. To this end, the selection process draws on the past experiences of expert data miners in applying classification techniques to their own datasets. The contributions of our S3Mining framework are as follows: (i) an approach to create a knowledge base that stores the past experiences of expert users, (ii) a process that provides expert users with utilities for constructing classifier recommenders based on the existing knowledge base, (iii) a system that allows novice data miners to use these recommenders to discover the classifiers that best fit their problem at hand, and (iv) a public implementation of the framework’s workflows.
Finally, an experimental evaluation has been conducted to show the feasibility of our framework.
doi_str_mv 10.1016/j.csi.2019.03.004
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2240138127</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0920548918303325</els_id><sourcerecordid>2240138127</sourcerecordid><originalsourceid>FETCH-LOGICAL-c411t-bf53f134f799cac18bfd1eb340b101140ee6c48a950802d95e5ffde281a8983b3</originalsourceid><addsrcrecordid>eNp9UMtO5DAQtBBIDI8P4GaJc0K3nUwcOCHEYyVWHICz5Tht8CjjBDszEn-Ph9nznlrqququKsYuEEoEXF6tSpt8KQDbEmQJUB2wBapGFA2gOmQLaAUUdaXaY3aS0goAxFI2C_b1Kv_64MPHNb_l67Gnoeij31LgFD58IIoZ42aa4mjsJ3dj5GkzTWOcd_swbr0l3pvZ8HVmx8R94IkGsr942vjZdANxO5iUvPOZccaOnBkSnf-bp-z94f7t7ql4fnn8c3f7XNgKcS46V0uHsnJN21pjUXWuR-pkBV0OjBUQLW2lTFuDAtG3NdXO9SQUGtUq2clTdrm_m61_bSjNejVuYsgvtRAVoFQomszCPcvGMaVITk_Rr0381gh616xe6dys3jWrQercbNbc7DWU7W9zJp2sp2Cp9zEH1_3o_6P-ASC0gpk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2240138127</pqid></control><display><type>article</type><title>S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers</title><source>Access via ScienceDirect (Elsevier)</source><creator>Espinosa, Roberto ; García-Saiz, Diego ; Zorrilla, Marta ; Zubcoff, José Jacobo ; Mazón, Jose-Norberto</creator><creatorcontrib>Espinosa, Roberto ; García-Saiz, Diego ; Zorrilla, Marta ; Zubcoff, José Jacobo ; Mazón, Jose-Norberto</creatorcontrib><description>•S3Mining framework for supporting novice data miners is proposed.•Model-driven engineering and scientific workflow standards are used by S3Mining framework.•Know-how of expert data miners is used to recommend novice data miners which algorithms to apply.•Meta-data (meta-features) is used to better understand the behavior of data mining algorithms.•S3Mining framework is implemented and available online.•An experimental evaluation is conducted using data sources from the educational domain and also from UCI Machine Learning Repository. 
Data mining has proven to be very useful in order to extract information from data in many different contexts. However, due to the complexity of data mining techniques, it is required the know-how of an expert in this field to select and use them. Actually, adequately applying data mining is out of the reach of novice users which have expertise in their area of work, but lack skills to employ these techniques. In this paper, we use both model-driven engineering and scientific workflow standards and tools in order to develop named S3Mining framework, which supports novice users in the process of selecting the data mining classification algorithm that better fits with their data and goal. To this aim, this selection process uses the past experiences of expert data miners with the application of classification techniques over their own datasets. The contributions of our S3Mining framework are as follows: (i) an approach to create a knowledge base which stores the past experiences of experts users, (ii) a process that provides the expert users with utilities for the construction of classifiers’ recommenders based on the existing knowledge base, (iii) a system that allows novice data miners to use these recommenders for discovering the classifiers that better fit for solving their problem at hand, and (iv) a public implementation of the framework’s workflows. 
Finally, an experimental evaluation has been conducted to shown the feasibility of our framework.</description><identifier>ISSN: 0920-5489</identifier><identifier>EISSN: 1872-7018</identifier><identifier>DOI: 10.1016/j.csi.2019.03.004</identifier><language>eng</language><publisher>Amsterdam: Elsevier B.V</publisher><subject>Algorithms ; Classification ; Classifiers ; Data mining ; Feasibility studies ; Knowledge base ; Knowledge management ; Meta-learning ; Miners ; Model-driven ; Model-driven engineering ; Novice data miners ; Utilities ; Workflow</subject><ispartof>Computer standards and interfaces, 2019-07, Vol.65, p.143-158</ispartof><rights>2019 Elsevier B.V.</rights><rights>Copyright Elsevier BV Jul 2019</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c411t-bf53f134f799cac18bfd1eb340b101140ee6c48a950802d95e5ffde281a8983b3</citedby><cites>FETCH-LOGICAL-c411t-bf53f134f799cac18bfd1eb340b101140ee6c48a950802d95e5ffde281a8983b3</cites><orcidid>0000-0001-7875-4951 ; 0000-0002-0475-8834</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.csi.2019.03.004$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids></links><search><creatorcontrib>Espinosa, Roberto</creatorcontrib><creatorcontrib>García-Saiz, Diego</creatorcontrib><creatorcontrib>Zorrilla, Marta</creatorcontrib><creatorcontrib>Zubcoff, José Jacobo</creatorcontrib><creatorcontrib>Mazón, Jose-Norberto</creatorcontrib><title>S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers</title><title>Computer standards and interfaces</title><description>•S3Mining framework for supporting novice data miners is proposed.•Model-driven engineering and scientific 
workflow standards are used by S3Mining framework.•Know-how of expert data miners is used to recommend novice data miners which algorithms to apply.•Meta-data (meta-features) is used to better understand the behavior of data mining algorithms.•S3Mining framework is implemented and available online.•An experimental evaluation is conducted using data sources from the educational domain and also from UCI Machine Learning Repository. Data mining has proven to be very useful in order to extract information from data in many different contexts. However, due to the complexity of data mining techniques, it is required the know-how of an expert in this field to select and use them. Actually, adequately applying data mining is out of the reach of novice users which have expertise in their area of work, but lack skills to employ these techniques. In this paper, we use both model-driven engineering and scientific workflow standards and tools in order to develop named S3Mining framework, which supports novice users in the process of selecting the data mining classification algorithm that better fits with their data and goal. To this aim, this selection process uses the past experiences of expert data miners with the application of classification techniques over their own datasets. The contributions of our S3Mining framework are as follows: (i) an approach to create a knowledge base which stores the past experiences of experts users, (ii) a process that provides the expert users with utilities for the construction of classifiers’ recommenders based on the existing knowledge base, (iii) a system that allows novice data miners to use these recommenders for discovering the classifiers that better fit for solving their problem at hand, and (iv) a public implementation of the framework’s workflows. 
Finally, an experimental evaluation has been conducted to shown the feasibility of our framework.</description><subject>Algorithms</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Data mining</subject><subject>Feasibility studies</subject><subject>Knowledge base</subject><subject>Knowledge management</subject><subject>Meta-learning</subject><subject>Miners</subject><subject>Model-driven</subject><subject>Model-driven engineering</subject><subject>Novice data miners</subject><subject>Utilities</subject><subject>Workflow</subject><issn>0920-5489</issn><issn>1872-7018</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2019</creationdate><recordtype>article</recordtype><recordid>eNp9UMtO5DAQtBBIDI8P4GaJc0K3nUwcOCHEYyVWHICz5Tht8CjjBDszEn-Ph9nznlrqququKsYuEEoEXF6tSpt8KQDbEmQJUB2wBapGFA2gOmQLaAUUdaXaY3aS0goAxFI2C_b1Kv_64MPHNb_l67Gnoeij31LgFD58IIoZ42aa4mjsJ3dj5GkzTWOcd_swbr0l3pvZ8HVmx8R94IkGsr942vjZdANxO5iUvPOZccaOnBkSnf-bp-z94f7t7ql4fnn8c3f7XNgKcS46V0uHsnJN21pjUXWuR-pkBV0OjBUQLW2lTFuDAtG3NdXO9SQUGtUq2clTdrm_m61_bSjNejVuYsgvtRAVoFQomszCPcvGMaVITk_Rr0381gh616xe6dys3jWrQercbNbc7DWU7W9zJp2sp2Cp9zEH1_3o_6P-ASC0gpk</recordid><startdate>20190701</startdate><enddate>20190701</enddate><creator>Espinosa, Roberto</creator><creator>García-Saiz, Diego</creator><creator>Zorrilla, Marta</creator><creator>Zubcoff, José Jacobo</creator><creator>Mazón, Jose-Norberto</creator><general>Elsevier B.V</general><general>Elsevier BV</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0001-7875-4951</orcidid><orcidid>https://orcid.org/0000-0002-0475-8834</orcidid></search><sort><creationdate>20190701</creationdate><title>S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers</title><author>Espinosa, Roberto ; García-Saiz, Diego ; Zorrilla, Marta ; 
Zubcoff, José Jacobo ; Mazón, Jose-Norberto</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c411t-bf53f134f799cac18bfd1eb340b101140ee6c48a950802d95e5ffde281a8983b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2019</creationdate><topic>Algorithms</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Data mining</topic><topic>Feasibility studies</topic><topic>Knowledge base</topic><topic>Knowledge management</topic><topic>Meta-learning</topic><topic>Miners</topic><topic>Model-driven</topic><topic>Model-driven engineering</topic><topic>Novice data miners</topic><topic>Utilities</topic><topic>Workflow</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Espinosa, Roberto</creatorcontrib><creatorcontrib>García-Saiz, Diego</creatorcontrib><creatorcontrib>Zorrilla, Marta</creatorcontrib><creatorcontrib>Zubcoff, José Jacobo</creatorcontrib><creatorcontrib>Mazón, Jose-Norberto</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Computer standards and interfaces</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Espinosa, Roberto</au><au>García-Saiz, Diego</au><au>Zorrilla, Marta</au><au>Zubcoff, José Jacobo</au><au>Mazón, Jose-Norberto</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable 
classifiers</atitle><jtitle>Computer standards and interfaces</jtitle><date>2019-07-01</date><risdate>2019</risdate><volume>65</volume><spage>143</spage><epage>158</epage><pages>143-158</pages><issn>0920-5489</issn><eissn>1872-7018</eissn><abstract>•S3Mining framework for supporting novice data miners is proposed.•Model-driven engineering and scientific workflow standards are used by S3Mining framework.•Know-how of expert data miners is used to recommend novice data miners which algorithms to apply.•Meta-data (meta-features) is used to better understand the behavior of data mining algorithms.•S3Mining framework is implemented and available online.•An experimental evaluation is conducted using data sources from the educational domain and also from UCI Machine Learning Repository. Data mining has proven to be very useful in order to extract information from data in many different contexts. However, due to the complexity of data mining techniques, it is required the know-how of an expert in this field to select and use them. Actually, adequately applying data mining is out of the reach of novice users which have expertise in their area of work, but lack skills to employ these techniques. In this paper, we use both model-driven engineering and scientific workflow standards and tools in order to develop named S3Mining framework, which supports novice users in the process of selecting the data mining classification algorithm that better fits with their data and goal. To this aim, this selection process uses the past experiences of expert data miners with the application of classification techniques over their own datasets. 
The contributions of our S3Mining framework are as follows: (i) an approach to create a knowledge base which stores the past experiences of experts users, (ii) a process that provides the expert users with utilities for the construction of classifiers’ recommenders based on the existing knowledge base, (iii) a system that allows novice data miners to use these recommenders for discovering the classifiers that better fit for solving their problem at hand, and (iv) a public implementation of the framework’s workflows. Finally, an experimental evaluation has been conducted to shown the feasibility of our framework.</abstract><cop>Amsterdam</cop><pub>Elsevier B.V</pub><doi>10.1016/j.csi.2019.03.004</doi><tpages>16</tpages><orcidid>https://orcid.org/0000-0001-7875-4951</orcidid><orcidid>https://orcid.org/0000-0002-0475-8834</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0920-5489
ispartof Computer standards and interfaces, 2019-07, Vol.65, p.143-158
issn 0920-5489
1872-7018
language eng
recordid cdi_proquest_journals_2240138127
source Access via ScienceDirect (Elsevier)
subjects Algorithms
Classification
Classifiers
Data mining
Feasibility studies
Knowledge base
Knowledge management
Meta-learning
Miners
Model-driven
Model-driven engineering
Novice data miners
Utilities
Workflow
title S3Mining: A model-driven engineering approach for supporting novice data miners in selecting suitable classifiers
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T06%3A38%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=S3Mining:%20A%20model-driven%20engineering%20approach%20for%20supporting%20novice%20data%20miners%20in%20selecting%20suitable%20classifiers&rft.jtitle=Computer%20standards%20and%20interfaces&rft.au=Espinosa,%20Roberto&rft.date=2019-07-01&rft.volume=65&rft.spage=143&rft.epage=158&rft.pages=143-158&rft.issn=0920-5489&rft.eissn=1872-7018&rft_id=info:doi/10.1016/j.csi.2019.03.004&rft_dat=%3Cproquest_cross%3E2240138127%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2240138127&rft_id=info:pmid/&rft_els_id=S0920548918303325&rfr_iscdi=true