Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms
To explore the potential value of the explosively growing data in thermal power units, this paper proposes a big data mining method based on a Hadoop-based Spark cluster-computing framework and algorithms. First, positive and negative balance methods are used to accurately obtain actual net coal con...
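Published in: | The Journal of supercomputing 2023-12, Vol.79 (18), p.20235-20262 |
---|---|
Main authors: | Wen, Xiaoqiang; Wu, Zhibin; Zhou, Mengchong; Wang, Jianguo; Wu, Lifeng |
Format: | Article |
Language: | eng |
Subjects: | Algorithms; Big Data; Compilers; Computer Science; Computing time; Data mining; Interpreters; Mathematical analysis; Optimization; Parameters; Processor Architectures; Programming Languages; Thermal power plants; Thermoelectricity |
Online access: | Full text |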
container_end_page | 20262 |
---|---|
container_issue | 18 |
container_start_page | 20235 |
container_title | The Journal of supercomputing |
container_volume | 79 |
creator | Wen, Xiaoqiang; Wu, Zhibin; Zhou, Mengchong; Wang, Jianguo; Wu, Lifeng |
description | To explore the potential value of the explosively growing data in thermal power units, this paper proposes a big data mining method based on a Hadoop-based Spark cluster-computing framework and algorithms. First, positive and negative balance methods are used to accurately obtain the actual net coal consumption, and the maximum information coefficient method is used to select all parameters related to the optimization objectives. Then, a Spark-based Mini-Batch K-means algorithm and the Elbow method are constructed to divide the whole set of operating modes. After that, all data are discretized and mapped to corresponding intervals by using the Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, a Spark-based parallel FP-growth algorithm is used to mine the potential relationships and laws in depth. To verify the proposed method, a 350-MW thermal power unit is taken as a case study. The main conclusions are as follows: (1) The proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with the Mini-Batch K-means algorithm and by 85.61% compared with the K-means algorithm. The proposed Spark-based FP-growth algorithm reduces the computational time by 32.8% compared with the FP-growth algorithm. (2) Strong association rules for all operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Taking operating mode 1 as an example, if the optimal result can be reasonably applied, it can save 2.942 g of coal per kilowatt hour. (3) Some other potential relationships among parameters are also found, which have important reference value for on-site operators analyzing the economy of the thermal power unit. |
doi_str_mv | 10.1007/s11227-023-05443-5 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2879581261</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2879581261</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</originalsourceid><addsrcrecordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2879581261</pqid></control><display><type>article</type><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><source>Springer Nature - Complete Springer Journals</source><creator>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creator><creatorcontrib>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creatorcontrib><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-023-05443-5</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Big Data ; Compilers ; Computer Science ; Computing time ; Data mining ; Interpreters ; Mathematical analysis ; Optimization ; Parameters ; Processor Architectures ; Programming Languages ; Thermal power plants ; Thermoelectricity</subject><ispartof>The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-023-05443-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-023-05443-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><subject>Algorithms</subject><subject>Big Data</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Computing time</subject><subject>Data mining</subject><subject>Interpreters</subject><subject>Mathematical analysis</subject><subject>Optimization</subject><subject>Parameters</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Thermal power 
plants</subject><subject>Thermoelectricity</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Wen, Xiaoqiang</creator><creator>Wu, Zhibin</creator><creator>Zhou, Mengchong</creator><creator>Wang, Jianguo</creator><creator>Wu, Lifeng</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20231201</creationdate><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><author>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Big Data</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Computing time</topic><topic>Data mining</topic><topic>Interpreters</topic><topic>Mathematical analysis</topic><topic>Optimization</topic><topic>Parameters</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Thermal power 
plants</topic><topic>Thermoelectricity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wen, Xiaoqiang</au><au>Wu, Zhibin</au><au>Zhou, Mengchong</au><au>Wang, Jianguo</au><au>Wu, Lifeng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>79</volume><issue>18</issue><spage>20235</spage><epage>20262</epage><pages>20235-20262</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. 
The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-023-05443-5</doi><tpages>28</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0920-8542 |
ispartof | The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262 |
issn | 0920-8542; 1573-0484 |
language | eng |
recordid | cdi_proquest_journals_2879581261 |
source | Springer Nature - Complete Springer Journals |
subjects | Algorithms; Big Data; Compilers; Computer Science; Computing time; Data mining; Interpreters; Mathematical analysis; Optimization; Parameters; Processor Architectures; Programming Languages; Thermal power plants; Thermoelectricity |
title | Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T18%3A30%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Economic%20mining%20of%20thermal%20power%20plant%20based%20on%20improved%20Hadoop-based%20framework%20and%20Spark-based%20algorithms&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Wen,%20Xiaoqiang&rft.date=2023-12-01&rft.volume=79&rft.issue=18&rft.spage=20235&rft.epage=20262&rft.pages=20235-20262&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-023-05443-5&rft_dat=%3Cproquest_cross%3E2879581261%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2879581261&rft_id=info:pmid/&rfr_iscdi=true |
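
The abstract's final step mines strong association rules from discretized operating data. As a rough illustration of what such a rule looks like, the sketch below computes support and confidence on a tiny in-memory data set. It is not the authors' method: it uses brute-force enumeration of itemsets in plain Python as a stand-in for the Spark-based parallel FP-growth algorithm, and the item labels (`load=high`, `steam_temp=high`, `coal=low`), the four records, and both thresholds are hypothetical values chosen only for the example.

```python
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples.

    Brute-force enumeration, NOT FP-growth: it checks every possible
    itemset, so it is only practical for a handful of distinct items.
    """
    n = len(transactions)
    rows = [frozenset(t) for t in transactions]
    items = sorted(set().union(*rows))

    # Support (fraction of records containing the itemset) for every
    # itemset that reaches the minimum support threshold.
    support = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            count = sum(1 for row in rows if itemset <= row)
            if count / n >= min_support:
                support[itemset] = count / n

    # Split each frequent itemset into antecedent -> consequent and keep
    # the rules whose confidence reaches the minimum confidence threshold.
    rules = []
    for itemset, sup in support.items():
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                # Every subset of a frequent itemset is itself frequent,
                # so this lookup always succeeds.
                confidence = sup / support[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, confidence))
    return rules


# Hypothetical discretized records: one set of "parameter=interval" items each.
transactions = [
    {"load=high", "steam_temp=high", "coal=low"},
    {"load=high", "steam_temp=high", "coal=low"},
    {"load=high", "steam_temp=low", "coal=high"},
    {"load=low", "steam_temp=high", "coal=high"},
]

rules = mine_rules(transactions, min_support=0.5, min_confidence=0.9)
for antecedent, consequent, sup, conf in rules:
    print(sorted(antecedent), "->", sorted(consequent), sup, conf)
```

On this toy data, one of the returned rules reads "high load and high steam temperature imply low coal consumption", which is the kind of operating guidance the abstract refers to. On a real cluster, a library FP-growth implementation would replace the brute-force loop, since enumerating every itemset grows exponentially with the number of items.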