Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms

To explore the potential value of the rapidly growing volume of data from thermal power units, this paper proposes a big data mining method built on a Hadoop-based Spark cluster-computing framework and Spark-based algorithms. First, the positive and negative balance methods are used to accurately obtain actual net coal con...

Bibliographic details
Published in: The Journal of supercomputing 2023-12, Vol.79 (18), p.20235-20262
Main authors: Wen, Xiaoqiang, Wu, Zhibin, Zhou, Mengchong, Wang, Jianguo, Wu, Lifeng
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 20262
container_issue 18
container_start_page 20235
container_title The Journal of supercomputing
container_volume 79
creator Wen, Xiaoqiang
Wu, Zhibin
Zhou, Mengchong
Wang, Jianguo
Wu, Lifeng
description To explore the potential value of the rapidly growing volume of data from thermal power units, this paper proposes a big data mining method built on a Hadoop-based Spark cluster-computing framework and Spark-based algorithms. First, the positive and negative balance methods are used to obtain the actual net coal consumption accurately, and the maximum information coefficient method is used to select all parameters related to the optimization objectives. Then, a Spark-based Mini-Batch K-means algorithm and the Elbow method are used to divide the full range of operation into operating modes. Next, all data are discretized and mapped to the corresponding intervals, again using the Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, a Spark-based parallel FP-growth algorithm is used to mine the underlying relationships and rules. A 350-MW thermal power unit serves as the case study for verifying the method. The main conclusions are as follows: (1) The proposed Spark-based Mini-Batch K-means algorithm reduces calculation time by 57.11% compared with the Mini-Batch K-means algorithm and by 85.61% compared with the K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computation time by 32.8% compared with the FP-growth algorithm. (2) Strong association rules are mined for all operating modes, and operating optimization guidance schemes for important parameters are obtained. Taking operating mode 1 as an example, applying the optimal result appropriately can save 2.942 g of coal per kilowatt-hour. (3) Some other potential relationships among parameters were also found, which provide a useful reference for on-site operators analyzing the economy of the thermal power unit.
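The description refers to "strong association rules" mined from discretized operating data. As a rough illustration of what that term means, the sketch below computes support and confidence for single-antecedent rules by brute force. It is not the paper's Spark-based FP-growth implementation, and the interval labels, records, and thresholds are all hypothetical values chosen for the example.

```python
from itertools import combinations

# Hypothetical discretized records: each set holds the interval labels of one sample.
transactions = [
    {"load=high", "steam_temp=mid", "coal=low"},
    {"load=high", "steam_temp=mid", "coal=low"},
    {"load=high", "steam_temp=low", "coal=high"},
    {"load=low", "steam_temp=mid", "coal=low"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def strong_rules(records, min_support, min_confidence):
    """Brute-force search for rules A -> B that meet both thresholds.

    FP-growth finds the same frequent itemsets without enumerating
    every candidate pair; this loop only illustrates the definitions.
    """
    items = sorted(set().union(*records))
    rules = []
    for a, b in combinations(items, 2):
        for antecedent, consequent in ((a, b), (b, a)):
            pair_support = support({antecedent, consequent}, records)
            if pair_support < min_support:
                continue
            # confidence(A -> B) = support(A and B) / support(A)
            confidence = pair_support / support({antecedent}, records)
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules


rules = strong_rules(transactions, min_support=0.5, min_confidence=0.9)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent}  support={s:.2f}  confidence={c:.2f}")
```

A rule counts as "strong" here only when it clears both thresholds; with these toy records, rules involving `load=high` are frequent enough but fall short on confidence.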
doi_str_mv 10.1007/s11227-023-05443-5
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2879581261</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2879581261</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</originalsourceid><addsrcrecordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2879581261</pqid></control><display><type>article</type><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><source>Springer Nature - Complete Springer Journals</source><creator>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creator><creatorcontrib>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creatorcontrib><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-023-05443-5</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Big Data ; Compilers ; Computer Science ; Computing time ; Data mining ; Interpreters ; Mathematical analysis ; Optimization ; Parameters ; Processor Architectures ; Programming Languages ; Thermal power plants ; Thermoelectricity</subject><ispartof>The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-023-05443-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-023-05443-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><subject>Algorithms</subject><subject>Big Data</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Computing time</subject><subject>Data mining</subject><subject>Interpreters</subject><subject>Mathematical analysis</subject><subject>Optimization</subject><subject>Parameters</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Thermal power 
plants</subject><subject>Thermoelectricity</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Wen, Xiaoqiang</creator><creator>Wu, Zhibin</creator><creator>Zhou, Mengchong</creator><creator>Wang, Jianguo</creator><creator>Wu, Lifeng</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20231201</creationdate><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><author>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Big Data</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Computing time</topic><topic>Data mining</topic><topic>Interpreters</topic><topic>Mathematical analysis</topic><topic>Optimization</topic><topic>Parameters</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Thermal power 
plants</topic><topic>Thermoelectricity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wen, Xiaoqiang</au><au>Wu, Zhibin</au><au>Zhou, Mengchong</au><au>Wang, Jianguo</au><au>Wu, Lifeng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>79</volume><issue>18</issue><spage>20235</spage><epage>20262</epage><pages>20235-20262</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. 
The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-023-05443-5</doi><tpages>28</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0920-8542
ispartof The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262
issn 0920-8542
1573-0484
language eng
recordid cdi_proquest_journals_2879581261
source Springer Nature - Complete Springer Journals
subjects Algorithms
Big Data
Compilers
Computer Science
Computing time
Data mining
Interpreters
Mathematical analysis
Optimization
Parameters
Processor Architectures
Programming Languages
Thermal power plants
Thermoelectricity
title Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T18%3A30%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Economic%20mining%20of%20thermal%20power%20plant%20based%20on%20improved%20Hadoop-based%20framework%20and%20Spark-based%20algorithms&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Wen,%20Xiaoqiang&rft.date=2023-12-01&rft.volume=79&rft.issue=18&rft.spage=20235&rft.epage=20262&rft.pages=20235-20262&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-023-05443-5&rft_dat=%3Cproquest_cross%3E2879581261%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2879581261&rft_id=info:pmid/&rfr_iscdi=true