Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms

In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal con...

Detailed description


Bibliographic details
Published in:	The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262
Main authors:	Wen, Xiaoqiang; Wu, Zhibin; Zhou, Mengchong; Wang, Jianguo; Wu, Lifeng
Format:	Article
Language:	eng
Subjects:	Algorithms; Big Data; Compilers; Computer Science; Computing time; Data mining; Interpreters; Mathematical analysis; Optimization; Parameters; Processor Architectures; Programming Languages; Thermal power plants; Thermoelectricity
Online access:	Full text

container_end_page	20262
container_issue	18
container_start_page	20235
container_title	The Journal of supercomputing
container_volume	79
creator	Wen, Xiaoqiang; Wu, Zhibin; Zhou, Mengchong; Wang, Jianguo; Wu, Lifeng
description	In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.
doi_str_mv	10.1007/s11227-023-05443-5
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2879581261</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2879581261</sourcerecordid><originalsourceid>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</originalsourceid><addsrcrecordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2879581261</pqid></control><display><type>article</type><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><source>Springer Nature - Complete Springer Journals</source><creator>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creator><creatorcontrib>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</creatorcontrib><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><identifier>ISSN: 0920-8542</identifier><identifier>EISSN: 1573-0484</identifier><identifier>DOI: 10.1007/s11227-023-05443-5</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Algorithms ; Big Data ; Compilers ; Computer Science ; Computing time ; Data mining ; Interpreters ; Mathematical analysis ; Optimization ; Parameters ; Processor Architectures ; Programming Languages ; Thermal power plants ; Thermoelectricity</subject><ispartof>The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/s11227-023-05443-5$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/s11227-023-05443-5$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>314,776,780,27903,27904,41467,42536,51297</link.rule.ids></links><search><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><title>The Journal of supercomputing</title><addtitle>J Supercomput</addtitle><description>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. 
After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</description><subject>Algorithms</subject><subject>Big Data</subject><subject>Compilers</subject><subject>Computer Science</subject><subject>Computing time</subject><subject>Data mining</subject><subject>Interpreters</subject><subject>Mathematical analysis</subject><subject>Optimization</subject><subject>Parameters</subject><subject>Processor Architectures</subject><subject>Programming Languages</subject><subject>Thermal power 
plants</subject><subject>Thermoelectricity</subject><issn>0920-8542</issn><issn>1573-0484</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNp9UDtPwzAQthBIlMIfYLLEbDg_EicjqgpFqsQAzJaT2G3axg52SsW_xyWV2JjuTt_j7j6EbincUwD5ECllTBJgnEAmBCfZGZrQTKZRFOIcTaBkQIpMsEt0FeMGAASXfII289o737U17lrXuhX2Fg9rEzq9w70_mID7nXYDrnQ0DfYOt10f_FfqF7rxvicjYIPuzMGHLdauwW-9DtsToncrH9ph3cVrdGH1LpqbU52ij6f5-2xBlq_PL7PHJamZhIFYmgujm6rUBZR1YWxelJY1FJrG1tRkIhdaWyYl8IKyHNJnpWRcQg1VySvKp-hu9E2Hfu5NHNTG74NLKxUrZJkdVUcWG1l18DEGY1Uf2k6Hb0VBHTNVY6YqZap-M1VZEvFRFBPZrUz4s_5H9QMIQ3oj</recordid><startdate>20231201</startdate><enddate>20231201</enddate><creator>Wen, Xiaoqiang</creator><creator>Wu, Zhibin</creator><creator>Zhou, Mengchong</creator><creator>Wang, Jianguo</creator><creator>Wu, Lifeng</creator><general>Springer US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20231201</creationdate><title>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</title><author>Wen, Xiaoqiang ; Wu, Zhibin ; Zhou, Mengchong ; Wang, Jianguo ; Wu, Lifeng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c270t-f164eadb9a809c8ef689f2d10ddfc1e5464aaf2770381260092972370c0b93b13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><topic>Algorithms</topic><topic>Big Data</topic><topic>Compilers</topic><topic>Computer Science</topic><topic>Computing time</topic><topic>Data mining</topic><topic>Interpreters</topic><topic>Mathematical analysis</topic><topic>Optimization</topic><topic>Parameters</topic><topic>Processor Architectures</topic><topic>Programming Languages</topic><topic>Thermal power 
plants</topic><topic>Thermoelectricity</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wen, Xiaoqiang</creatorcontrib><creatorcontrib>Wu, Zhibin</creatorcontrib><creatorcontrib>Zhou, Mengchong</creatorcontrib><creatorcontrib>Wang, Jianguo</creatorcontrib><creatorcontrib>Wu, Lifeng</creatorcontrib><collection>CrossRef</collection><jtitle>The Journal of supercomputing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wen, Xiaoqiang</au><au>Wu, Zhibin</au><au>Zhou, Mengchong</au><au>Wang, Jianguo</au><au>Wu, Lifeng</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms</atitle><jtitle>The Journal of supercomputing</jtitle><stitle>J Supercomput</stitle><date>2023-12-01</date><risdate>2023</risdate><volume>79</volume><issue>18</issue><spage>20235</spage><epage>20262</epage><pages>20235-20262</pages><issn>0920-8542</issn><eissn>1573-0484</eissn><abstract>In order to explore potential value of explosively growing data in thermal power unit, this paper proposes a big data mining method based on Hadoop-based Spark cluster-computing framework and algorithms. Firstly, positive and negative balance methods are used to accurately obtain actual net coal consumption, and maximum information coefficient method is used to select all parameters related to optimization objectives. Then, Spark-based Mini-Batch K-means algorithm and Elbow method are constructed to divide whole operating modes. After that, all data are discretized and mapped to corresponding intervals by using Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, Spark-based parallel FP-growth algorithm is used to deeply mine the potential relationships and laws. To verify the proposed method, a 350-MW thermal power unit is taken as a study case. 
The important conclusions are as follows: (1) the proposed Spark-based Mini-Batch K-means algorithm reduces the calculation time by 57.11% compared with Mini-Batch K-means algorithm, and 85.61% calculation time compared with K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computational time by 32.8% compared with FP-growth algorithm. (2) Strong association rules of whole operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Take operating mode 1 as an example: if the optimal result can be reasonably applied, it can save 2.942 g coal per kilowatt hour. (3) Besides, we have found out some other potential relationships among parameters, which have important reference value for on-site operators to analyze economy of the thermal power unit.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s11227-023-05443-5</doi><tpages>28</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 0920-8542
ispartof	The Journal of supercomputing, 2023-12, Vol.79 (18), p.20235-20262
issn	0920-8542; 1573-0484
language	eng
recordid	cdi_proquest_journals_2879581261
source	Springer Nature - Complete Springer Journals
subjects	Algorithms; Big Data; Compilers; Computer Science; Computing time; Data mining; Interpreters; Mathematical analysis; Optimization; Parameters; Processor Architectures; Programming Languages; Thermal power plants; Thermoelectricity
title	Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-26T18%3A30%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Economic%20mining%20of%20thermal%20power%20plant%20based%20on%20improved%20Hadoop-based%20framework%20and%20Spark-based%20algorithms&rft.jtitle=The%20Journal%20of%20supercomputing&rft.au=Wen,%20Xiaoqiang&rft.date=2023-12-01&rft.volume=79&rft.issue=18&rft.spage=20235&rft.epage=20262&rft.pages=20235-20262&rft.issn=0920-8542&rft.eissn=1573-0484&rft_id=info:doi/10.1007/s11227-023-05443-5&rft_dat=%3Cproquest_cross%3E2879581261%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2879581261&rft_id=info:pmid/&rfr_iscdi=true
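
The abstract in this record reports its results as percentage reductions in computation time and as a coal saving per kilowatt hour. As a rough aid to reading those figures, the sketch below converts each reported reduction into a speedup factor, using speedup = 1 / (1 - reduction), and scales the reported 2.942 g/kWh saving to an hourly amount. The percentages, the 350-MW rating and the 2.942 g/kWh value are taken from the abstract; the full-load, one-hour operating assumption and all variable names are illustrative and do not come from the article.

```python
# Percentage reductions in computation time, as reported in the abstract.
reductions = {
    "Spark Mini-Batch K-means vs Mini-Batch K-means": 57.11,
    "Spark Mini-Batch K-means vs K-means": 85.61,
    "Spark FP-growth vs FP-growth": 32.8,
}


def speedup(reduction_percent: float) -> float:
    """Convert a percentage time reduction into a speedup factor."""
    return 1.0 / (1.0 - reduction_percent / 100.0)


speedups = {name: speedup(r) for name, r in reductions.items()}

# Assumption (not from the article): the unit runs at its full rated
# output of 350 MW for one hour, i.e. it generates 350,000 kWh.
RATED_POWER_KW = 350_000
HOURS = 1.0
SAVING_G_PER_KWH = 2.942  # reported for operating mode 1

# Grams saved, converted to metric tonnes (1 t = 1,000,000 g).
coal_saved_t_per_hour = RATED_POWER_KW * HOURS * SAVING_G_PER_KWH / 1_000_000

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
print(f"Coal saved at full load: {coal_saved_t_per_hour:.4f} t per hour")
```

Under these assumptions the reported reductions correspond to speedups of roughly 2.3x, 6.9x and 1.5x, and the mode 1 saving comes to about one tonne of coal per hour of full-load operation. Actual savings depend on the load profile, which the abstract does not give.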