Economic mining of thermal power plant based on improved Hadoop-based framework and Spark-based algorithms

Bibliographic details
Published in: The Journal of supercomputing 2023-12, Vol. 79 (18), p. 20235-20262
Main authors: Wen, Xiaoqiang; Wu, Zhibin; Zhou, Mengchong; Wang, Jianguo; Wu, Lifeng
Format: Article
Language: English
Online access: Full text
Description
Abstract: To explore the potential value of the explosively growing data in thermal power units, this paper proposes a big data mining method based on a Hadoop-based Spark cluster-computing framework and algorithms. First, the positive and negative balance methods are used to accurately obtain the actual net coal consumption, and the maximum information coefficient method is used to select all parameters related to the optimization objectives. Then, a Spark-based Mini-Batch K-means algorithm and the Elbow method are constructed to divide the whole range of operating modes. After that, all data are discretized and mapped to the corresponding intervals using the Spark-based Elbow method and Mini-Batch K-means algorithm. Finally, a Spark-based parallel FP-growth algorithm is used to mine the potential relationships and laws in depth. To verify the proposed method, a 350-MW thermal power unit is taken as a case study. The main conclusions are as follows: (1) The proposed Spark-based Mini-Batch K-means algorithm reduces calculation time by 57.11% compared with the Mini-Batch K-means algorithm and by 85.61% compared with the K-means algorithm. The proposed Spark-based FP-growth algorithm reduces computation time by 32.8% compared with the FP-growth algorithm. (2) Strong association rules for all operating modes are mined, and operating optimization guidance schemes for important parameters are obtained. Taking operating mode 1 as an example, if the optimal result is reasonably applied, it can save 2.942 g of coal per kilowatt-hour. (3) In addition, some other potential relationships among parameters are found, which have important reference value for on-site operators analyzing the economy of the thermal power unit.
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-023-05443-5
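
Illustrative note: the last step named in the abstract, mining association rules from discretized operating data with FP-growth, can be sketched as follows. This is a minimal single-machine sketch in plain Python, not the Spark-based parallel algorithm the paper proposes. The item labels (`load=high`, `steam_temp=high`, `coal=low`), the six sample records, and the thresholds `min_count=3` and `min_confidence=0.9` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations


class Node:
    """One FP-tree node: an item, a link to its parent, and a count."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(weighted, min_count):
    """Build an FP-tree from (items, weight) pairs.

    Returns (header, freq): header maps each frequent item to its tree
    nodes, freq maps each frequent item to its support count.
    """
    freq = defaultdict(int)
    for items, weight in weighted:
        for item in set(items):
            freq[item] += weight
    freq = {i: c for i, c in freq.items() if c >= min_count}
    root = Node(None, None)
    header = defaultdict(list)
    for items, weight in weighted:
        # Keep frequent items only, most frequent first (ties by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, freq


def fp_growth(weighted, min_count, suffix=frozenset()):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    header, freq = build_tree(weighted, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = freq[item]
        # Conditional pattern base: the prefix path above every node of item.
        base = []
        for node in nodes:
            path, parent = [], node.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        result.update(fp_growth(base, min_count, itemset))
    return result


def association_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, confidence) for confident rules."""
    rules = []
    for itemset, count in itemsets.items():
        for size in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), size):
                antecedent = frozenset(combo)
                confidence = count / itemsets[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical discretized operating records (one list per time sample).
records = [
    ["load=high", "steam_temp=high", "coal=low"],
    ["load=high", "steam_temp=high", "coal=low"],
    ["load=high", "steam_temp=low", "coal=high"],
    ["load=low", "steam_temp=high", "coal=low"],
    ["load=low", "steam_temp=low", "coal=high"],
    ["load=high", "steam_temp=high", "coal=low"],
]

itemsets = fp_growth([(r, 1) for r in records], min_count=3)
rules = association_rules(itemsets, min_confidence=0.9)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", sorted(consequent), round(confidence, 2))
```

On a Spark cluster, a distributed equivalent would more likely rely on MLlib's `pyspark.ml.fpm.FPGrowth` than on hand-written tree code; whether the paper's implementation builds on that class is not stated in the abstract.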