MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction


Bibliographic details
Published in: BMC Bioinformatics 2023-10, Vol. 24 (1), p. 1-384, Article 384
Main authors: Zhou, Ganghui; Gao, Jing; Zuo, Dongshi; Li, Jin; Li, Rui
Format: Article
Language: English
Description
Summary: With the significant reduction in the cost of high-throughput sequencing, genomic selection has developed rapidly in plant breeding. Although researchers have proposed numerous genomic selection methods, existing methods still suffer from poor prediction accuracy in practical applications. This paper proposes MSXFGP, a genomic prediction method that uses a multi-strategy improved sparrow search algorithm (SSA) to optimize XGBoost parameters and feature selection. First, logistic chaos mapping, elite learning, adaptive parameter adjustment, Lévy flight, and an early-stopping strategy are incorporated into the SSA. This integration enhances the algorithm's global and local search capabilities, improving its convergence accuracy and stability. The improved SSA is then used to optimize XGBoost parameters and feature selection concurrently, yielding the new genomic selection method, MSXFGP. Using both the coefficient of determination (R²) and the Pearson correlation coefficient as evaluation metrics, MSXFGP was compared against six existing genomic selection models across six datasets. The results show that the prediction accuracy of MSXFGP is comparable to or better than that of widely used genomic selection methods, and that it achieves better accuracy when R² is used as the evaluation metric. In addition, the authors provide a user-friendly Python utility to help breeders apply the method. MSXFGP is available at https://github.com/DIBreeding/MSXFGP. Overall, MSXFGP provides a new approach to genomic selection in plants.
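Two of the ingredients named in the summary can be illustrated with a minimal sketch: a logistic chaotic sequence used to initialise a search population, and the two evaluation metrics (R² and the Pearson correlation coefficient). This is not the MSXFGP code. The function names, the map parameter mu = 4.0, the seed x0 = 0.7, and the linear scaling into the search bounds are assumptions made for illustration, and R² is taken in its standard form 1 - SS_res / SS_tot, which the record does not spell out.

```python
import math


def logistic_map_population(n_agents, n_dims, lower, upper, x0=0.7, mu=4.0):
    """Build an initial population from a logistic chaotic sequence.

    Hypothetical helper: x0, mu and the linear scaling are illustrative
    assumptions, not values taken from the paper.
    """
    population = []
    x = x0
    for _ in range(n_agents):
        agent = []
        for _ in range(n_dims):
            # Logistic map: x_{k+1} = mu * x_k * (1 - x_k), values stay in [0, 1]
            x = mu * x * (1.0 - x)
            # Scale the chaotic value into the search interval [lower, upper]
            agent.append(lower + x * (upper - lower))
        population.append(agent)
    return population


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


# Predictions that are perfectly correlated with the truth but off by a factor of 2
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(observed, predicted))  # 1.0
print(r_squared(observed, predicted))  # -5.0
```

The usage lines show why the two metrics can rank models differently: the Pearson coefficient ignores scale and offset errors in the predictions, whereas R² penalises them.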
ISSN:1471-2105
DOI:10.1186/s12859-023-05514-7