Impact of multi-output and stacking methods on feed efficiency prediction from genotype using machine learning algorithms

Bibliographic details
Published in:	Journal of animal breeding and genetics (1986) 2023-11, Vol.140 (6), p.638-652
Main authors:	Mora, Mónica; González, Pablo; Quevedo, José Ramón; Montañés, Elena; Tusell, Llibertat; Bergsma, Rob; Piles, Miriam
Format:	Article
Language:	English
Subjects:	Algorithms; Benchmarks; Body weight; Economic impact; Efficiency; Feed conversion; Feed efficiency; Feeds; Genotypes; Learning algorithms; Livestock; Machine learning; Meat production; Multiple regression models; Performance prediction; Predictions; Single-nucleotide polymorphism; Stacking; Support vector machines; Swine

Description
Abstract:	Feeding represents the largest economic cost in meat production; therefore, selection to improve traits related to feed efficiency is a goal in most livestock breeding programs. Residual feed intake (RFI), that is, the difference between the actual feed intake and the feed intake expected from the animal's requirements, has been used as a selection criterion to improve feed efficiency since it was proposed by Koch in 1963. In growing pigs, it is computed as the residual of a multiple regression of daily feed intake (DFI) on average daily gain (ADG), backfat thickness (BFT), and metabolic body weight (MW). Recently, prediction using single-output machine learning algorithms with information from SNPs as predictor variables has been proposed for genomic selection in growing pigs, but, as in other species, the prediction quality achieved for RFI has generally been poor. However, it has been suggested that it could be improved through multi-output or stacking methods. For this purpose, four strategies were implemented to predict RFI. Two of them compute RFI indirectly from the predicted values of its components, obtained from (i) individual predictions (multiple single-output strategy) or (ii) simultaneous predictions (multi-output strategy). The other two predict RFI directly, using (iii) the individual predictions of its components as predictor variables jointly with the genotype (stacking strategy), or (iv) only the genotypes as predictors of RFI (single-output strategy). The single-output strategy was considered the benchmark. This research aimed to test whether the first three strategies improve on that benchmark, using data recorded from 5828 growing pigs and 45,610 SNPs. For all strategies, two different learning methods were fitted: random forest (RF) and support vector regression (SVR). 
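The definition of RFI as a regression residual can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit with an intercept and uses synthetic, made-up records; the function name `residual_feed_intake` and all simulated values are hypothetical and are not taken from the study.

```python
import numpy as np

def residual_feed_intake(dfi, adg, bft, mw):
    """Return RFI as the OLS residual of DFI regressed on ADG, BFT and MW."""
    dfi = np.asarray(dfi, dtype=float)
    # Design matrix: intercept column plus the three covariates.
    X = np.column_stack([np.ones_like(dfi), adg, bft, mw])
    beta, *_ = np.linalg.lstsq(X, dfi, rcond=None)
    # Residual = observed intake minus intake expected from the covariates.
    return dfi - X @ beta

# Synthetic records for illustration only (assumed values, not study data).
rng = np.random.default_rng(0)
n = 200
adg = rng.normal(0.9, 0.1, n)    # average daily gain
bft = rng.normal(12.0, 2.0, n)   # backfat thickness
mw = rng.normal(30.0, 3.0, n)    # metabolic body weight
dfi = 0.5 + 1.2 * adg + 0.03 * bft + 0.02 * mw + rng.normal(0.0, 0.1, n)

rfi = residual_feed_intake(dfi, adg, bft, mw)
```

Because the fit includes an intercept, the resulting RFI values average to zero and are uncorrelated with ADG, BFT and MW within the fitted sample.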
A nested cross-validation (CV) with an outer 10-fold CV and an inner threefold CV for hyperparameter tuning was implemented to test all strategies. This scheme was repeated using as predictor variables different subsets with an increasing number (from 200 to 3000) of the most informative SNPs identified with RF. Results showed that the highest prediction performance was achieved with 1000 SNPs, although the stability of feature selection was poor (0.13 points out of 1). For all SNP subsets, the benchmark showed the best prediction performance. Using RF as the learner and the 1000 most informative SNPs as predictors, the mean (SD) of the 10 values ob
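The nested cross-validation scheme described in the abstract (outer 10-fold CV, inner threefold CV for tuning) can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data are synthetic, the hyperparameter grid and the mean-squared-error scoring are hypothetical choices, and the SNP pre-selection step is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Synthetic stand-in data (assumed): 120 animals, 50 SNPs coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(120, 50)).astype(float)
y = 0.3 * X[:, :5].sum(axis=1) + rng.normal(0.0, 1.0, 120)

inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)   # tuning folds
outer_cv = KFold(n_splits=10, shuffle=True, random_state=2)  # evaluation folds

# Hypothetical grid; the grid used in the study is not given in this record.
param_grid = {"max_features": [0.3, 1.0], "min_samples_leaf": [1, 5]}

# Inner loop: pick hyperparameters using only the outer-training data.
tuned_rf = GridSearchCV(
    RandomForestRegressor(n_estimators=30, random_state=0),
    param_grid,
    cv=inner_cv,
    scoring="neg_mean_squared_error",
)

# Outer loop: one score per outer fold for the tuned model.
outer_scores = cross_val_score(
    tuned_rf, X, y, cv=outer_cv, scoring="neg_mean_squared_error"
)
print(outer_scores.mean(), outer_scores.std())
```

Keeping the tuning inside each outer training fold means the 10 outer scores are not influenced by the data they are evaluated on, which is the reason for nesting the two loops.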
ISSN:	0931-2668 1439-0388
DOI:	10.1111/jbg.12815