Phenomic selection in wheat breeding: identification and optimisation of factors influencing prediction accuracy and comparison to genomic selection

  • Original Article
Theoretical and Applied Genetics

Abstract

Key message

Phenomic selection is a promising alternative or complement to genomic selection in wheat breeding. Models combining spectra from different environments maximise the predictive ability of grain yield and heading date of wheat breeding lines.

Abstract

Phenomic selection (PS) is a recent breeding approach similar to genomic selection (GS) except that genotyping is replaced by near-infrared (NIR) spectroscopy. PS can potentially account for non-additive effects and has the major advantage of being low cost and high throughput. Factors influencing GS predictive abilities have been intensively studied, but little is known about PS. We tested and compared the abilities of PS and GS to predict grain yield and heading date from several datasets of bread wheat lines corresponding to the first or second years of trial evaluation from two breeding companies and one research institute in France. We evaluated several factors affecting PS predictive abilities, including the possibility of combining spectra collected in different environments. A simple H-BLUP model predicted both traits with predictive abilities from 0.26 to 0.62 and with an efficient computation time. Our results showed that the environment in which lines are grown had a crucial impact on the predictive ability obtained from the spectra acquired there, and that this impact was specific to the trait considered. Models combining NIR spectra from different environments were the best PS models and were at least as accurate as GS in most of the datasets. Furthermore, a GH-BLUP model combining genotyping and NIR spectra was the best model of all (predictive abilities from 0.31 to 0.73). We also demonstrated that, as for GS, the size and the composition of the training set have a crucial impact on predictive ability. PS could therefore replace or complement GS for efficient wheat breeding programs.

Acknowledgements

The authors thank the staff of the INRAE experimental units (Clermont-Ferrand, Estrées-Mons, Le Moulon, Rennes) and the breeders from Agri-Obtentions and Florimond Desprez for their work. The authors are grateful to Agri-Obtentions, Florimond Desprez, and the Association Nationale de la Recherche et de la Technologie (ANRT, grant number 2019/0060), which supported this PhD work. The authors also thank Bastian Alexandre and Rachel Carol (Bioscience Editing, France) for proofreading this work and Tristan Mary-Huard for his careful reading of the equations. Finally, the authors thank the two anonymous reviewers for their helpful comments on this work.

Funding

This work was funded by Agri-Obtentions, Florimond Desprez and the Association Nationale de la Recherche et de la Technologie (ANRT, grant number 2019/0060).

Author information

Contributions

JA, FXO, BR and EH designed the field trials and collected the phenotypic data from Agri-Obtentions and INRAE. EGD provided the phenotypic and genotyping data from the Florimond Desprez company. SB provided the genotyping data from Agri-Obtentions and INRAE and participated in discussions of this study. RR initiated the project and, with JLG, supervised the study and helped improve the manuscript. PR analysed the data and wrote the manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Renaud Rincent.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Availability of data and material

The datasets generated and/or analysed during the current study are not publicly available owing to the privacy of the breeding programs, but are available from the corresponding author on reasonable request.

Code availability

Code used to perform the analysis in this study is available from the corresponding author on request.

Additional information

Communicated by Thomas Miedaner.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 6072 kb)

Appendix 1

In this section, we describe the analysis of two further factors that affected the predictive ability of phenomic selection: the size and the composition of the training set (TS).

In GS, the size and composition of the TS have an impact on predictive ability (PA). We characterised this effect by testing six TS sizes (10, 50, 100, 150, 200 and 250 genotypes) at two specific sites from the dataset Set2-2019: GL and EM. These datasets had the largest number of genotyped lines, which made it possible to test different TS sizes. To do so, we randomly split the data into five folds of equal size. One fold constituted the validation set, and the remaining folds contained the genotypes that could be included in the TS. Among the latter, we randomly sampled a defined number of lines to constitute the final TS of the corresponding size. The same procedure was followed for all folds and all TS sizes and was repeated 25 times, giving 125 predictive abilities for each TS size. We thus compared the PA of the four models M, S, CbSD and CbSD + M, representing the G-BLUP, H-BLUP and GH-BLUP model types.
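The resampling scheme described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the function name `ts_size_splits`, the panel of 320 lines and the random seed are assumptions made for the example, and the prediction model itself is deliberately left out.

```python
import random

def ts_size_splits(n_lines, ts_sizes, n_folds=5, n_reps=25, seed=0):
    """Yield (ts_size, training_idx, validation_idx) for every repetition and fold.

    Hypothetical helper: it only produces index sets; fitting a model and
    computing a predictive ability on each split is left to the caller.
    """
    rng = random.Random(seed)
    for _ in range(n_reps):
        order = list(range(n_lines))
        rng.shuffle(order)
        # Deal the shuffled lines into n_folds folds of (near-)equal size.
        folds = [order[k::n_folds] for k in range(n_folds)]
        for k, validation in enumerate(folds):
            # Lines outside the validation fold are the candidates for the TS.
            pool = [i for j, fold in enumerate(folds) if j != k for i in fold]
            for size in ts_sizes:
                if size > len(pool):
                    continue  # not enough candidate lines for this TS size
                # Random subsample of the candidate pool, without replacement.
                yield size, rng.sample(pool, size), validation

# Example with an assumed panel of 320 lines and the six TS sizes of the text.
counts = {}
for size, ts, val in ts_size_splits(320, [10, 50, 100, 150, 200, 250]):
    counts[size] = counts.get(size, 0) + 1
print(counts)
```

With five folds and 25 repetitions, each TS size is drawn 125 times, which matches the 125 predictive abilities per TS size reported above. The validation fold is the same for all TS sizes within a fold, so differences between sizes are not confounded with the validation set.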

The composition of a TS can be optimised in order to minimise its size while retaining a similar PA. In one scenario, the size of the TS was set arbitrarily, and either random or optimised procedures were used to select, among all available genotypes, those that would constitute the TS. The validation set was composed of the genotypes not included in the TS. We compared three optimisation algorithms to define the TS for a particular size. We tested the CDmean algorithm (CDmean) (Rincent et al. 2012), developed originally for GS, and two algorithms developed to optimise NIRS calibration equations in chemometrics, Honigs (HG) (Honigs et al. 1985) and Kennard-Stone (KS) (Kennard and Stone 1969). CDmean was applied with custom R code, while HG and KS were applied with the prospectr R package (Stevens and Ramirez-Lopez 2020). Algorithm performance was compared to that of randomly selected (RD) training sets. PA for each TS size was averaged over 50 repetitions for CDmean or 125 repetitions for RD. There was no repetition for HG and KS as they are both deterministic. To compare these TS selection approaches, we used the datasets Set2-2019-EM and Set2-2019-GL, in which many varieties were genotyped.
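To illustrate why KS needs no repetition, the selection rule can be sketched in a few lines of Python. This is a simplified version written for this example, using Euclidean distance on raw spectra; it is not the prospectr implementation used in the study, and the function name `kennard_stone` is an assumption.

```python
def kennard_stone(X, k):
    """Return the indices of k samples chosen by the Kennard-Stone rule.

    X is a list of equal-length numeric vectors (e.g. one spectrum per line).
    Illustrative sketch only: Euclidean distance, O(n^2) search.
    """
    n = len(X)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")

    def sqdist(a, b):
        # Squared Euclidean distance; it ranks pairs like the distance itself.
        return sum((u - v) ** 2 for u, v in zip(a, b))

    # Step 1: start from the two most distant samples.
    best, first, second = -1.0, 0, 1
    for a in range(n):
        for b in range(a + 1, n):
            d = sqdist(X[a], X[b])
            if d > best:
                best, first, second = d, a, b
    selected = [first, second]

    # min_d[i]: squared distance from sample i to its nearest selected sample.
    min_d = [min(sqdist(x, X[first]), sqdist(x, X[second])) for x in X]

    # Step 2: repeatedly add the sample farthest from the current selection.
    while len(selected) < k:
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: min_d[i])
        selected.append(nxt)
        min_d = [min(m, sqdist(x, X[nxt])) for m, x in zip(min_d, X)]
    return selected

# Toy one-dimensional "spectra": the extremes come first, then the gaps fill in.
print(kennard_stone([[0.0], [1.0], [2.0], [10.0], [5.0]], 4))  # [0, 3, 4, 2]
```

Because every step is a maximisation over fixed distances, the same input always yields the same TS, which is why a single PA per TS size is obtained for this kind of criterion, whereas random sampling has to be averaged over repetitions.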

We found that the effect of increasing TS size on the PA of PS and GS was substantial (Fig. 6). For each model and both environments, PA increased with TS size. However, from 50 to 250 genotypes, the PA increased only slightly. For the smallest TS, the S model was slightly better than the others. Regardless of the TS size, the GS model M was less accurate than the PS models. Model CbSD + M, combining both marker and NIRS data, outperformed the other models at the larger TS sizes.

Fig. 6 Comparison of predictive abilities of GS and PS for GY as a function of the size of the training population (TP) for four predictive models. M refers to a G-BLUP GS model, S and CbSD to H-BLUP models, and CbSD + M to a GH-BLUP model. CbSD combined NIRS from two environments. Predictions were performed with Set2 data from two environments, Set2_2019_EM and Set2_2019_GL, for each TP size. Lines indicate mean predictive abilities of a fivefold cross-validation with 25 repetitions
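For readers unfamiliar with the model types compared here, the following Python sketch shows one common way of predicting from a similarity matrix, whether it is built from markers (as in G-BLUP) or from spectra (as in H-BLUP). It rests on several assumptions that are not taken from this article: the similarity matrix is computed as the cross-product of centred and scaled spectra divided by the number of wavelengths, the intercept is estimated by the training mean, and the ratio of residual to genetic variance `lam` is fixed by hand rather than estimated (for example by REML). The function names and the simulated data are hypothetical.

```python
import numpy as np

def similarity_matrix(S):
    """Similarity between lines from a (lines x wavelengths) matrix S.

    Assumes no wavelength is constant across lines (non-zero column variance).
    """
    S = np.asarray(S, dtype=float)
    Z = (S - S.mean(axis=0)) / S.std(axis=0)  # centre and scale each wavelength
    return Z @ Z.T / Z.shape[1]

def kernel_blup(K, y, train, test, lam=1.0):
    """Predict y for the test lines from a similarity matrix K.

    lam is the assumed ratio of residual to genetic variance (fixed here).
    """
    K = np.asarray(K, dtype=float)
    y = np.asarray(y, dtype=float)
    mu = y[train].mean()                       # simple intercept estimate
    K_tt = K[np.ix_(train, train)]             # training x training block
    K_vt = K[np.ix_(test, train)]              # validation x training block
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K_vt @ alpha

# Simulated example: 60 lines, 200 wavelengths, trait partly driven by spectra.
rng = np.random.default_rng(0)
S = rng.normal(size=(60, 200))
y = S @ rng.normal(scale=0.1, size=200) + rng.normal(size=60)
H = similarity_matrix(S)
train, test = np.arange(45), np.arange(45, 60)
y_hat = kernel_blup(H, y, train, test)
print(np.corrcoef(y_hat, y[test])[0, 1])       # predictive ability on this split
```

The same `kernel_blup` function can be called with a marker-based kinship matrix in place of `H`. A model with both a marker and a spectral random effect would additionally need one variance component per matrix, which this sketch does not attempt to estimate.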

To optimise TS composition for a given size, we compared three optimisation algorithms for determining the composition of the TS used for GS or PS (Fig. 7). When applying GS (G-BLUP model), TS optimised with CDmean computed with the kinship matrix (CDmean_K) performed better than random TS (RD) for both environments, with an average gain of +22%, a maximum of +30% and a minimum of −4.5%. When applying PS (both H-BLUP models), TS optimised with CDmean_H computed on the NIR similarity matrix also performed slightly better than randomly sampled TS (RD). Finally, for the GH-BLUP model combining molecular markers and NIRS, CDmean_K performed slightly better than CDmean_H and RD. When applying the H-BLUP and GH-BLUP models, the performance of KS and HG varied widely with TS size in Set2_2019_GL, and their PA was lower than that of RD in Set2_2019_EM.

Fig. 7 Comparison of predictive abilities for GY as a function of the size of the training population (TP) for four predictive models and three optimisation algorithms. Algorithms using NIR spectra were Honigs (HG), Kennard-Stone (KS) and CDmean_H using the hyperspectral similarity matrix (H). CDmean_K was conducted on kinship (K) based on molecular markers. Optimisations are compared to a random selection (RD). Predictions were performed using Set2 data from two environments: 2019_EM and 2019_GL. Predictive models used were M, S, CbSD, CbSD + M (cf. Table 3). Lines represent the predictive ability with respect to TP size. Predictive ability was averaged over 50 repetitions for CDmean_K and CDmean_H, and over 125 repetitions for RD. Predictive ability for HG and KS was obtained once because they are deterministic criteria

Cite this article

Robert, P., Auzanneau, J., Goudemand, E. et al. Phenomic selection in wheat breeding: identification and optimisation of factors influencing prediction accuracy and comparison to genomic selection. Theor Appl Genet 135, 895–914 (2022). https://doi.org/10.1007/s00122-021-04005-8

  • DOI: https://doi.org/10.1007/s00122-021-04005-8
