Introduction

Finding cultivars that can adapt to different environments while maintaining stable, high-quality production has become a priority for breeders, especially given current climate change predictions (Damatta et al. 2018). It is therefore necessary to evaluate how the genotype × environment interaction (G × E) affects agronomic performance and coffee quality. By the usual definition, G × E interactions exist if differences between genotypes are not consistent from one environment to another (Baker 1988). G × E influences fruit development and ripening, tree yield, biochemical composition, and the physical aspect of the green beans (Bertrand et al. 2006, 2012b). The life span of an Arabica orchard is between 15 and 25 years, and replanting is very expensive (more than 5000 USD/ha). Furthermore, replanting cultivars that are new or unknown to farmers bears additional risks related to the uncertainties brought by global warming. As a result, G × E studies are increasingly important for estimating the effects of environment on the productivity and quality of new genotypes.

Coffee consumption is increasing around the world and so is the demand for quality or “Specialty Coffee” (Giovannucci and Koekoek 2003; Montagnon et al. 2012). The parameters that lead to premium coffee quality are numerous. Coffee trees must produce beans of a quality that satisfies consumers and roasters, but they must also be productive and resistant to biotic and abiotic stress so that farmers can reap benefits. Obtaining a quality product is subject to many hazards and processes, from the environment in which the tree grows to the final preparation of the cup of coffee. The genotype, the environment where the coffee shrub grows, the post-harvest stages and roasting are all key to obtaining a quality product.

Buyers and roasters at the international market define prices and qualities of the different coffee varieties according to altitude, region, bean size (screen size), bean density, bean shape and colour, number of imperfections, roast appearance and cup quality (International Trade Centre (ITC) 2011). According to several studies, higher elevations and lower air temperatures result in a higher coffee quality. Indeed, the accumulation of biochemical compounds of the green beans is modified in this type of environment and has an impact on the sensorial quality of the coffee (Guyot et al. 1996; Decazy et al. 2003; Avelino et al. 2005; Vaast et al. 2006; Bertrand et al. 2012b).

Several international protocols exist for evaluating cup quality, the SCA Standard Cupping Protocols designed for “Specialty Coffees” being the most prominent for certain types of coffee brews (e.g. filter coffee). Increasingly, quality is determined by the presence of biochemical compounds (aroma precursors and volatile compounds) found in green beans, as the green coffee bean contains all the precursor components leading to the aroma of the coffee (Joët et al. 2012; Läderach et al. 2012).

Between 1990 and 2013, CIRAD and its public and private research partners (CATIE, Icafé, ECOM Trading) created C. arabica F1 hybrid cultivars by using a selection process based on cross-breeding of American pure line cultivars and wild individuals from Ethiopia and Sudan, which were phylogenetically distant (Bertrand et al. 2012a; van der Vossen et al. 2015). In many crops, an F1 hybrid contains a complete mix of the genetics of both parents. F1 hybrids are known to have a higher level of adaptability and performance due to “hybrid vigour”. In theory, this higher genetic potential also means that they are more likely to be adaptable across a wide range of environments. In previous works we showed that Arabica F1 hybrids produced more than their parents (Bertrand et al. 2005) and also eliminated many of the trade-offs of the past, for example coffee leaf rust resistance versus quality (Toniutti et al. 2017).

Some of these hybrids (especially Centroamericano-H1) are commercially distributed and are starting to gain a reputation among coffee producers in Central America. However, there is a knowledge gap in the scientific literature about the ability of these new genotypes to adapt to different environments. The main research question is whether these new cultivars can perform well in a wide range of environments, which would have important repercussions on meeting the growing demand for high-quality coffee across the world in a changing climate.

In this study, nine C. arabica F1 hybrid clones resulting from crosses of Sudanese-Ethiopian origins with American cultivars were compared with two well-known conventional American pure lines (Caturra and Marsellesa) in different environments in Nicaragua, at elevations ranging from 710 to 1250 m.a.s.l. We evaluated eleven genotypes (nine F1 hybrids and two American pure lines) in seven environments for yield and physical characteristics of green beans, and four genotypes (three F1 hybrids and one American pure line) in four environments for biochemical composition of green beans (aroma precursors and volatile compounds) and sensory perception of the beverage.

Our purpose is to assess how genotype × environment interactions (G × E) affect yield, canopy volume, bean physical characteristics, aroma precursors and volatile compounds of green coffee beans, and whether those differences are reflected in the sensory perception of the coffee beverage.

Materials and methods

Experimental design

The coffee genotypes were tested in multi-environment trials (MET) conducted across seven environments in Nicaragua over five years (from 2013 to 2018) (Table 1) (Isik et al. 2017). The trials were carried out directly on coffee farms. A technical itinerary was initially set up, but farmers modified it according to their financial capacities and their environment. The environments encompassed the main growing areas in Nicaragua and reflected the variation in climate, soil, biotic conditions and agroforestry management of the area. They ranged from 710 to 1250 m.a.s.l. Moreover, the intensity of shade and the types of vegetation cover differed from one farm to another. Shade was estimated with the Canopeo application for Android and varied from 5 to 35% depending on the farm and the season. Spacing was 2.5 m between lines and 1.5 m between trees on a line, which corresponds to a planting density of about 2200 trees per ha.

Table 1 Characteristics of the trial sites in Nicaragua (farm name, department where the farm is located, elevation (m)) for the period 2014–2017

All the genotypes studied belong to the species Coffea arabica (Rubiaceae, Coffea). Nine F1 hybrid cultivars and two American pure line cultivars used as controls were laid out in a randomized complete block design with five blocks used as replicates. On each block, 20 plants of each genotype were distributed successively.

The American cultivars cultivated in Central America, Caturra and Marsellesa®, were used as control as Caturra is the main variety in Central American plantations and is still well known for its organoleptic qualities, and Marsellesa is a new pure line variety that has been disseminated in Central America since 2015. Caturra is a natural mutant of Bourbon coffee discovered in a Bourbon field in Brazil in 1935. The variety Marsellesa is an American introgressed line derived from a cross between a Timor Hybrid and a Villa Sarchi variety. The Timor Hybrid is a natural cross between C. arabica and C. canephora, while Villa Sarchi is a natural mutant of Bourbon coffee (Bettencourt 1973; Lashermes et al. 2000).

The nine clones of F1 hybrids are described in Table 2. The male parent is always an Ethiopian or Sudanese genotype. As for the female parents, two were traditional American pure lines (Caturra and Catuai) and seven were introgressed lines derived from the Timor Hybrid: Marsellesa, T5296 Sarchimor and T17931 Catimor. In this study, we use the common name of the F1 clones when available (e.g. Centroamericano, Evaluna, etc.) followed by their experimental code (e.g. H1, H18, etc.).

Table 2 Genealogy of the Coffea arabica varieties tested in this study

Yield, rust resistance and canopy volume evaluation

Yield, resistance to rust and coffee plant canopy volume were evaluated for all seven localities previously described.

Yield was measured in grams of fresh berries and then expressed in grams of green coffee per tree based on the assumption that the weight of green coffee amounted to 20% of the fresh berry weight. Yield was estimated over three growing seasons (2015–2016, 2016–2017 and 2017–2018), for five plants of each genotype per block and on three blocks.
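The conversion described above can be sketched as follows. This is a minimal illustration, assuming only the 20% green-to-fresh ratio stated in the text; the function names and the fresh-berry weights are hypothetical and are not data from the trial.

```python
# Sketch of the fresh-berry to green-coffee conversion.
# GREEN_FRACTION comes from the text (green coffee = 20% of fresh berry weight);
# everything else (names, example weights) is hypothetical.
GREEN_FRACTION = 0.20


def green_coffee_g(fresh_berries_g):
    """Return grams of green coffee for a given fresh-berry weight in grams."""
    if fresh_berries_g < 0:
        raise ValueError("weight must be non-negative")
    return fresh_berries_g * GREEN_FRACTION


def mean_yield_per_tree(fresh_weights_g):
    """Average green-coffee yield (g/tree) over per-tree fresh-berry weights."""
    converted = [green_coffee_g(w) for w in fresh_weights_g]
    return sum(converted) / len(converted)


if __name__ == "__main__":
    # Hypothetical fresh-berry weights (g) for five trees of one genotype in one block
    trees = [2500.0, 3100.0, 2800.0, 3300.0, 2900.0]
    print(round(mean_yield_per_tree(trees), 1))
```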

Rust incidence was evaluated on 10 plants per genotype per block, on three blocks, for one growing season (2016–2017). The assessment was made through a visual inspection of more than 100 leaves per plant using a scale of 0–4, where 0 = absence of lesions; 1 = sporulating lesions reaching 1 to 5% of the total leaf area; 2 to 3 = gradual increase in the number of diseased branches with sporulating lesions; and 4 = greater than 50% of the leaf area being affected (very susceptible cultivars may have dropped leaves before the observation date) (Eskes and Toma-Braghini 1981; Virginio Filho and Astorga 2015) (Supplemental Figure 1).

The canopy volume of the coffee plants was estimated by approximating the shape of the tree as a cone. The radius r (cm), calculated as the average length of the two largest plagiotropic branches, and the total height of the tree h (cm) were used to estimate the conical volume V (cm³) as \(V = \frac{1}{3} \pi r^{2} h\) (Bryant and Kothmann 1979). This parameter was measured for five plants per genotype per block and evaluated from one year of data (2017).
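As an illustration of the cone approximation, the sketch below computes the volume from two branch lengths and a tree height and converts it to m³, the unit used later for the results. The function name and the example measurements are hypothetical.

```python
import math


def canopy_volume_m3(branch1_cm, branch2_cm, height_cm):
    """Cone volume V = (1/3) * pi * r^2 * h, returned in cubic metres.

    r is the mean length of the two largest plagiotropic branches (cm),
    h is the total tree height (cm).
    """
    r = (branch1_cm + branch2_cm) / 2.0
    volume_cm3 = math.pi * r ** 2 * height_cm / 3.0
    return volume_cm3 / 1e6  # 1 m^3 = 1e6 cm^3


if __name__ == "__main__":
    # Hypothetical tree: branches of 70 and 80 cm, height 200 cm
    print(round(canopy_volume_m3(70.0, 80.0, 200.0), 3))
```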

Physical bean characteristics

For each genotype, we evaluated the physical characteristics of green coffee beans. The assessment was made on a sample of 1 kg of green coffee beans harvested from five blocks. The physical characteristics of the beans were not evaluated at the same scale as the sensory characteristics: they were evaluated on a mixture of beans from the five blocks, whereas the sensory analyses were carried out for each block independently. We measured the weight of 100 healthy green beans (W100), the percentage of green beans of size 16 to 20 (exportable coffee must be at least 16/64 inches in size) and the percentage of defective green beans.

Cherries harvest, post-harvest processing and bean quality evaluation

Cherries harvest and post-harvest processing

During the 2017/2018 season, coffee samples were harvested from four of the seven farms. For each genotype, samples of 7 kg of healthy, ripe cherries were handpicked during two harvests and individually processed by the wet method (de-pulping, fermentation and drying) to obtain at least 1.1 kg of green coffee beans with a final moisture content of 11–12%. The green coffee samples were screened through sieves (size 14 to 20), and beans smaller than sieve size 15 were discarded along with defective beans. The bean quality evaluation was carried out on beans collected in the following four environments: Las Colinas (710 m), Las Marias (1190 m), La Aurora (1240 m) and Albania (1250 m). Four genotypes were studied for biochemical analyses: Caturra (as a control) and the three most promising hybrids (Centroamericano-H1, Mundo Maya-H16 and Starmaya).

Biochemical analyses

For each green bean sample, 15 g were ground into a fine powder with an electric blender (A10 IKA Model) and stored in plastic tubes, protected from light, until extraction.

Primary metabolite analysis

Sucrose content was determined twice for each sample on 25 mg of green bean powder using the Sucrose/d-Fructose/d-Glucose Assay Kit (K-SUFRG, Megazyme International, Ireland). The powder was placed in a 15 ml conical tube containing 10–15 mg of polyvinylpolypyrrolidone (PVPP), to which 5 ml of distilled water (MilliQ, Merck, Darmstadt, Germany) was added before stirring (1 h 30 min, 225 rpm, Rotamax 120, Heidolph). One drop of Carrez A and Carrez B solution was added before centrifugation (3500 rpm, 10 min, 25 °C), and 100 µl of the filtered supernatant was introduced into a spectrophotometry cuvette containing 200 µl of β-fructosidase. Sampling was done twice for each sample and the enzymatic reactions were then carried out according to the kit instructions. Absorbance was measured at 340 nm.

Total fatty acids were extracted and purified according to Folch et al. (1957). Twenty mg of green bean powder were first extracted with 2 ml of chloroform:methanol (1:1, v/v) and 1 ml H2O, and then with 2 ml of chloroform. Polar contaminants such as proteins or nucleic acids were removed by adding 2 ml of NaCl 0.9% solution. After phase separation, the lower organic phase, which contains the lipids, was harvested and the solvent was evaporated. The lipids were then resuspended in 1 ml of 2.5% H2SO4 (v/v) in methanol (with heptadecanoic acid as internal standard) to obtain fatty acid methyl esters (FAMES). Tubes were heated at 80 °C overnight and cooled to room temperature. Hexane (400 µl) and NaCl 2.5% solution (600 µl) were added to extract the FAMES. Tubes were shaken vigorously and centrifuged before the organic phases were transferred to injection vials. GC-FID was performed using an Agilent 7890 gas chromatograph equipped with a DB-23 column (60 m × 0.25 mm × 0.25 µm; Agilent Technologies, Wilmington, DE) and flame ionization detection. The temperature gradient (total time 33.5 min) was 50 °C for 1 min, increased to 175 °C at 25 °C/min (for 5 min), then increased to 230 °C at 2 °C/min (for 27.5 min). FAMES were identified by comparing their retention times with those of commercial fatty acid standards (Sigma-Aldrich) and quantified using ChemStation (Agilent) to calculate the peak areas, which were then compared with the C17:0 response.
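Quantification against the C17:0 internal standard can be sketched as a simple ratio of peak areas. This is only an illustration: the amount of internal standard added is not given in the text, so the value below is hypothetical, as are the peak areas, and a detector response factor of 1 for every FAME is an assumption.

```python
def fame_amount_ug(peak_area, is_area, is_amount_ug, response_factor=1.0):
    """Amount of one FAME (ug), relative to the internal standard (IS).

    amount = (peak_area / is_area) * is_amount_ug / response_factor
    response_factor = 1.0 is an assumption, not a value from the text.
    """
    if is_area <= 0 or response_factor <= 0:
        raise ValueError("IS area and response factor must be positive")
    return (peak_area / is_area) * is_amount_ug / response_factor


def fame_profile(areas, is_area, is_amount_ug, powder_mg=20.0):
    """Return ug of each fatty acid per mg of green bean powder.

    powder_mg defaults to the 20 mg extracted in the text.
    """
    return {
        name: fame_amount_ug(area, is_area, is_amount_ug) / powder_mg
        for name, area in areas.items()
    }


if __name__ == "__main__":
    # Hypothetical peak areas and a hypothetical 50 ug of C17:0 added per sample
    areas = {"C16:0": 5200.0, "C18:0": 1100.0, "C18:2": 6400.0}
    for name, value in fame_profile(areas, is_area=2000.0, is_amount_ug=50.0).items():
        print(name, round(value, 3), "ug/mg")
```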

Alkaloid and phenolic compound analysis

Samples were extracted by stirring (225 rpm, Rotamax 120, Heidolph, Schwabach, Germany) 25 mg of each green bean powder in 6 ml of methanol (MeOH)/H2O (80:20, v/v) at 4 °C, in the dark, for 3 h. After centrifugation (10 min, 8 °C, 3500 rpm), the methanolic extract was collected and filtered (0.25 µm porosity, Interchim, Montluçon, France) before analysis. Each extraction was carried out in triplicate.

Quantitative analyses were carried out on an HPLC system (Shimadzu LC-20, Tokyo, Japan) as described earlier (Campa et al. 2017). Extracts (10 µl) were analyzed at a flow rate of 0.6 ml/min using an Eclipse XDB C18 (3.5 µm) column (100 mm × 4.6 mm, Agilent) and an elution system composed of solvents B (H2O/MeOH/acetic acid, 5:90:5 v/v/v) and A (water/acetic acid, 98:2, v/v). Standard curves were obtained by analyzing, in triplicate, pure standard solutions at 10, 25, 50 and 75 µg/ml: trigonelline and caffeine from Sigma-Aldrich (St Quentin Fallavier, France) for alkaloids; 5-CQA and 3,4-, 3,5- and 4,5-O-diCQA from Sigma-Aldrich for chlorogenic acids; and (+)-catechin and (−)-epicatechin from Extrasynthese (Lyon, France) for flavonoids. Quantification of monocaffeoylquinic acids (3-, 4- and 5-CQA), feruloylquinic acid (FQA), coumaroylquinic acid (CoumQA) and dicaffeoylquinic acids (3,4-, 3,5- and 4,5-diCQA) was undertaken at 320 nm, caffeine and catechin derivatives at 280 nm, and trigonelline at 260 nm. Concentration was calculated in mg/g dry weight by comparison with the standard curves established with the respective standards and expressed as a percentage of dry matter (% DW). For 3-CQA, 4-CQA, FQA and CoumQA, content was calculated using the 5-CQA standard curve.
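The standard-curve step can be illustrated with a least-squares line fitted to the four standard concentrations (10, 25, 50 and 75 µg/ml), followed by conversion to % DW using the 25 mg of powder and 6 ml of extraction solvent described above. The peak areas are hypothetical, a linear detector response is assumed, and the powder mass is treated as dry weight without a moisture correction, which is also an assumption.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def percent_dw(area, slope, intercept, extract_ml=6.0, powder_mg=25.0):
    """Convert a sample peak area to content in % of dry weight."""
    conc_ug_per_ml = (area - intercept) / slope        # from the standard curve
    mg_per_g = conc_ug_per_ml * extract_ml / powder_mg  # ug/mg equals mg/g
    return mg_per_g / 10.0                              # 10 mg/g = 1 % DW


if __name__ == "__main__":
    standards_ug_ml = [10.0, 25.0, 50.0, 75.0]   # concentrations from the text
    areas = [205.0, 505.0, 1005.0, 1505.0]       # hypothetical peak areas
    slope, intercept = fit_line(standards_ug_ml, areas)
    print(round(percent_dw(805.0, slope, intercept), 2), "% DW")
```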

Volatile compound analysis

Extraction of volatile compounds from ground coffee by headspace-SPME

For volatile compounds, 30 g of green coffee bean samples were ground with liquid nitrogen using an IKA M20 laboratory mill (IKA, Staufen, Germany). A CAR/PDMS (Carboxen/polydimethylsiloxane, 75 µm) SPME fibre (Supelco Co., Bellefonte, PA, USA) was used to extract volatile constituents from the coffee headspace, as its affinity for all classes of aroma compounds found in coffee, and notably for trace compounds and low molecular weight compounds, has been previously documented (Roberts et al. 2000; Bicchi et al. 2002; Akiyama et al. 2003). Two grams of ground coffee were placed in a 20 ml hermetically sealed glass flask, corresponding to a headspace of 1/3 of the sampling flask, and brought to room temperature for 40 min prior to headspace SPME sampling. Volatile compounds were then extracted by placing the SPME fibre in contact with the headspace for 30 min at 60 °C under continuous stirring by means of an MPS2 autosampler (Gerstel, Germany). For compound desorption, the fibre was placed in the GC injector and heated to 250 °C for 10 min. All samples were analysed in triplicate.

Combined gas chromatography–mass spectrometry

The coffee SPME extracts were analysed on a GC–MS apparatus, a 6890A GC connected to a 5975B MS (Agilent, Palo Alto, USA) equipped with a 60 m ZB-WAX plus capillary column (film thickness: 0.25 µm; internal diameter: 0.25 mm; Phenomenex, Bologna, Italy). Injection was performed in split mode (split ratio 4:1). The oven temperature, initially set to 50 °C for 3 min, was increased to 200 °C at 4 °C/min, then raised to the final temperature of 240 °C at a rate of 20 °C/min and held for 5 min. The carrier gas (helium) flow rate was 1.7 ml/min. Electron impact ionisation was used with an ionisation energy of 70 eV. The mass range scanned was 29 to 250 amu at a scanning rate of 6.1 scans/s. The transfer line temperature was 250 °C.
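As a quick check on the oven programme, the total run time can be computed from the initial hold, the ramps and the final hold. The sketch below is a generic helper, not part of the authors' workflow; the function name and the tuple layout are hypothetical.

```python
def run_time_min(initial_temp_c, initial_hold_min, ramps):
    """Total duration of a GC oven programme in minutes.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) tuples,
    applied in order starting from initial_temp_c.
    """
    total = initial_hold_min
    temp = initial_temp_c
    for rate, target, hold in ramps:
        total += abs(target - temp) / rate + hold
        temp = target
    return total


if __name__ == "__main__":
    # Programme from the text: 50 C for 3 min, 4 C/min to 200 C,
    # then 20 C/min to 240 C, held for 5 min.
    gcms_ramps = [(4.0, 200.0, 0.0), (20.0, 240.0, 5.0)]
    print(run_time_min(50.0, 3.0, gcms_ramps))  # 3 + 37.5 + 2 + 5 = 47.5
```

Under these settings each GC–MS run lasts 47.5 min, excluding injector desorption and oven cool-down.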

Identification of volatile compounds

The volatile constituents of the headspace were identified by comparing their calculated relative retention indexes with those given in the literature, and their mass spectra with those in the database (NIST11/Wiley10 libraries).
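The text does not specify how the retention indexes were calculated. A common choice for temperature-programmed runs is the linear retention index of van den Dool and Kratz, computed from an n-alkane series analysed under the same conditions. The sketch below assumes that approach; the alkane retention times are hypothetical.

```python
def linear_retention_index(t_x, alkanes):
    """Linear retention index of a compound eluting at time t_x.

    alkanes is a list of (carbon_number, retention_time) pairs sorted by
    retention time. For the two alkanes bracketing t_x (n carbons at t_n,
    N carbons at t_N):
        RI = 100 * (n + (N - n) * (t_x - t_n) / (t_N - t_n))
    """
    for (n, t_n), (big_n, t_big_n) in zip(alkanes, alkanes[1:]):
        if t_n <= t_x <= t_big_n:
            return 100.0 * (n + (big_n - n) * (t_x - t_n) / (t_big_n - t_n))
    raise ValueError("t_x is outside the range covered by the alkane series")


if __name__ == "__main__":
    # Hypothetical retention times (min) for C10, C11 and C12
    series = [(10, 12.0), (11, 15.0), (12, 18.5)]
    print(linear_retention_index(16.75, series))  # halfway between C11 and C12
```

A calculated index is then compared with literature values for the same stationary phase, alongside the mass spectrum match.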

Sensory analysis

The cup quality was evaluated by sensory analysis in a sensory laboratory designed in accordance with ISO 8589 (2007). The sensory analysis, including roasting and beverage preparation, followed the protocol guidelines of the Specialty Coffee Association (SCA) http://www.SCAA.org/PDF/resources/cupping-protocols.pdf (SCAA Protocols Cupping Specialty Coffee 2015).

Coffee samples were roasted one day before the evaluation and ground according to SCA guidelines. For each sample, the beverage was prepared using 12.1 g of coffee in 220 ml of water (92–95 °C) per cup, for a total of 5 cups, which were evaluated by a panel of 4–6 trained judges led by an SCA Q-grader.

The sensory evaluation was performed according to the SCAA protocol, in which the attributes Fragrance/Aroma, Flavor, Aftertaste, Acidity, Body, Balance and Overall were each rated on a scale from 0 to 10, while Defects in cup (Sweetness, Cleanliness, low uniformity) were evaluated for a maximum of 30 points per sample. The sum of all scores gave the final score, on a scale of 0 to 100, by which coffee can be classified as: Outstanding (Specialty, 90–100), Excellent (Specialty, 85–89.99), Very Good (Specialty, 80–84.99) or Below Specialty Quality (Not Specialty, < 80.0). Four genotypes (H1, H16, Starmaya and Caturra) were assessed in four different environments: Las Colinas (710 m), Las Marias (1190 m), La Aurora (1240 m) and Albania (1250 m).
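The scoring and classification rules described above can be summarised in a short sketch. The thresholds and attribute names come from the text; the function names and the example scores are hypothetical, and the 30 cup points are passed as a single number for simplicity.

```python
ATTRIBUTES = (
    "Fragrance/Aroma", "Flavor", "Aftertaste",
    "Acidity", "Body", "Balance", "Overall",
)


def final_score(attribute_scores, cup_points):
    """Sum the seven 0-10 attribute scores and the cup points (0-30)."""
    for name in ATTRIBUTES:
        if not 0.0 <= attribute_scores[name] <= 10.0:
            raise ValueError(f"{name} must be between 0 and 10")
    if not 0.0 <= cup_points <= 30.0:
        raise ValueError("cup points must be between 0 and 30")
    return sum(attribute_scores[name] for name in ATTRIBUTES) + cup_points


def classify(score):
    """Map a final score (0-100) to the quality class given in the text."""
    if score >= 90.0:
        return "Outstanding (Specialty)"
    if score >= 85.0:
        return "Excellent (Specialty)"
    if score >= 80.0:
        return "Very Good (Specialty)"
    return "Below Specialty Quality (Not Specialty)"


if __name__ == "__main__":
    # Hypothetical cup: every attribute rated 7.25, full 30 cup points
    scores = {name: 7.25 for name in ATTRIBUTES}
    total = final_score(scores, 30.0)
    print(total, classify(total))
```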

Statistical analyses

To analyse the genotype × environment interactions, we subjected the yield data to an Additive Main-effects and Multiplicative Interaction (AMMI) model analysis using the R package agricolae. The AMMI model combines ANOVA for the genotype and environment main effects with a principal component analysis (PCA) of the genotype × environment interaction, whose axes are the interaction principal component axes (IPCA) (Purchase et al. 2000). ANOVA was used to test the effects of genotype, environment and genotype × environment interaction on yield (Kumar Bose et al. 2014).

The AMMI Model was the following:

$$Y_{ij} = \mu + g_{i} + e_{j} + \sum\limits_{k = 1}^{n} \lambda_{k} \alpha_{ik} \gamma_{jk} + e_{ij}$$

where \(Y_{ij}\) is the yield of the ith genotype in the jth environment, \(\mu\) is the general mean, \(g_{i}\) is the ith genotype mean deviation, \(e_{j}\) is the jth environment mean deviation, \(\lambda_{k}\) is the square root of the eigenvalue of the PCA axis k, \(\alpha_{ik}\) and \(\gamma_{jk}\) are the principal component scores for PCA axis k of the ith genotype and the jth environment, respectively, and \(e_{ij}\) is the residual (Zobel et al. 1988).
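The additive part of the model can be illustrated on a small genotype × environment table of mean yields. The sketch below computes the general mean, the genotype and environment deviations, and the interaction residuals that the multiplicative (PCA) part of AMMI would then decompose. The singular value decomposition itself is not reproduced here, and the yields are hypothetical.

```python
def additive_decomposition(table):
    """Split a genotype x environment table of means into additive effects.

    table[i][j] is the mean yield of genotype i in environment j.
    Returns (mu, genotype_dev, environment_dev, interaction), where
    interaction[i][j] = table[i][j] - mu - genotype_dev[i] - environment_dev[j].
    """
    n_g, n_e = len(table), len(table[0])
    mu = sum(sum(row) for row in table) / (n_g * n_e)
    genotype_dev = [sum(row) / n_e - mu for row in table]
    environment_dev = [
        sum(table[i][j] for i in range(n_g)) / n_g - mu for j in range(n_e)
    ]
    interaction = [
        [table[i][j] - mu - genotype_dev[i] - environment_dev[j] for j in range(n_e)]
        for i in range(n_g)
    ]
    return mu, genotype_dev, environment_dev, interaction


if __name__ == "__main__":
    # Hypothetical mean yields (kg/plant): 3 genotypes x 2 environments
    yields = [[0.4, 0.8], [0.5, 0.7], [0.3, 0.9]]
    mu, g, e, ge = additive_decomposition(yields)
    print("mean:", round(mu, 3))
    print("interaction:", [[round(v, 3) for v in row] for row in ge])
```

Every row and column of the interaction matrix sums to zero, which is the property the PCA step relies on.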

The AMMI stability value (ASV) described by Purchase et al. (2000) was calculated using the interaction principal component axes (IPCA) scores as follows:

$$ASV = \sqrt{\left[ \frac{SS_{\mathrm{IPCA1}}}{SS_{\mathrm{IPCA2}}} \left( \mathrm{IPCA1\;score} \right) \right]^{2} + \left( \mathrm{IPCA2\;score} \right)^{2}}$$

where \(SS_{\mathrm{IPCA1}}/SS_{\mathrm{IPCA2}}\), the ratio of the IPCA1 sum of squares to the IPCA2 sum of squares, is the weight given to the IPCA1 value, proportional to the larger contribution of IPCA1 scores to the G × E sum of squares compared with the IPCA2 scores. The higher the IPCA score, either negative or positive, the more specifically adapted a genotype is to certain environments. Lower ASV scores indicate a more stable genotype across environments (Kumar Bose et al. 2014).

ASV represents the distance from zero in a two dimensional scattergram of IPCA1 scores against IPCA2 scores. Since the IPCA1 score contributes more to G × E sum of squares, it has to be weighted by the proportional difference between IPCA1 and IPCA2 scores in order to compensate for the relative contribution of IPCA1 and IPCA2 scores to total G × E sum of squares. The distance from zero is determined by using the theorem of Pythagoras (Purchase et al. 2000).

Based on the rank (R) of the mean bean yield of genotypes (\(Y_{i}\)) across environments, denoted \(RY_{i}\), and the rank of the AMMI stability value (\(RASV_{i}\)), a selection index called the Genotype Selection Index (GSI) was calculated for each genotype. GSI incorporates both mean yield and stability index in a single criterion (\(GSI_{i}\)) as:

$$GSI_{i} = RASV_{i} + RY_{i}$$

Low values of both parameters show desirable genotypes with high mean yield and high stability.
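The ASV and GSI calculations can be sketched as follows, taking the IPCA scores and sums of squares as given inputs (in the study they come from the AMMI analysis). All numbers are hypothetical, ties in ranks are not handled, and rank 1 is assigned to the highest yield and to the lowest ASV so that a low GSI is desirable.

```python
import math


def asv(ipca1_score, ipca2_score, ss_ipca1, ss_ipca2):
    """AMMI stability value: weighted distance from the origin of the IPCA plot."""
    weight = ss_ipca1 / ss_ipca2
    return math.sqrt((weight * ipca1_score) ** 2 + ipca2_score ** 2)


def ranks(values, descending=False):
    """Ordinal ranks starting at 1 (ties are broken by position)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def gsi(mean_yields, asv_values):
    """Genotype Selection Index: rank of ASV plus rank of mean yield."""
    yield_rank = ranks(mean_yields, descending=True)  # 1 = highest yield
    asv_rank = ranks(asv_values)                      # 1 = most stable
    return [a + y for a, y in zip(asv_rank, yield_rank)]


if __name__ == "__main__":
    # Hypothetical genotypes A, B, C
    mean_yields = [0.9, 0.7, 0.5]                       # kg/plant
    ipca_scores = [(0.4, 0.1), (0.1, -0.2), (-0.3, 0.3)]
    ss1, ss2 = 2.0, 1.0                                 # IPCA sums of squares
    asv_values = [asv(s1, s2, ss1, ss2) for s1, s2 in ipca_scores]
    print([round(v, 3) for v in asv_values])
    print(gsi(mean_yields, asv_values))
```

In this made-up example, genotype B has the lowest GSI because it is the most stable and ranks second for yield, whereas the highest-yielding genotype A is penalised by its high ASV.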

Statistical analyses were performed in R Version 3.5.1 (R Core Team 2018) and Excel.

Canopy volume and the quality variables were analysed using ANOVA and, when a significant treatment effect was observed, means were compared using the Tukey HSD test at the 5% probability level. Principal component analysis (PCA) calculated on aroma precursor variables or volatile compounds was used to describe the environments and genotypes. The final scores of the sensory analysis were used as an additional variable.

Results

Yield, rust incidence and canopy volume

The average yields (three years) were compared between eleven genotypes, including two pure lines and nine hybrids. We observed lower yields for both pure line cultivars, Caturra and Marsellesa, than for the hybrids. Considering the average yield of each variety over the three harvest years and the seven sites, Mundo Maya-H16 showed the highest value (Fig. 1). Centroamericano-H1, followed by Evaluna-H18, Nayarita-H19 and Starmaya, also showed high yields on average. With a yield 1.5 times lower than that of Mundo Maya-H16, the pure lines Caturra and Marsellesa were the least productive. According to the Tukey HSD test, the yield of the pure lines was significantly different from that of the Mundo Maya-H16 hybrid. All other hybrid cultivars presented intermediate yield values that did not differ significantly from either group.

Fig. 1

Performance of Arabica F1 hybrids over traditional cultivars for yield, expressed as kg of green coffee beans per plant. Means were calculated over three years across seven locations. HSD Test for means: varieties with the same letter are not significantly different at p < 0.05

When analysing performance by environment, large differences were found (Fig. 2). The average values ranged from 0.231 to 0.911 kg/plant, corresponding to a yield of less than 1000 kg/ha and around 3600 kg/ha respectively. Although the cultivation sites are distinguished by their altitude, yield was not related to altitude: both the highest and the lowest values were obtained at low altitudes.

Fig. 2

Environment influence on the yield (expressed as kg of green coffee beans per plant). Means were calculated for the seven locations across Nicaragua, over three years. HSD Test for means: varieties with the same letter are not significantly different at p < 0.05

Using the AMMI model analysis, the ASV calculation showed that Nayarita-H19, Starmaya, Caturra, Evaluna-H18 and Centroamericano-H1 were the most stable cultivars. Furthermore, the Genotype Selection Index (GSI), which incorporates both stability and yield, identified Nayarita-H19, Starmaya, Evaluna-H18, Centroamericano-H1 and then Mundo Maya-H16 as the best genotypes (Table 3). Although H16 has a high stability index (ASV), its high yield allows it to have an acceptable GSI.

Table 3 Superiority of Arabica F1 hybrids over traditional cultivars for yield stability

The relation between yield and rust incidence has been widely demonstrated (Avelino et al. 2006; Echeverria-Beirute et al. 2018). As GSI index represents both productivity and stability, we have decided to compare the GSI and the rust incidence ranking of each variety. We observed three groups. Among the five genotypes that had the lowest GSI, two (Centroamericano-H1 and Mundo Maya-H16) presented no rust lesion and the other three (Nayarita-H19, Starmaya, and Evaluna-H18) had lesions with low sporulation (Fig. 3). Even if a high GSI value could be associated with a high rust reaction for Caturra, H3 and Pakal-H17, it is not the same for Marsellesa, Mundo Mex-H15 and Totonaca-H14, which showed high GSI values associated with low rust reaction. So a high GSI was not necessarily explained by a high incidence of rust. Rust-resistant cultivars appeared to be as unstable in terms of yield as sensitive cultivars.

Fig. 3

Relationship between Genotype Selection Index (GSI) and susceptibility to rust disease, for the 11 Coffea arabica genotypes

The volume of canopy had an impact on yield. There was a linear relationship between volume and tree yield (Fig. 4) but with interesting variations that could allow the selection of high-yielding clones with a modest increase in volume.

Fig. 4

Relationship between yield, expressed as kg of green coffee beans per plant and tree volume, expressed in m3, for the 11 Coffea arabica genotypes

The lowest canopy volumes were obtained for the two pure lines Caturra and Marsellesa (0.54 and 0.56 m3 respectively), which also had the lowest yields. The F1 hybrids Evaluna-H18, Nayarita-H19 and Mundo Mex-H15 had the highest volumes (1.02, 0.98 and 0.98 m3 respectively), about 1.8 times higher than the pure lines. Centroamericano-H1 and Mundo Maya-H16, with a volume only 20% higher than that of Marsellesa or Caturra, produced 50% more.

Bean quality evaluation

We found significant differences between genotypes for the W100 and bean size (Fig. 5 and Supplemental Tables 1 to 4). F1 hybrids had more advantageous characteristics than Marsellesa and Caturra, since Marsellesa had the lowest W100 and Caturra had the lowest percentage of beans size 16 to 20. Starmaya and Centroamericano-H1 presented the best results for these two variables. All the genotypes studied had a low percentage of defective green beans, ranging from 2.6% (Mundo Maya-H16) to 3.8% (Starmaya) with no significant differences between genotypes or environments.

Fig. 5

Relationship between the weight of 100 healthy green beans (g) and the size of green coffee exportable (percentage of bean size 16 to 20), for the 11 Coffea arabica genotypes across seven environments

According to the results of the SCA on roasted beans, the variety has a significant influence (p < 0.01) on the final score (Supplemental Table 5) with significantly higher average scores for F1 hybrids Starmaya and Centroamericano-H1 (respectively 80.5 and 80.4) than for Caturra (final score 76.3). The F1 hybrid Mundo Maya-H16 is intermediate and not significantly different with a final score of 79.5.

For the other organoleptic variables there was no significant effect of genotypes, except for defect in cup (p < 0.01) where Caturra presented the worst scores. Moreover, the fragrance was the only quality attribute significantly affected by environment (p < 0.05). The environment that gave the best fragrance was Albania (1250 m) and the environment that gave the least remarkable fragrance was Las Marias (Supplemental Table 6).

Whatever led to larger beans also led to higher scores: we found a significant relationship between the W100 and the final sensory score (Fig. 6). According to the SCA protocol, coffees that score more than 80 are considered “Specialty coffees”. Caturra more often had final scores of less than 80 and a lower W100 than the F1 hybrids Starmaya and Centroamericano-H1. It must be noted that at the highest altitude, Albania (1250 m), the three F1 hybrids (Centroamericano-H1, Mundo Maya-H16 and Starmaya) obtained the highest scores.

Fig. 6

Relationship between the weight of 100 healthy green beans (g) and the final sensory scores

Biochemical bean characteristics

Aroma precursors

We analysed 24 molecules in green coffee that are known to be aroma precursors once the coffee is roasted. For each of them we estimated the contribution of the environment and the genotype. Of these 24 molecules, eight were not significantly influenced by genotype (3-CQA, C14:0, C20:0, C22:0, C18:2, C18:3, C20:1 and C20:2) and five were not significantly influenced by environment (FQA, C14:0, C18:2, C18:3 and C20:2).

Seventeen aroma precursors were significantly influenced by the environment and/or by the genotype (Supplemental Tables 7 to 9). The variables that showed significant variation due to the environment and/or the genotype were considered in a multivariate analysis (PCA). PC1 and PC2 accounted for 56.76% of the variance. PC1 is explained by sucrose, chlorogenic acid and saturated fatty acids (Fig. 7a) and PC2 is explained by diCQA, alkaloids (especially trigonelline) and unsaturated fatty acids. When the samples are plotted by genotype and environment, they group mainly by environment, with the strongest separation for the environment at the lowest altitude (710 m) (Fig. 7b). This environment is characterized by more sucrose, more of the chlorogenic acids 3-CQA and 4-CQA, and more saturated fatty acids (C18:0, C20:0 and C24:0).

Fig. 7

PCA for samples of green coffee beans characterised by genotype, environment and quality, and their aggregation based on precursors of aroma. a Variables significantly influenced at p < 0.05 by environment. b Aggregation of individuals (interactions of variety, environment and sensory note) based on aroma precursors

At the highest altitudes, La Aurora (1240 m) and Albania (1250 m), caffeine and diCQA (3,5- and 4,5-diCQA) were the predominant aroma precursors. Las Marias (1190 m) was characterized by coffees with higher levels of chlorogenic acid (5-CQA), diterpenes (cafestol and kahweol) and saturated and unsaturated fatty acids (C16:0 and C18:1).

Although the environments were clearly distinguishable and characterized by certain aroma precursors, it is more difficult to characterize the genotypes by specific compositions. The three F1 hybrids had similar trends within each altitude group. In most of the samples, Centroamericano-H1 appears richer in C16:0, C18:1 and 5-CQA while Caturra showed higher contents of diCQA and alkaloids (caffeine and trigonelline). The other two genotypes had intermediate behaviours. Their samples were sometimes closer to Centroamericano-H1 pattern and sometimes closer to that of Caturra. According to this analysis, the final score did not seem related to any aroma precursor composition.

Volatile compounds

All 31 volatile compounds identified were significantly influenced (p < 0.05) by the environment and/or the genotype (Supplemental Tables 10 to 12). Therefore, all volatile compounds were included in the principal component analysis (Fig. 8a). PC1 and PC2 accounted for 68.3% of the variance. PC1 was explained mainly by alcohols (e.g. hexanol) and aldehydes (e.g. hexanal), while PC2 was explained mainly by dimethyl sulfide and ketones (2-butanone) at one end and by lactones (butyrolactone or d.valerolactone) at the other. The environment at the lowest altitude, Las Colinas (710 m), was characterized by higher levels of dimethyl sulfide and ketones. In contrast, samples from Las Marias (1190 m) tended to have more lactones. Samples with more lactones tended to be less appreciated by the judges, as did those with more dimethyl sulfide or ketones. The highest-rated samples (Fig. 8b) did not appear to be linked to particular volatile compounds. As for genotypes, no particular pattern could be related to the composition of the volatile compounds measured in this study.

Fig. 8

PCA for samples of green coffee beans characterised by genotype, environment and quality, and their aggregation based on volatile compounds. a Variables significantly influenced at p < 0.05 by environment. b Aggregation of individuals (interactions of variety, environment and sensory note) based on volatile compounds

Discussion

For the main stakeholders of the coffee chain (farmers, traders and roasters), decisions on adopting new cultivars must be based on scientific knowledge of the plant material, and especially of G × E interactions, which is key knowledge for varietal improvement (Montagnon et al. 2000; Cilas and Montagnon 2008; Oliveira et al. 2014). The lack of scientific studies that take productivity, vigour and quality into account is probably one of the reasons why the adoption of new Arabica cultivars is very slow among coffee farmers (Ahmadi et al. 2013). Research in coffee breeding programmes focusing on adaptation and yield stability has been very limited, and even more so for Arabica F1 hybrids that have Ethiopian/Sudanese varieties and conventional or introgressed lines derived from the Timor Hybrid as parents. For C. arabica, such work has essentially been done in Ethiopia on Ethiopian genotypes (Argaw and Taye 2018; Beksisa et al. 2018). In our study, Arabica F1 hybrids differed from traditional cultivars, with a clear advantage for the hybrids. F1 hybrids produced more and had better sensory quality, and some of them stood out among the hybrids. Variability within the hybrids shows that some are more stable across environments, for both yield and quality. F1 hybrids, and some of them in particular, therefore fulfil their goals.

According to the results of the model-derived ANOVA, both the genotype and the environment had a significant influence on yield, and the influence of the environment was stronger than that of the genotype. Indeed, all varieties showed large yield fluctuations across environments, whereas the differences between varieties within each environment were smaller. Thus, the large differences among means, which caused most of the variation in bean yield, were mainly due to environments. Taking the specificity of each site into account is therefore essential, even if a variety adapts easily to a range of environments. Similarly, Beksisa et al. (2018) found that the genotype has a smaller influence than the environment in which it grows. F1 hybrid cultivars produced more than the two pure line cultivars studied here. The two most productive clones were Centroamericano-H1 and especially Mundo Maya-H16, which produced over 56% more than Marsellesa, a recent rust-resistant pure line cultivar grown in Central America. Mundo Maya-H16 produced more whatever the environment. The genotypes with the best yield were also the most stable across the different environments. The five most productive and stable genotypes were Mundo Maya-H16, Centroamericano-H1, Starmaya, Evaluna-H18 and Nayarita-H19. These genotypes, and especially H1 and H16, present the best compromise between yield and stability across environments. Among these high-performing hybrid genotypes (for yield and stability), Centroamericano-H1 and Mundo Maya-H16 are resistant to rust, while Evaluna-H18 and Nayarita-H19 are weakly susceptible. Many studies have shown that rust incidence is linked to productivity (Avelino et al. 2006; Toniutti et al. 2017, 2019). In our study, the GSI index (which takes into account both productivity and productivity stability) was not related to the level of rust incidence.
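
How an index of this kind combines productivity and stability can be sketched as follows. This is a hedged illustration only: it assumes a rank-sum form (rank of mean yield plus rank of a stability statistic, lower being better), which is not spelled out in this section, and the genotype labels, yields and instability scores are hypothetical values, not data from the study.

```python
def rank_ascending(values):
    """Return ranks starting at 1 for the smallest value (ties not handled)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def selection_index(mean_yield, instability):
    """Assumed rank-sum index: yield rank (1 = highest yield)
    plus stability rank (1 = lowest instability score)."""
    yield_rank = rank_ascending([-y for y in mean_yield])
    stability_rank = rank_ascending(instability)
    return [a + b for a, b in zip(yield_rank, stability_rank)]

# Hypothetical genotypes and values, for illustration only
genotypes = ["A", "B", "C", "D"]
mean_yield = [3.1, 2.4, 2.9, 1.8]    # higher is better
instability = [0.3, 0.9, 0.4, 1.2]   # lower means more stable

gsi = selection_index(mean_yield, instability)
best = genotypes[gsi.index(min(gsi))]
for g, value in zip(genotypes, gsi):
    print(g, value)
print("lowest index (most productive and stable):", best)
```

Under this assumed form, a low index identifies genotypes that are both productive and stable, and a high index identifies genotypes that are neither.
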
A high or low GSI does not predict susceptibility to rust: whether susceptible to rust or not, a genotype can have a high GSI, that is, low productivity and low productivity stability. Nevertheless, the study does identify genotypes with a low GSI (high productivity and acceptable stability) that are not susceptible to rust (e.g. Centroamericano-H1 and Mundo Maya-H16). Conversely, some cultivars are highly resistant to rust but have a high GSI (poor productivity and stability); H14 is one example. All the cultivars studied here are compact plants because they inherited the ‘Caturra’ dwarfism gene. However, because of hybrid vigour, some of them differ in vegetative volume. It is important to relate yield to canopy volume because this volume determines the number of trees per hectare (density), which is an important component of yield. Centroamericano-H1 and Mundo Maya-H16 have an intermediate vegetative volume, higher than that of the traditional cultivars Caturra and Marsellesa but smaller than that of the other hybrids. A higher volume is not necessarily a drawback, especially in more extensive farming systems geared towards diversified smallholder agriculture. In this respect, Evaluna-H18, Nayarita-H19 and Starmaya are interesting alternatives. They produced about 30–35% more than Marsellesa with a canopy volume about 75% higher and good to very good stability. However, probably the most interesting variety in terms of yield per unit volume for low-intensity cropping systems is the F1 hybrid Starmaya. This variety produces 32% more than Marsellesa with a 43% higher canopy volume and good stability. It is also the only F1 hybrid that is propagated by seed, being derived from a male-sterile parent (Georget et al. 2019). The other F1 hybrids are clones that must be disseminated through somatic embryogenesis, which involves more technical constraints and investment (Etienne et al. 2018).
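
The relation between yield gain and canopy volume quoted above can be made explicit with a short calculation. The sketch below uses only the percentages given in the text, with Marsellesa as the reference (1.0); the function name is chosen here for illustration, and treating yield per unit canopy volume as a simple ratio is a simplifying assumption.

```python
def relative_yield_per_volume(yield_gain_pct, volume_gain_pct):
    """Yield per unit canopy volume relative to the reference cultivar (= 1.0)."""
    return (1 + yield_gain_pct / 100) / (1 + volume_gain_pct / 100)

# Starmaya: 32% more yield with a 43% higher canopy volume
starmaya = relative_yield_per_volume(32, 43)

# Group figures quoted for the larger hybrids: 30-35% more yield, about 75% more volume
group_low = relative_yield_per_volume(30, 75)
group_high = relative_yield_per_volume(35, 75)

print(f"Starmaya: {starmaya:.2f}")                                    # 0.92
print(f"30-35% gain at +75% volume: {group_low:.2f} to {group_high:.2f}")  # 0.74 to 0.77
```

On these figures, a 32% yield gain for a 43% larger canopy corresponds to about 0.92 of the reference yield per unit volume, against about 0.74 to 0.77 for a 30–35% gain with a 75% larger canopy.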

High yield is sometimes incompatible with high quality. Fewer fruits on the tree improve the quality of the beans and therefore the final quality of the coffee, while a more productive tree can have lower bean quality (DaMatta et al. 2012). A good balance between yield and quality is therefore difficult to achieve, and concessions on one or the other are often necessary. In addition, the sensory quality of Arabica coffee is affected by environmental conditions. In a previous study, we concluded that climate change, which generally involves a substantial increase in average temperatures in mountainous tropical regions, could be expected to have a negative impact on coffee quality (Bertrand et al. 2012b). However, few scientific studies have compared Arabica genotypes across several locations on the basis of their sensory attributes (dos Santos Scholz et al. 2013).

Criteria based on the physical characteristics of the green beans are still of prime importance when green coffee is purchased. Only larger beans are marketable as “Specialty Coffee”. A bigger bean is assumed a priori to contain more aroma, although this is controversial (Gonzalez-Rios et al. 2007; Kathurima et al. 2009). Nevertheless, smaller beans of the same variety are given lower grades and lower prices (DaMatta et al. 2012). On the international market, quality is determined essentially before roasting, by the size of the green beans and the number of defects. The W100, however, is widely used for other crops. For coffee, this measurement has been used in some studies, where it appears to be representative of bean density and therefore of the quality of seed filling during fruit development and ripening (Bertrand et al. 2005; Tran et al. 2017). Here, we studied bean size and the weight of 100 healthy green beans. Centroamericano-H1 and Starmaya presented the best results for mean W100 and percentage of green beans of size up to 16. Genotypes with a higher W100 were also those with the best sensory quality, and a lower W100 and lower sensory scores distinguished Caturra from the hybrids. W100 therefore appears to be a good predictor of cup quality, and it is simpler to measure than green bean size.

The biochemical composition of the green beans influences the sensory quality of the final coffee. Concerning the composition of aroma precursors, we found significant differences between genotypes for some metabolites, such as 5-CQA, 4-CQA and di-CQA, and for fatty acid and diterpene composition. However, when all aroma precursors are considered together, it is difficult to distinguish between genotypes. It seems that Centroamericano-H1 and Starmaya differ from Caturra in having more C18:1 and 5-CQA, while Caturra and Starmaya differ from Centroamericano-H1 and Mundo Maya-H16 in having higher contents of di-CQA and trigonelline. In any case, it was not possible here to find a link between aroma precursors and final score. Farah et al. (2006) associated higher levels of chlorogenic acids, mainly 5-CQA, with poor quality. In this study, Centroamericano-H1 and Starmaya contained more 5-CQA but were nevertheless associated with better quality. Khapre et al. (2017) linked caffeine with poorer quality, but they also concluded that it is difficult to establish a ‘stable’ link between genotypes and sensory and biochemical characteristics.

Green beans contain approximately 300 volatile compounds, whereas roasted beans contain more than 1000 (Holscher and Steinhart 1995; Tran et al. 2016). Few studies have addressed volatile compounds in green coffee beans. Some volatile compounds, in green as well as roasted beans, have been identified as possible markers of high quality (Toledo et al. 2016; Casas et al. 2017), whereas others are related to defects (Toci and Farah 2014; Frato 2019).

In this study, 31 volatile compounds were detected in green beans. All the VOCs varied significantly depending on the environment. The variance between genotypes was lower than the variance between environments except for d-limonene, 2-Methyl-2-buten-1-ol, 2-isobutyl-3-methoxypyrazine (IBMP), 3-Methylbutanoic acid and 3-Methylfuran. d-limonene (clean smell, characteristic of citrus fruits) and 2-Methyl-2-buten-1-ol (fruity, green lavender) are associated with positive sensory notes (Del Terra et al. 2013), whereas IBMP is associated with the ‘potato taste defect’ (Frato 2019). Starmaya and Centroamericano-H1 contained more d-limonene and differed significantly from Caturra and Mundo Maya-H16. In contrast, Caturra contained about twice as much IBMP as the F1 hybrids Centroamericano-H1, Mundo Maya-H16 and Starmaya, and differed significantly from these hybrids. Iwasa et al. (2015) identified two markers in green coffee beans as possible indicators of higher coffee beverage quality: isomers of 3-methylbutanoyl glycosides. These would be precursors of 3-methylbutanoic acid, which is linked to higher quality according to Iwasa et al. (2015). In this study, Starmaya contained significantly more 3-methylbutanoic acid than Caturra, with H1 and H16 in an intermediate group. It would be interesting to analyse these compounds in a larger set of genotypes. In any case, it was not possible to distinguish varieties by their overall composition in volatile compounds, and no clear pattern appeared that would link the sensory quality of genotypes to profiles of particular volatile compounds in the green coffee.

Traditional coffee cultivars are no match for the environmental threats of the 21st century: changing weather patterns, increased temperatures and new disease prevalence. This creates conditions for a potentially disastrous decline in supply in the coming decades. For a new genotype to spread to farms, it must fulfil several conditions in order to meet the expectations of the various stakeholders in the sector. The genotype must be productive without neglecting quality parameters, both the characteristics of the green beans (physical and biochemical) and the final sensory perception of the coffee drink. Good agronomic performance and quality are necessary but not sufficient. For a genotype to be disseminated on a large scale, under different environmental conditions, it must also be stable across environments, and this is what breeders are looking for.

Coffee breeding has to address many parameters in order to meet global challenges and the segmentation of the consumer market. According to this study, two genotypes, the F1 hybrids Centroamericano-H1 and Mundo Maya-H16, are promising for fulfilling these different goals. They are highly productive without neglecting quality parameters: they have high yield potential, their yield is stable across environments and they are not susceptible to rust. Centroamericano-H1 in particular seems to have better sensory characteristics. The F1 hybrid Starmaya also deserves consideration because, in addition to offering interesting performance, it has the advantage of being reproducible by seed. This variety will therefore probably be available at lower prices, and so to more farmers (Georget et al. 2019). Finally, the F1 hybrids Mundo Mex-H15, Evaluna-H18 and Nayarita-H19 are serious candidates for low-intensity smallholder farming systems, but their sensory characteristics need to be explored further.

While F1 hybrid cultivars are still relatively new to coffee farmers and the industry, they appear promising for the future. An F1 hybrid combines the genetics of both parents, and this broader genetic potential makes it more likely to adapt across a wide range of environments.