Introduction

Since the discovery of the high-risk breast cancer predisposition genes BRCA1 and BRCA2, extensive efforts have been made to identify additional breast cancer predisposition genes. Many candidate genes have been proposed, but replication studies have been confirmatory for only a few of them [1]. Recently, two large case-control studies were conducted in which several established and candidate breast cancer predisposition genes were tested. The BRIDGES study from the Breast Cancer Association Consortium (BCAC) tested 34 genes in 60,466 women with breast cancer and 53,461 controls [2]. In the second study, 28 genes were tested among 32,247 women with breast cancer and 32,544 unaffected women from US population-based studies in the CARRIERS consortium [3]. Results from both studies were concordant in confirming that germline protein truncating variants (PTVs) in BRCA1, BRCA2 and PALB2 are associated with a high risk of breast cancer, that PTVs in CHEK2 and ATM confer moderate risk, especially for the ER-positive disease subtype, and that PTVs in RAD51C, RAD51D and BARD1 are moderate-risk variants for ER-negative breast cancer. No evidence of association was detected for PTVs in the great majority of the other tested candidate genes, but for one, namely FANCM, some evidence of association with ER-negative breast cancer was observed [2].

The association between a FANCM PTV and breast cancer risk was initially investigated in 2013 [4]. Since then, many case-control studies have been conducted, most based on testing the three most common PTVs. Specifically, p.Gln1701* (c.5101C>T) and p.Gly1906Alafs*12 (c.5791C>T), which are expected to cause the loss of the FAAP24 binding domain in the FANCM protein C-terminus, were reported by a study of Finnish women as moderate-risk variants for ER-negative and triple-negative breast cancer (TNBC) [5, 6]. In a large study of Caucasian women, we observed that p.Arg658* (c.1972C>T), the third most common PTV, located in the protein N-terminus, was associated with moderate risk for the ER-negative and TNBC subtypes, but the evidence of association for p.Gly1906Alafs*12 was inconclusive, and no evidence was observed for p.Gln1701* [7]. Overall, these and other studies [8, 9], reviewed in Peterlongo et al. (2021), indicate that FANCM PTVs are potential risk variants for the ER-negative and TNBC subtypes; more precisely, they suggest that each PTV confers an increased risk whose magnitude may vary depending on its position in the gene or on the population genetic background [10].

While PTVs in breast cancer predisposition genes are usually considered bona fide pathogenic, missense variants (MVs) are often referred to as “variants of uncertain significance”. Their effect on protein function and cancer risk is generally unknown and difficult to estimate. Several in silico tools have been developed to predict the pathogenicity of MVs; together with additional evidence, such as frequency data, segregation analyses and functional assays, they allow some MVs to be classified. However, MVs are often so infrequent that they have to be combined overall, or in subgroups based on their location in the gene domains or on their pathogenicity prediction score, in order to generate evidence of pathogenicity. MVs in several established and candidate breast cancer predisposition genes have been tested for association with breast cancer risk in many studies. To date, the potential association between FANCM MVs and breast cancer risk has been investigated by three studies in which all the rare variants were combined in burden analyses. Two studies were conducted using familial breast cancer cases with no BRCA1 or BRCA2 pathogenic variants and controls from the general population. The first, based on the analysis of 1207 cases and 1199 controls from France, did not find clear evidence of association with FANCM MVs (OR = 1.6; 95% CI 0.9–2.8) [11]. The second study, including 5770 cases and 5741 population-matched controls predominantly of European ancestry, reported a statistically significant association with an OR of 1.50 (95% CI 1.16–1.93) [12]. In the third analysis, which was part of the BRIDGES study, rare FANCM MVs (allele frequency <0.1%) were tested in population- and family-based studies combined and separately. An association with breast cancer risk was found when comparing cases selected for family history of breast cancer and controls, with an OR estimate of 1.22 (95% CI 1.05–1.42) [2].

In the present study, we further analysed the BRIDGES data derived from FANCM sequencing in women of European ancestry from population- and family-based studies. Specifically, we assessed 673 rare MVs with allele frequency <0.1%, which were combined in burden analyses, and we also assessed individually 16 common MVs with allele frequency ≥0.1%. The burden analyses were based on the MVs’ gene domain location and on their pathogenicity prediction score. Analyses were conducted to assess associations with overall breast cancer and with the ER-negative and TNBC disease subtypes.

Materials and methods

Study sample

In this work we included women affected with breast cancer (cases) and unaffected women (controls) from 40 studies participating in the BRIDGES project (Supplementary Table S1), as previously described [2]. All 40 studies were approved by the relevant ethical review board and used appropriate consent procedures. Twenty-eight studies included cases unselected for breast cancer family history and are defined as “population-based studies”. The remaining 12 studies included cases selected because they had a family history of breast cancer, and are defined as “family-based studies”. All women included in this study were of European ancestry and older than 18 years at breast cancer diagnosis (cases) or interview (controls). We excluded women who, having a family history of breast cancer, were eligible for BRCA1 and BRCA2 testing and at the time of study enrollment were known to carry a pathogenic variant in these genes. We also excluded all carriers of FANCM PTVs and all women with one or more unknown FANCM MV genotypes. Thus, a total of 39,885 breast cancer cases (of which 91.6% were invasive cases, 6.2% in situ cases, and 2.2% cases of unknown invasiveness) and 35,271 controls were included in this study. Of the cases, 32,083 (80.4%) were from population-based studies and 7566 (19.0%) were from family-based studies; for the remaining 236 cases (0.6%) this information was not available. Of all cases, 5880 had ER-negative breast cancer and 2176 had TNBC.

Sequencing, variant calling and classification

The FANCM gene was included in a panel of 34 established and putative breast cancer predisposition genes that were sequenced in the context of the BRIDGES project [2]. Details of library preparation, next generation sequencing, variant calling, quality control procedures, and variant classification were described previously [2]. The FANCM MVs included in the present analyses were defined as common if their allelic frequency in controls was ≥0.1% and as rare if their allelic frequency in controls was <0.1%. The exact positions of the FANCM functional and binding domains were derived from the UniProt database and published literature [13,14,15] (Fig. 1). Pathogenicity scores were assigned to each MV using the in silico prediction tools BayesDel [16], Combined Annotation Dependent Depletion (CADD) [17], Helix [18] and Rare Exome Variant Ensemble Learner (REVEL) [19]. The following cut-offs were used to classify MVs as pathogenic: BayesDel score with MaxAF >0.069, CADD phred-scaled score ≥30, Helix score >0.50 and REVEL score >0.50.
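As an illustration of how the cut-offs above translate into a per-variant classification, the following Python sketch counts how many of the four tools call a variant pathogenic and maps that count to the colour coding used in Fig. 1. The dictionary keys, the function names and the example scores are hypothetical and were chosen for this sketch only; treating a missing score as "not pathogenic" is likewise an assumption, not a rule stated in the text.

```python
import operator

# Cut-offs as stated in the text. The dictionary keys are hypothetical
# field names chosen for this sketch, not names used by the tools themselves.
CUTOFFS = {
    "bayesdel": (operator.gt, 0.069),   # BayesDel (with MaxAF) > 0.069
    "cadd_phred": (operator.ge, 30.0),  # CADD phred-scaled score >= 30
    "helix": (operator.gt, 0.50),       # Helix > 0.50
    "revel": (operator.gt, 0.50),       # REVEL > 0.50
}


def count_pathogenic_calls(scores):
    """Count the tools whose score passes its cut-off.

    Assumption: a missing score (absent key or None) counts as
    "not pathogenic" for that tool.
    """
    calls = 0
    for tool, (passes, cutoff) in CUTOFFS.items():
        value = scores.get(tool)
        if value is not None and passes(value, cutoff):
            calls += 1
    return calls


def fig1_colour(calls):
    """Map the number of pathogenic calls to the colour coding of Fig. 1."""
    if calls == 0:
        return "grey"   # predicted benign by all tools
    if calls == 1:
        return "black"  # predicted pathogenic by one tool
    if calls == 2:
        return "blue"   # predicted pathogenic by two tools
    return "red"        # predicted pathogenic by three or four tools


# Hypothetical scores for a single variant (not taken from the study data).
example = {"bayesdel": 0.10, "cadd_phred": 30.0, "helix": 0.50, "revel": 0.60}
n_calls = count_pathogenic_calls(example)
print(n_calls, fig1_colour(n_calls))
```

Note that the CADD cut-off is inclusive (≥30) whereas the other three are strict, so a Helix score of exactly 0.50 does not count as a pathogenic call in this sketch.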

Fig. 1: Representation of the 673 FANCM rare missense variants (MVs) with respect to the 2048 amino acid long FANCM protein.

Functional and binding domains (MPH1, ATP-dependent DNA helicase; MHF, domain of interaction with the Histone Fold 1 and 2 (MHF1/2); MM1, motif of interaction with FANCF within the Fanconi Anemia core complex; MM2, motif of interaction with RecQ-Mediated genome Instability protein 1 (RMI1); MM3, highly conserved motif of still unknown function; FAAP24, domain of interaction with the Fanconi Anemia core complex-Associated Protein 24) are shown in dark grey and their boundaries are indicated. The MVs are shown according to their position, the number of carriers in cases and controls, and their in silico pathogenicity scores according to the BayesDel, CADD, Helix and REVEL tools: in grey, MVs predicted benign by all the tools; in black, MVs predicted pathogenic by one tool; in blue, MVs predicted pathogenic by two tools; in red, MVs predicted pathogenic by three or four tools.

Statistical analyses

To test the association between FANCM MVs and breast cancer risk, we performed logistic regression analyses adjusting for country. Common MVs were tested individually by deriving allelic odds ratios (ORs) with their corresponding 95% confidence intervals (CIs) and P values (P). Multiple testing correction was applied using the Benjamini and Hochberg procedure [20]. Rare MVs were tested by burden analyses, deriving ORs (with 95% CIs) comparing variant carriers with non-carriers. In this case, heterozygous and homozygous carriers were not distinguished, as the number of homozygous carriers was too small to be analysed separately. We first combined all rare variants together, then grouped them based on their location within functional or binding domains and by pathogenicity score. Statistical analyses for both common and rare MVs were conducted using the full sample, separately for population- and family-based studies, and separately for the ER-negative and TNBC case subgroups (each compared to controls). Finally, we performed a fixed-effect meta-analysis combining the OR that we derived in the analysis of family-based studies with the ORs derived by the two previously published studies conducted using familial cases [11, 12]. All statistical analyses were performed using STATA version 15.1 (StataCorp LLC, College Station, Texas, USA). All tests were two-sided and P values < 0.05 were considered statistically significant.
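To make the carrier versus non-carrier comparison concrete, the sketch below computes an unadjusted OR with a Wald 95% CI from a 2×2 table. It is a simplification: the analyses described above used logistic regression adjusted for country in STATA, which this crude calculation does not reproduce. The function name and the counts are hypothetical and serve only as an illustration.

```python
import math


def carrier_odds_ratio(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z_crit=1.959964):
    """Unadjusted OR (carriers vs non-carriers) with a Wald 95% CI.

    Heterozygous and homozygous carriers are pooled into one "carrier"
    group, as in a burden analysis. All four counts must be > 0.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts (Woolf's formula).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study data.
or_est, ci_low, ci_high = carrier_odds_ratio(60, 940, 40, 960)
print(f"OR = {or_est:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

A covariate-adjusted estimate would instead come from a logistic regression model with carrier status and country as predictors, the OR being the exponentiated coefficient of carrier status.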

Results

A total of 689 unique FANCM MVs, of which 16 were common and 673 were rare, were detected in at least one woman from our study sample (Supplementary Table S2). All 16 common MVs were tested individually for association with breast cancer risk (Supplementary Table S3). Of these 16 MVs, seven showed a possible association (P < 0.05) with increased breast cancer risk or with a protective effect in some of the case groups tested, but none remained statistically significant after correction for multiple testing (Supplementary Table S3).
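The reason a nominal P < 0.05 may not survive correction can be seen from the Benjamini and Hochberg step-up procedure itself. The following sketch is a plain-Python implementation of the standard adjusted-P-value form of that procedure; the function name and the example P values are hypothetical and are not taken from Supplementary Table S3.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    For the P value of rank k (1 = smallest) among m tests, the adjusted
    value is the minimum of p_(j) * m / j over all ranks j >= k, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

Each raw P value is scaled by the number of tests divided by its rank, so with 16 tests a variant needs a raw P value well below 0.05 unless several other variants also have small P values.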

Fig. 1 shows the 673 rare MVs according to their location in the gene, their pathogenicity scores from the four in silico tools, and the numbers of variant carriers among cases and controls. The burden analyses including all 673 rare FANCM MVs did not indicate any statistically significant association with overall breast cancer risk, either in the analysis of combined population- and family-based studies or when these groups were analysed separately (Table 1). The only significant association, with OR = 1.48 (95% CI 1.07–2.04; P = 0.017), was found with ER-negative breast cancer in the analysis of family-based studies. These analyses were repeated with subgroups of the variants. We first considered the subgroup of the 372 MVs located within the FANCM functional or binding domains but found no evidence of association. We then excluded the 76 MVs located in the FAAP24 domain and found that the 296 remaining MVs were associated with TNBC in family-based studies with an OR = 2.27 (95% CI 1.15–4.47; P = 0.017). Among these 296 MVs, we further selected the 61 MVs predicted to be pathogenic by at least one of the four in silico tools used and found an association with TNBC with an OR = 3.51 (95% CI 1.07–11.44; P = 0.038) in the family-based studies (Table 1).

Table 1 Association analyses of the 673 FANCM rare MVs with breast cancer risk overall and in ER-negative and TNBC subtypes tested in population- and family-based studies combined and separately.
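The reported ORs, CIs and P values can be checked against one another under the usual assumption that the CI is a Wald interval, symmetric on the log-odds scale. The sketch below recovers the standard error of log(OR) from the CI bounds and derives the two-sided P value from a normal approximation. The function name is hypothetical, and because the published figures are rounded, only approximate agreement with the reported P values should be expected.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), the z statistic and the two-sided P value
    from an OR and its 95% Wald confidence interval."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Estimates quoted in the Results text for family-based studies.
reported = {
    "All rare MVs, ER-negative": (1.48, 1.07, 2.04),
    "296 domain MVs excluding FAAP24, TNBC": (2.27, 1.15, 4.47),
    "61 predicted-pathogenic MVs, TNBC": (3.51, 1.07, 11.44),
}
results = {label: p_from_or_ci(*vals) for label, vals in reported.items()}
for label, (se, z, p) in results.items():
    print(f"{label}: SE = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

The recomputed P values agree with the reported ones to within rounding. The wide interval for the 61-variant subgroup reflects a large standard error, consistent with a small number of carriers.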

Finally, we considered the two studies published so far testing the association between FANCM MVs and breast cancer risk conducted using familial designs and excluding carriers of BRCA1 or BRCA2 pathogenic variants [11, 12]. Thus, we performed a meta-analysis combining results from these studies with those from our analysis and found that all FANCM MVs combined were associated with familial breast cancer risk with OR = 1.22 (95% CI 1.08–1.38; P = 0.002, Fig. 2).

Fig. 2: Meta-analysis of studies testing the association of FANCM MVs with familial breast cancer risk and based on the analysis of 14,543 familial breast cancer cases and 42,211 controls.

OR odds ratio, CI confidence interval, I2 percentage of heterogeneity among the studies; Phet, p value calculated using the Cochran’s Q-test for heterogeneity; P, p value of association from Z-test.
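For readers unfamiliar with the method, a fixed-effect (inverse-variance) meta-analysis can be sketched as follows. Each study's log(OR) is weighted by the inverse of its variance, which is recovered here from the published 95% CI under a Wald assumption. The function name is hypothetical. Only the two previously published familial estimates quoted in the Introduction [11, 12] are used as inputs, because the estimate from the present family-based analysis is given in Table 1 and is not reproduced here; the output therefore does not reproduce the pooled estimate of Fig. 2.

```python
import math


def fixed_effect_pool(estimates, z_crit=1.959964):
    """Inverse-variance fixed-effect pooling of odds ratios.

    `estimates` is a list of (OR, CI lower, CI upper) tuples with 95% CIs.
    Returns the pooled OR with its 95% CI.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for odds_ratio, ci_low, ci_high in estimates:
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
        weight = 1.0 / se ** 2  # inverse of the variance of log(OR)
        weighted_sum += weight * math.log(odds_ratio)
        total_weight += weight
    pooled_log_or = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log_or),
        math.exp(pooled_log_or - z_crit * pooled_se),
        math.exp(pooled_log_or + z_crit * pooled_se),
    )


# Published familial estimates quoted in the Introduction [11, 12].
published = [(1.6, 0.9, 2.8), (1.50, 1.16, 1.93)]
pooled_or, pooled_low, pooled_high = fixed_effect_pool(published)
print(f"Pooled OR = {pooled_or:.2f} (95% CI {pooled_low:.2f}-{pooled_high:.2f})")
```

Because the weights are inverse variances, the larger study [12], with its narrower CI, dominates this two-study pool. In the same way, a large study contributes most of the weight in a three-study meta-analysis.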

Discussion

In this study, we re-analysed the BRIDGES FANCM sequencing data to assess the effects on breast cancer risk of 689 unique MVs in 39,885 European breast cancer cases and 35,271 controls from population- and family-based studies. According to their allele frequencies, these MVs were analysed either individually or by burden analyses, in the latter case combined in groups according to their gene domain location or their pathogenicity score. The cases were also analysed in different combinations: by study design, overall, and for the ER-negative or TNBC clinical subtypes.

Sixteen common MVs with an allele frequency ≥0.1% were analysed individually, but we did not find evidence of association for any of these variants. The remaining 673 MVs were rare, with an allele frequency <0.1%. The best approach to studying the risks conferred by such variants is to combine single-variant data in burden analyses and to conduct meta-analyses of different studies. Overall, our results and those from the previously conducted studies [2, 11, 12] indicate that FANCM MVs are associated with familial breast cancer risk, suggesting that these variants are low-risk susceptibility variants for breast cancer. This observation was confirmed by the meta-analysis of our results and the published ones [11, 12], showing that these variants were associated with familial breast cancer risk (OR = 1.22, Fig. 2). However, as studies with statistically significant results are more likely to be published, we cannot exclude that this result is affected by publication bias. It is also interesting to note that higher OR estimates for TNBC in family-based studies, indicating moderately increased risk, were derived for the 296 MVs located within functional or binding domains excluding those in the FAAP24 domain (OR = 2.27, Table 1), and for the subgroup of 61 of these 296 variants that were predicted to be pathogenic by at least one of the in silico tools we used (OR = 3.51, Table 1). Further studies based on in vitro assays should be conducted to test whether any of these MVs is functionally deleterious, which would help clarify their effect on breast cancer risk. It should be noted that in the present study, as well as in the previously published ones, the association of FANCM MVs was only found in family-based studies. While this supports the hypothesis that FANCM MVs are breast cancer risk factors, the ORs we found in this study are an overestimate of the risks these variants confer.

The lack of association of FANCM MVs with breast cancer risk in the analyses of only population-based studies could be explained by the presence of other unmeasured risk variants aggregating in families that may interact with FANCM MVs. Results from the analysis of family- and population-based studies combined are similar to those of the analysis of only population-based studies, as familial cases represent only 19% of all the cases included.

In conclusion, our data suggest that at least some of the FANCM MVs, in particular those located in some gene domains and classified as pathogenic in silico, could be risk variants for ER-negative breast cancer in familial settings. Larger association studies and functional assays may help to clarify the effects of these MVs on breast cancer risk. Overall, our results showed that perturbation of the FANCM gene has an impact on breast cancer risk, reinforcing the knowledge that FANCM is a breast cancer gene predisposing especially to the ER-negative and TNBC disease subtypes.