Introduction

The E-cadherin protein (encoded by the CDH1 gene) is normally expressed in breast epithelial tissue and functions as a critical part of epithelial cell adhesion and epithelial-to-mesenchymal transition (EMT)1,2,3. Due to the frequent loss or inactivation of E-cadherin that is evident in epithelial cell cancers, E-cadherin is thought to have tumor-suppressor properties where loss is associated with carcinogenesis and invasion4,5.

Loss of E-cadherin expression is commonly used to confirm lobular histology, which comprises 10–15% of all breast cancers6,7,8; lobular tumors have also been noted to express hormone receptors [estrogen receptor (ER) and progesterone receptor (PR)] more frequently than E-cadherin high tumors9. Recent molecular profiling analyses of lobular compared with ductal cancers show E-cadherin mutation and loss to be a defining feature of lobular breast cancers, and suggest that they are a distinct molecular subtype of breast cancer10. Although many studies have identified heterogeneity in risk factor associations based on breast tumor subtypes defined by hormone receptor status (e.g. ER-positive vs. ER-negative) or histology (lobular vs. ductal)11,12,13,14,15,16,17,18,19,20,21,22, few studies have examined whether E-cadherin may define important subgroups of tumors with distinct etiologies23.

Data also suggest that loss of E-cadherin expression may be associated with malignant progression, metastasis, and reduced survival in breast cancer patients24,25,26,27,28; however, most of these studies were based on small numbers and not all analyses were stratified by ER status, a known important prognostic and predictive marker.

Evidence on whether E-cadherin may be an important marker of etiologic heterogeneity and survival differences across the spectrum of breast tumor subtypes, including ER status and histology, is limited. In 1984, Prentice et al.29 introduced the concept of using case-case studies for identifying disease risk factors. In a case-case study design, information is obtained only from cases of a particular disease (in our study, breast cancer) and is used as a tool to assess etiologic heterogeneity. In this study, we were interested in determining whether known risk/protective factors for breast cancer differed by E-cadherin expression, in order to provide new insights into possible mechanisms for E-cadherin loss in breast carcinogenesis, similar to analyses previously done for ER, PR and HER2 markers23,30,31. Using immunohistochemical (IHC) data for E-cadherin generated centrally from tumor tissue microarrays (TMAs), we performed a large pooled analysis of 12 studies participating in the Breast Cancer Association Consortium (BCAC), and examined whether established breast cancer risk factor associations and survival differed by low vs. high E-cadherin tumor tissue expression, stratified by ER status and histology.

Material and Methods

Study Population

The 12 breast cancer studies participating in the Breast Cancer Association Consortium (BCAC) that were included in this analysis are described in Supplemental Table 1. Case-case analyses were restricted to 5,933 European women with invasive breast cancer from these 12 studies who provided data on age at diagnosis and had evaluable E-cadherin tumor tissue staining results (see section on E-cadherin tumor tissue measurements). Study participants were recruited under protocols approved by the Institutional Review Board at each institution, and all subjects provided informed consent or did not opt-out, depending on national regulations. All methods were performed in accordance with the relevant guidelines and regulations, and the ethical approval committees are listed at the end of this manuscript.

Risk factor information

The 12 participating studies provided information on one or more of the following risk factors for breast cancer: family history of breast cancer in first-degree relatives; reproductive factors including age at menarche, parity, age at first full-term birth, oral contraceptive (OC) use among women ≤50 years of age, menopausal hormone use, and type of menopausal hormone ever used; and anthropometric measures including body mass index (BMI) and height. Because not all studies captured menopausal status, we used age ≤50 and >50 years as a proxy for pre- and postmenopausal status, respectively.

E-cadherin tumor tissue measurements

Routinely prepared formalin-fixed paraffin-embedded (FFPE) blocks of invasive breast tumors were used to construct TMA blocks at each study center. One hundred and forty-two TMA slides with tumor samples from 6,010 individual patients were prepared for E-cadherin staining (1–2 cores per patient). TMAs from all participating studies were stained centrally in the Experimental Pathology Laboratory at the National Cancer Institute (NCI) to allow for consistency across sites and to avoid any potential batch effect that may arise due to systematic variation in staining procedures. We recognize that the study cannot control for pre-analytic variables in tissue fixation and processing. To mitigate this, the Experimental Pathology Laboratory at NCI carefully re-titrated the IHC assay to provide a stable assay across all samples.

IHC staining was performed on a Benchmark ULTRA autostainer (Ventana Medical Systems, Tucson, AZ). TMA sections were deparaffinized with xylene and graded alcohols; antigen retrieval was mediated with citrate buffer pH 9 (Dako) for 20 minutes in a pressure cooker. Primary mouse monoclonal antibody, anti-E-cadherin (clone NCH-38, 1:500; Dako, Carpinteria, CA), was applied at room temperature for 2 hours. The antigen-antibody complex was detected using Envision + (Dako) and DAB was applied for 20 minutes. Slides were counterstained with hematoxylin, dehydrated and coverslipped. Slides were imaged with a Hamamatsu Nanozoomer (Bridgewater, NJ) at 20× magnification and cataloged using the SlidePath Digital Image Hub (Leica Biosystems, Wetzlar, Germany).

As our primary interest was in investigating clinically relevant expression of E-cadherin, we used the H-scoring system, as proposed and evaluated in previous publications23,24,25. Two cytotechnologists assessed digital images of TMA spots using the SlidePath Digital Image Hub, blinded to any clinical data. Manual readings of each TMA spot recorded the quality of the image (unsatisfactory, limited or satisfactory), the percentage of cells positively stained for E-cadherin (0, 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100%) and the intensity of staining (0 = negative, 1 = weak, 2 = intermediate, and 3 = strong). Reproducibility of IHC scoring was assessed based on evaluation of 200 images by the two cytotechnologists and a pathologist (M.E.S.); inter- and intra-observer agreement was excellent (weighted kappa ≥90%; P < 0.001). A summary E-cadherin score was calculated as the product of % positive tumor cells and intensity (range of 0–300)23,24,25. For patients with multiple spots, the maximum E-cadherin score across the spots was used for analysis. The median and interquartile range for the E-cadherin score did not vary substantially across the 12 studies (Supplementary Table 2). Tumors having a score of <100 were classified as E-cadherin low and those with a score ≥100 as E-cadherin high. This cut-point was informed by the known relationship between E-cadherin expression and lobular histology supported by evidence in the literature23,32,33. Representative images are shown in Supplementary Figure 1. Further, for sensitivity analysis, we also evaluated a more stringent cut-point defining E-cadherin loss as a score of 0 (i.e. no expression of E-cadherin).
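The scoring rule described above can be illustrated with a minimal Python sketch. It is not the code used in the study; the function names (spot_score, patient_score, classify, is_complete_loss) and the example readings are hypothetical and introduced only for illustration. The sketch assumes each spot is read as a (percent positive, intensity) pair drawn from the categories listed above.

```python
from typing import Iterable, Tuple

# Scoring categories taken from the text above.
ALLOWED_PERCENT = {0, 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100}
ALLOWED_INTENSITY = {0, 1, 2, 3}
LOW_HIGH_CUTPOINT = 100  # score < 100 is "low", score >= 100 is "high"


def spot_score(percent_positive: int, intensity: int) -> int:
    """Score for one TMA spot: % positive cells times intensity (0-300)."""
    if percent_positive not in ALLOWED_PERCENT:
        raise ValueError(f"unexpected percent category: {percent_positive}")
    if intensity not in ALLOWED_INTENSITY:
        raise ValueError(f"unexpected intensity category: {intensity}")
    return percent_positive * intensity


def patient_score(spots: Iterable[Tuple[int, int]]) -> int:
    """Summary score for a patient: the maximum score across their spots."""
    scores = [spot_score(percent, intensity) for percent, intensity in spots]
    if not scores:
        raise ValueError("patient has no evaluable spots")
    return max(scores)


def classify(score: int, cutpoint: int = LOW_HIGH_CUTPOINT) -> str:
    """Primary definition: 'low' below the cut-point, 'high' at or above it."""
    return "low" if score < cutpoint else "high"


def is_complete_loss(score: int) -> bool:
    """Stringent sensitivity definition: loss means a score of exactly 0."""
    return score == 0


# Hypothetical patient with two spots: 40% at intensity 2 and 90% at intensity 1.
example = patient_score([(40, 2), (90, 1)])
print(example, classify(example))
```

Taking the maximum across spots means that a single well-stained core is enough to classify a tumor as E-cadherin high, so the primary definition leans toward calling expression present when cores disagree.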

Assessment of other tumor markers

Assessment of the tumor markers ER, PR and human epidermal growth factor receptor 2 (HER2), and the definition of positive expression of the tumor markers, varied across studies. For the largest share of cases (N = 1,891, 32%), ER status was extracted from medical records; 15% (N = 908) had ER obtained from IHC staining of whole sections and 28% (N = 1,685) had ER obtained from IHC staining of TMAs. Previous publications from participating groups in the current study show good concordance between marker status from medical records and standardized measurements from TMA analyses34,35,36.

Statistical analysis

Case-case analyses

As we performed all staining at the NCI to minimize batch effects, a pooled analysis was conducted using data from all 12 studies. We performed case-case analyses to assess whether there was heterogeneity in risk factor associations by E-cadherin breast tumor expression. We used logistic regression models to estimate case-case odds ratios (ORs) and 95% confidence intervals (CIs) where E-cadherin low vs. E-cadherin high expression was the outcome and risk factors were the explanatory variables. The ORs were interpreted as the risk factor associations of E-cadherin low disease compared to E-cadherin high disease. For each risk factor, the category that has been shown to be associated with the lowest overall breast cancer risk in the literature was selected as the reference category. Thus, a case-case OR >1 can be interpreted to mean that the risk factor examined in the analysis is more strongly associated with E-cadherin low tumors than with E-cadherin high tumors (OR for E-cadherin low vs. control > OR for E-cadherin high vs. control). Heterogeneity by E-cadherin subtype was tested using a global F test37. Because E-cadherin expression may vary by age and study site23,30,31, all models were adjusted for age (in 10-year categories) and study site. Given that ER status is an important marker of etiologic heterogeneity38, we stratified all analyses by ER status (ER+, ER−). Among ER-positive tumors, we also evaluated associations after stratification by histology (lobular, ductal/mixed); this was not done among ER-negative tumors due to small numbers. To assess the variation in results by study for risk factors that showed evidence of a differential association by E-cadherin expression, we fitted study*risk factor interaction terms in the models to estimate p-heterogeneity by study using the likelihood ratio test; P < 0.20 was considered suggestive evidence of between-study heterogeneity37. In sensitivity analysis, we also assessed associations with risk factors using a more stringent definition of E-cadherin loss, where loss of E-cadherin was defined as a score of 0.
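To make the interpretation of a case-case OR concrete, the following Python sketch computes an unadjusted case-case OR and a Wald (Woolf) 95% CI from a single 2×2 table of cases. This is a simplification for illustration only: the analyses described above used logistic regression adjusted for age and study site, which this sketch does not reproduce. The function name case_case_odds_ratio and the counts in the example are hypothetical and are not study data.

```python
import math
from typing import Tuple


def case_case_odds_ratio(
    low_exposed: int,
    low_unexposed: int,
    high_exposed: int,
    high_unexposed: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Unadjusted case-case OR with a Wald CI on the log scale.

    The outcome is E-cadherin low vs. high among cases only; the exposure
    is a binary risk factor (e.g. ever vs. never use of a hormone).
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    counts = (low_exposed, low_unexposed, high_exposed, high_unexposed)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")

    # Odds of being E-cadherin low among exposed, divided by the same odds
    # among unexposed cases (the cross-product ratio).
    odds_ratio = (low_exposed * high_unexposed) / (low_unexposed * high_exposed)

    # Woolf standard error of log(OR): square root of summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = case_case_odds_ratio(30, 70, 100, 300)
print(f"case-case OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Because controls are absent, the case-case OR estimates the ratio of the two subtype-specific ORs rather than either OR itself; a value above 1 therefore indicates only that the exposure is relatively more common among E-cadherin low cases than among E-cadherin high cases.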

Survival analysis

For survival analysis, we further excluded patients with distant metastases at diagnosis of the primary tumor (N = 63) and those who were missing vital status (N = 174). In total, 5,696 invasive breast cancer cases from 12 BCAC studies were included in the survival analysis. A total of 1,085 deaths were observed within 10 years of diagnosis, 671 due to cancer. We calculated the survival time for each case as the difference between the date of diagnosis and the date of death or censoring. Analyses were left truncated at the time of study entry to allow for inclusion of prevalent cases. End of follow-up was defined as the date of death, date of last follow-up or 10 years, whichever came first. Hazard ratios (HRs) and 95% CIs for all-cause mortality and breast cancer-specific mortality were estimated using Cox regression models, using study site as a stratifying factor. Multivariable Cox models were adjusted for potential confounders: age at diagnosis (in 10-year categories), tumor grade (well/moderately differentiated, poorly differentiated, or unknown), tumor size (≤2 cm, >2 cm, or unknown), node status (positive, negative, or unknown), HER2 status (positive, negative, or unknown), and histology (ductal/mixed, lobular, other, or unknown). To assess whether the associations varied by tumor characteristics, we also estimated HRs and 95% CIs by ER status (positive, negative, unknown), HER2 status (positive, negative, unknown), and, among ER-positive tumors, histology (lobular, ductal/mixed, other/unknown).
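The follow-up rules described above (time measured from diagnosis, delayed entry at study enrollment for prevalent cases, and follow-up ending at death, last follow-up or 10 years) can be sketched in Python as follows. This is an illustrative sketch, not the SAS code used for the analysis. The function name follow_up_interval, the conversion of 365.25 days per year, and the choice to drop patients whose entry falls at or after their end of follow-up are assumptions made here for the example.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_YEAR = 365.25       # assumed conversion from days to years
MAX_FOLLOW_UP_YEARS = 10.0   # follow-up ends at 10 years after diagnosis


def follow_up_interval(
    diagnosis: date,
    study_entry: date,
    last_contact: date,
    died: bool,
) -> Optional[Tuple[float, float, bool]]:
    """Return (entry_time, exit_time, event) in years since diagnosis.

    last_contact is the date of death if died is True, otherwise the date
    of last follow-up. Returns None when the patient contributes no time
    at risk (entry at or after the end of follow-up).
    """
    # Delayed entry: a patient is at risk only from study entry onward.
    entry_time = max(0.0, (study_entry - diagnosis).days / DAYS_PER_YEAR)

    # Exit at death or last follow-up, capped at 10 years.
    raw_exit = (last_contact - diagnosis).days / DAYS_PER_YEAR
    exit_time = min(raw_exit, MAX_FOLLOW_UP_YEARS)

    # A death counts as an event only if it falls within the 10-year window.
    event = died and raw_exit <= MAX_FOLLOW_UP_YEARS

    if exit_time <= entry_time:
        return None
    return entry_time, exit_time, event


# Hypothetical prevalent case: enrolled 2 years after diagnosis, died at about 5 years.
print(follow_up_interval(date(2000, 1, 1), date(2002, 1, 1), date(2005, 1, 1), True))
```

The resulting (entry, exit, event) triple is the counting-process form that Cox regression software generally accepts for delayed-entry data, so a patient enrolled two years after diagnosis contributes to risk sets only from year two onward.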

All statistical tests were two-sided with 5% type-I error. All pooled analyses were performed using the SAS software version 9.3 (SAS Institute, Inc, Cary, NC).

Results

Study and tumor characteristics by E-cadherin expression

The median age at breast cancer diagnosis was 52 years with some variation by study. E-cadherin low expression by study ranged from 10% to 31% (Supplementary Table 2).

Table 1 presents the distribution of clinicopathologic features by level of E-cadherin tumor tissue expression (low/high). E-cadherin low tumors were more likely to be lobular, well/moderately differentiated (low grade), larger in size (>2 cm), and HER2-negative compared to E-cadherin high tumors (P ≤ 0.005; Table 1). These associations were generally consistent across studies (Supplementary Table 3).

Table 1 Distribution of select clinicopathologic features among breast cancer cases by E-cadherin tumor tissue expression levels in the 12 participating BCAC studies.

Case-case analyses of risk factor associations with E-cadherin tissue expression among ER-positive tumors overall and stratified by histology

Table 2 presents risk factor associations for ER-positive breast cancers overall and stratified by histology. Among ER-positive cases, ever users of menopausal hormones were marginally more likely than never users to have E-cadherin low rather than E-cadherin high tumors (OR = 1.24, 95% CI = 0.97–1.59, P-het = 0.08). No consistent associations were observed for E-cadherin status by age at menarche, number of live births, age at live birth, or anthropometric measurements (BMI and height; Supplementary Table 4).

Table 2 Case-case analyses of established breast cancer risk factors with E-cadherin tumor tissue expression (low/high) stratified by tumor histology among estrogen receptor (ER)-positive tumors.

Among women with ER-positive tumors, we observed a difference by E-cadherin status for number of live births; women who had one birth were less likely to have E-cadherin low expression than women with two or more births (OR = 0.74, 95% CI = 0.58–0.95, Table 2); however, no trend was present, based on the result for nulliparous women. This relationship was driven by the ductal/mixed tumors; in contrast, among lobular tumors, nulliparous women had more frequent loss of E-cadherin compared to those with two or more live births. The other breast cancer risk factors examined (family history of breast cancer, age at menarche, age at menopause, age at first birth, and OC and menopausal hormone use) did not exhibit heterogeneity by E-cadherin expression.

Among ER-positive breast cancers of ductal histology, no breast cancer risk factors examined exhibited heterogeneity in their associations by E-cadherin expression (Table 2 and Supplementary Table 4). Analyses using a score of 0 to define E-cadherin loss are presented in Supplemental Tables 5–6. In these sensitivity analyses, we observed a stronger relationship between E-cadherin loss and ER expression (Supplemental Table 5), and analysis by risk factors (Supplemental Table 6) showed that, among ER-positive tumors, ever users of menopausal hormones were more likely than never users to have E-cadherin loss (defined as score = 0) (OR = 1.57, 95% CI = 1.06–2.33, p = 0.02); other factors did not show significant differences.

Case-case analyses of risk factor associations with E-cadherin tissue expression among ER-negative tumors

We did not find differences by E-cadherin expression among ER-negative breast cancers (Table 3), although the result for OC use was marginal. Cases that reported ever use of OCs were more likely to have E-cadherin low than E-cadherin high tumors (OR = 1.97, 95% CI = 0.96–4.06, P-het = 0.06). No significant associations for E-cadherin status were observed for anthropometric measurements including BMI and height (Supplementary Table 4). There were too few ER-negative cases to evaluate the more stringent cut-point of 0 for defining E-cadherin loss.

Table 3 Case-case analyses of established breast cancer risk factors with E-cadherin tumor tissue expression (low/high) among ER-negative tumors.

E-cadherin expression and survival by tumor subtypes

The mean follow-up time was 9.6 years; results for all-cause and breast cancer-specific survival are presented in Table 4. E-cadherin expression showed no significant associations with survival in multivariable models overall, or in any of the tumor subtypes.

Table 4 10-year Hazard ratios (HR) and 95% confidence intervals (CI) for all-cause and breast cancer specific mortality according to E-cadherin tumor expression (low/high): a pooled analysis of 12 participating Breast Cancer Association Consortium studies.

Discussion

In our study of nearly 6,000 breast cancer patients with centrally stained and scored TMA slides, E-cadherin loss was significantly associated with lobular histology, consistent with previous work. Except for menopausal hormone use, we found limited evidence that E-cadherin loss varied by risk factors or was associated with 10-year breast cancer-specific survival within tumor subtypes.

Analysis of tumor characteristics showed significant associations of E-cadherin loss with lobular histology, low grade, larger tumor size and lack of HER2 staining, consistent with previous studies39. Lobular breast cancers feature noncohesive cells that are individually dispersed or arranged in a single file pattern, a phenotype that has been attributed to dysregulation of cell-cell adhesion, primarily by loss of E-cadherin protein expression40,41. Because of their single file pattern, lobular breast cancers tend to be harder to detect in screening and hence are larger when diagnosed39, consistent with our data showing E-cadherin loss associated with larger tumor size.

We also observed a relationship between low E-cadherin expression and ever use of menopausal hormones among ER-positive tumors, which was more pronounced when we used a more stringent cut-point for E-cadherin loss (a score of 0). Numerous studies have shown that menopausal hormone use, particularly combined estrogen-progestin therapy, is more strongly associated with lobular tumors than with ductal tumors and that reduced use of menopausal hormones is associated with a declining incidence rate of lobular cancers at the population level11,12,13,15,16,17,19,20,21,22,42,43. Further, among ER-negative breast cancers, ever users of OCs were almost twice as likely as never users to have E-cadherin low tumors, although this was not statistically significant. Given findings suggesting that the relationships between menopausal hormones or OCs and breast cancer risk are strongly influenced by recent exposure, it is possible that a true association in our data was attenuated by our reliance on ever as opposed to current use. Given prior epidemiologic studies and in vitro data showing that estrogen may lower E-cadherin expression, the observed relationship with OC or menopausal hormone use may be plausible44.

In our analysis of breast cancer risk factors among ER-positive tumors, we observed that women who had one birth were less frequently E-cadherin low compared to women who had two or more live births. We saw an opposite relationship among lobular tumors, where nulliparous women had more frequent loss of E-cadherin compared to those with two or more live births. Determining whether E-cadherin loss in tumors is related to reproductive characteristics will require larger datasets. With regard to genetic factors, mutation profiling studies targeted at CDH1 suggest mutations to be rare and unlikely to explain loss (33/507 based on TCGA data)45,46.

In contrast to previous studies24,25,26,27,28, we did not observe associations between E-cadherin and breast cancer-specific survival in multivariable models. Our data, based on the largest analysis of its kind, do not support E-cadherin as an important marker of survival in breast cancer patients.

Strengths of our study include the large pooled analysis and the centrally stained and scored E-cadherin data, which reduced systematic bias that may have been introduced across participating studies. Limitations include limited power for analyses of tumor subgroups due to smaller numbers, especially for survival analysis. Although staining of E-cadherin was performed centrally, we had lower than expected percentages of lobular breast cancers with E-cadherin loss, which may reflect variation in calling of histologic subtypes, but could also indicate the need for molecular profiling methods or more detailed image analysis of the cellular compartment in which E-cadherin is stained when defining E-cadherin loss47,48. In fact, the percentage of E-cadherin low lobular carcinomas varied by study, which could reflect tissue factors that influenced staining, variability in classification of cancers as lobular (including potential sampling issues if TMAs did not fully capture lobular morphology), or factors related to populations and relative frequency of risk exposures.

In summary, this study provides limited evidence for heterogeneity in risk factor associations or for differences in survival by E-cadherin tumor tissue expression. Our data are consistent with molecular profiling studies showing distinctive expression of genes associated with E-cadherin signaling among ER-positive ductal and lobular carcinomas10,49. Evaluating genetic susceptibility markers and E-cadherin loss, where data suggest that genetic susceptibility factors may influence loss of E-cadherin expression, might provide new insights on pathways of E-cadherin loss consistent with histology analysis23,50. Future studies using comprehensive molecular subtyping data, including histology, hormone markers and mRNA, might provide new insights on common and distinct molecular pathways of E-cadherin loss as well as tumor heterogeneity51.

Declarations

Ethics approval and consent to participate

Study participants were recruited under protocols approved by the Institutional Review Board at each institution, and all subjects provided informed consent or did not opt-out, depending on national regulations.

Amsterdam Breast Cancer Study (ABCS), Netherlands Leiden University Medical Center (LUMC) Commissie Medische Ethiek and Protocol Toetsingscommissie van het Nederlands Kanker Instituut/Antoni van Leeuwenhoek Ziekenhuis; Spanish National Cancer Centre Breast Cancer Study (CNIO-BCS) Spain Hospital Universitario La Paz Comite Etico de Investigacion Clinica; ESTHER Breast Cancer Study (ESTHER) Germany Ruprecht-Karls-Universitat Medizinische Fakultat Heidelberg Ethikkommission; Helsinki Breast Cancer Study (HEBCS) Finland Helsingin ja uudenmaan sairaanhoitopiiri (Helsinki University Central Hospital Ethics Committee); Kuopio Breast Cancer Project (KBCP) Finland Pohjois-Savon Sairaanhoitopiirin Kuntayhtyma Tutkimuseettinen Toimikunta; Kathleen Cuningham Foundation Consortium for Familial Breast Cancer/Australian Ovarian Cancer Study (kConFab/AOCS) Australia kConFab: The Queensland Institute of Medical Research Human Research Ethics Committee (QIMR-HREC) AOCS: Peter MacCallum Cancer Centre Ethics Committee; Mayo Clinic Breast Cancer Study (MCBCS) USA Mayo Clinic IRB; Leiden University Medical Centre Breast Cancer Study (ORIGO) Netherlands Medical Ethical Committee and Board of Directors of the Leiden University Medical Center (LUMC); NCI Polish Breast Cancer Study (PBCS) Poland National Institute of Health (NIH) IRB; Prospective Study of Outcomes in Sporadic Versus Hereditary Breast Cancer (POSH) UK South West Multi-centre Research Ethics Committee; Rotterdam Breast Cancer Study (RBCS) Netherlands Medische Ethische Toetsings Commissie Erasmus Medisch Centrum; UK Breakthrough Generations Study (UKBGS) UK South East Multi-Centre Research Ethics Committee.

Availability of data and materials

The datasets generated and/or analyzed during the current study are not publicly available due to privacy and ethical approvals but are available from the corresponding author on reasonable request.