Elsevier

Pathology

Volume 49, Issue 4, June 2017, Pages 371-378

Anatomical pathology
Validity and reliability of Ki-67 assessment in oestrogen receptor positive breast cancer

https://doi.org/10.1016/j.pathol.2017.02.001

Summary

Ki-67 is a prognostic and predictive biomarker in oestrogen receptor positive breast cancer. However, its measurement is not well standardised. This study compared the validity, intra- and inter-observer reproducibility and reporting time of five methods of Ki-67 assessment on tissue microarrays (TMA) and whole slides. Ki-67 labelling index (LI) was assessed on 71 breast carcinomas of no special type (NST), using five methods: manual counting (gold standard), unaided visual estimation, visual estimation aided by reference photographs, semi-manual digital image analysis (DIA) and fully automated DIA (Aperio platform). On TMA, semi-manual DIA demonstrated the closest agreement with the gold standard [intra-class correlation coefficient (ICC)=0.99 (95% confidence interval 0.98–0.99)]. All other methods also demonstrated close agreement [unaided estimation ICC=0.92 (0.90–0.93), aided estimation ICC=0.93 (0.92–0.95), fully automated DIA ICC=0.97 (0.96–0.97)]. On whole slides, both aided estimation and semi-manual DIA demonstrated excellent agreement with the gold standard [aided visual estimation ICC=0.91 (0.85–0.94), semi-manual DIA ICC=0.94 (0.89–0.96)]. Aided visual estimation significantly improved inter-observer reproducibility compared to unaided estimation [unaided ICC=0.87 (0.80–0.92); aided ICC=0.96 (0.93–0.97)] and corrected the underestimation bias seen in unaided estimation. Importantly, validity and reproducibility on whole slides were lower than on TMA for all methods of assessment, suggesting that field selection is an important source of variability in Ki-67 assessment. Values close to clinically used cut-off values therefore should be interpreted with caution.

Introduction

Kiel-67 antigen (Ki-67) is a well-studied prognostic and predictive biomarker in breast cancer. It is a nuclear protein expressed in all phases of the cell cycle except G0, and therefore indicates which cells are proliferating.1, 2 Ki-67 expression is assessed by immunohistochemistry as the Ki-67 labelling index (LI), which refers to the percentage of positively stained tumour cell nuclei.3 The test is inexpensive and readily available in diagnostic pathology laboratories, allowing for rapid turnaround times to facilitate clinical decision-making.
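As a minimal illustration of the definition above, the labelling index can be computed from a nuclear count. The function name and the counts used here are hypothetical and are not taken from the study.

```python
def ki67_labelling_index(positive_nuclei: int, total_nuclei: int) -> float:
    """Return the Ki-67 LI: the percentage of positively stained tumour cell nuclei."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= positive_nuclei <= total_nuclei:
        raise ValueError("positive_nuclei must lie between 0 and total_nuclei")
    return 100.0 * positive_nuclei / total_nuclei


# Hypothetical count: 170 positive nuclei among 1000 tumour cell nuclei.
print(ki67_labelling_index(170, 1000))  # 17.0
```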

Numerous clinical applications for Ki-67 have been proposed. The prognostic value of Ki-67 for disease-free survival and overall survival in early oestrogen receptor (ER) positive breast cancer has been consistently demonstrated.4, 5, 6 A role for the prediction and monitoring of neoadjuvant therapy is also emerging, particularly for endocrine therapy.7, 8, 9 The importance of Ki-67 expression has been supported by gene expression profiling studies. For example, expression of MKI67 (the gene encoding the Ki-67 protein) and other proliferation-associated genes helps to distinguish between the luminal A and luminal B intrinsic subtypes of breast cancer.10, 11 Moreover, MKI67 expression is included in gene expression profiling-based tests which predict benefit from chemotherapy, such as Oncotype DX, Genomic Grade Index and PAM50.12, 13, 14 However, because of the high cost of these tests (e.g., Oncotype DX costs $4000 in Australia15), there is interest in the use of immunohistochemical profiling, including measurement of Ki-67, ER, progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) as cost-effective surrogate markers for identifying intrinsic subtypes and calculating recurrence risk.16, 17

However, there is concern about the analytical validity of Ki-67,18 and there is a lack of consensus regarding its measurement. Variability in pre-analytic, analytic and scoring protocols makes it difficult to implement the cut-off values for clinical decision-making proposed in the literature.1, 2, 3, 19 Furthermore, the relationship between Ki-67 LI assessed on whole slides and tissue microarrays (TMA) is not well studied, despite the widespread use of cut-off values established by studies which assessed Ki-67 LI on TMA.16 Although the International Ki-67 in Breast Cancer Working Group has published consensus guidelines for the assessment of Ki-67,3 these recommendations have not been widely implemented.20 Of note, the proposed gold standard is manual counting of at least 1000 cells at high power,3 which is labour intensive and may imply a false sense of precision. Visual estimation has been proposed as a rapid alternative, but its validity and reliability are disputed.21, 22, 23 Digital image analysis (DIA) is emerging as a highly reproducible technique, but is not yet widely adopted.21, 22, 23, 24, 25
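The concern about a false sense of precision can be made concrete with a rough sketch. Assuming, purely for illustration, that counted nuclei behave like independent binomial draws, a Wilson score interval gives a lower bound on the uncertainty of a single count. This assumption ignores field selection and observer effects, so real uncertainty is likely larger. The count of 140 positive nuclei and the 14% cut-off used below are hypothetical values, not figures from this study.

```python
import math


def wilson_interval(positive: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval, in percent, for a counted proportion.

    Assumes independent binomial sampling of nuclei (an illustrative
    simplification; it does not model field selection or observer bias).
    """
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("need 0 <= positive <= total and total > 0")
    p = positive / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return 100 * (centre - half), 100 * (centre + half)


# Hypothetical case: 140 positive nuclei out of 1000 counted (LI = 14%).
low, high = wilson_interval(140, 1000)
hypothetical_cutoff = 14.0  # illustrative threshold only
print(f"LI 14.0%, approximate 95% interval {low:.1f}% to {high:.1f}%")
print("interval spans cut-off:", low < hypothetical_cutoff < high)
```

Even with 1000 nuclei counted, the interval under this simplified model spans a few percentage points, which is one way to see why values near a cut-off warrant caution.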

We performed a concordance study to compare five different methods of Ki-67 assessment, including the gold standard and different methods of visual estimation and DIA. We assessed validity, intra- and inter-observer reproducibility and reporting time to determine which is most appropriate for use at our institution. We also compared the use of TMA with whole slides.
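Agreement in a concordance study of this kind is commonly summarised with an intra-class correlation coefficient. The sketch below computes one common form, ICC(2,1) (two-way random effects, absolute agreement, single measurement). This excerpt does not state which ICC form the authors used, so the choice of form is an assumption, and the scores shown are hypothetical.

```python
def icc_2_1(ratings: list[list[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings[i][j] is the score given to case i by method (or observer) j.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 cases and 2 raters, with no missing scores")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical Ki-67 LI scores (%): column 0 = reference method, column 1 = test method.
scores = [[10, 15], [20, 25], [30, 35]]
print(round(icc_2_1(scores), 3))
```

Because this form measures absolute agreement, a constant offset between two methods lowers the coefficient even when the methods rank the cases identically, which is relevant to systematic under- or overestimation.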

Ethics and patients

The study was performed on a random series of 71 invasive breast cancers diagnosed at the Austin Hospital, Melbourne, Australia. Data were extracted from the Kestral pathology database to identify cases of breast cancer diagnosed between 1 January 2005 and 12 December 2010. Eligibility criteria were: invasive carcinoma of no special type, ER positive. Tumour grade (Elston-Ellis modification of Scarff-Bloom-Richardson grade; BRE), ER, PR and HER2 characteristics were extracted from routine

Patient and tumour characteristics

The median patient age at diagnosis was 53 years. All patients were female and all the tumours were ER positive; most (90%) were also PR positive. A minority were HER2-amplified (10%). There were 16 BRE grade 1 tumours, 28 grade 2 tumours and 27 grade 3 tumours. The median Ki-67 LI was 29% on whole slides and 17% on TMA, as assessed by manual counting. Ki-67 LI was significantly correlated with BRE grade (Spearman rank correlation=0.57, p<0.001), particularly the mitotic count component

Discussion

In our study, all methods tested demonstrated a high degree of validity and reliability for Ki-67 assessment on TMA, particularly manual counting and semi-manual DIA. However, manual counting proved to be highly labour intensive. On whole slides, both visual estimation with the aid of reference photographs and semi-manual DIA demonstrated high validity and reliability. The use of reference photographs significantly improved validity and inter-observer reproducibility on whole slides compared to

Conclusions

Standardisation of Ki-67 assessment is essential in order to translate studies of the prognostic and predictive use of Ki-67 into clinical practice. Our study validated the use of semi-manual DIA for the assessment of Ki-67 and established parameters for the use of semi-manual DIA using the Aperio platform at our institution. Whilst DIA algorithms require tuning in order to account for inter-laboratory variations in immunohistochemistry and slide scanning protocols, we have demonstrated that

Acknowledgements

Many thanks to the laboratory staff at the Department of Anatomical Pathology, Austin Hospital.

References (39)

  • Y. Liu et al. The clinical significance of Ki-67 as a marker of prognostic value and chemosensitivity prediction in hormone-receptor-positive breast cancer: a meta-analysis of the published literature. Curr Med Res Opin (2013)
  • P.A. Fasching et al. Ki67, chemotherapy response, and prognosis in breast cancer patients receiving neoadjuvant treatment. BMC Cancer (2011)
  • M.J. Ellis et al. Outcome prediction for estrogen receptor–positive breast cancer based on postneoadjuvant endocrine therapy tumor characteristics. J Natl Cancer Inst (2008)
  • C.M. Perou et al. Molecular portraits of human breast tumours. Nature (2000)
  • T. Sorlie et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA (2001)
  • S. Paik et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med (2004)
  • C. Sotiriou et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst (2006)
  • J.S. Parker et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol (2009)
  • R.H. de Boer et al. The impact of a genomic assay (Oncotype DX) on adjuvant treatment recommendations in early breast cancer. Med J Aust (2013)
