Abstract
Early detection of solid tumors through a simple screening process, such as proteomic analysis of biofluids, has the potential to significantly alter the management and outcomes of cancers. Applying advanced targeted proteomics measurements and data analysis strategies to uniformly collected serum or plasma samples would enable longitudinal studies of cancer risk, progression, and response to therapy, which could in turn substantially reduce the overall cancer burden. In this article, we describe a generalizable workflow that combines robust, multiplexed targeted proteomics measurements of longitudinal samples from the Department of Defense Serum Repository with a Random Forest machine learning method to develop and initially evaluate candidate biomarker panels for early detection of cancers. The effectiveness of this approach was demonstrated in a cohort of 175 head and neck squamous cell carcinoma patients. The outlined protocols cover sample preparation, instrument analysis, and data analysis and interpretation using this workflow.
Acknowledgments
This work was supported by Federal Award No. HU0001-16-2-0014 (Subaward No. 3879, to K.D. Rodland and T. Liu). The authors thank the clinical and laboratory staff at the Uniformed Services University of the Health Sciences and Pacific Northwest National Laboratory (PNNL). Portions of the research were performed in the Environmental Molecular Sciences Laboratory, a US Department of Energy (DOE) Office of Biological and Environmental Research national scientific user facility on the PNNL campus. PNNL is a multiprogram national laboratory operated by Battelle for the DOE under contract no. DE-AC05-76RL01830. The contents of this publication are the sole responsibility of the author(s) and do not necessarily reflect the views, opinions, or policies of the Uniformed Services University of the Health Sciences; the Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc.; the Department of Defense; or the Departments of the Army, Navy, or Air Force. Mention of trade names, commercial products, or organizations does not imply endorsement by the US Government.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Madda, R. et al. (2023). Use of Longitudinal Serum Analysis and Machine Learning to Develop a Classifier for Cancer Early Detection. In: Greening, D.W., Simpson, R.J. (eds) Serum/Plasma Proteomics. Methods in Molecular Biology, vol 2628. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-2978-9_33
DOI: https://doi.org/10.1007/978-1-0716-2978-9_33
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-2977-2
Online ISBN: 978-1-0716-2978-9
eBook Packages: Springer Protocols