ABSTRACT
Differential analysis is the most significant part of RNA-Seq analysis. Conventional differential analysis usually compares tumor samples with normal samples of the same tumor type. Such methods fail to differentiate between tumor types because they lack knowledge from other tumor types. The Pan-Cancer Atlas provides abundant information on 33 prevalent tumor types, which can be used as prior knowledge to generate tumor-specific biomarkers. In this paper, we embedded the high-dimensional RNA-Seq data into 2-D images and used a convolutional neural network to classify the 33 tumor types. The final accuracy was 95.59%. Furthermore, based on the idea of Guided Grad-CAM, we generated a significance heat-map over all genes for each class. Through functional analysis of the genes with high intensities in the heat-maps, we validated that these top genes are related to tumor-specific pathways and that some of them have already been used as biomarkers, which supports the effectiveness of our method. To the best of our knowledge, we are the first to apply a convolutional neural network to the Pan-Cancer Atlas for tumor type classification, and also the first to relate a gene's contribution to classification to its importance in order to identify candidate biomarkers. Our experimental results show that the method performs well and could also be applied to other genomics data.
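The two data-handling steps mentioned in the abstract, embedding an expression vector into a 2-D image and reading the top genes back out of a per-class heat-map, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the image size or the gene ordering, so a square row-major layout with zero padding is assumed, and the helper names `embed_as_image` and `top_genes` are hypothetical. The heat-map itself would come from Guided Grad-CAM applied to the trained network, which is not reproduced here.

```python
import math


def embed_as_image(expression, side=None):
    """Lay a 1-D expression vector out row by row on a square grid.

    Assumption: row-major order with zero padding at the tail; the
    abstract does not specify the layout actually used.
    """
    n = len(expression)
    if side is None:
        # Smallest side length whose square holds all n genes.
        side = math.isqrt(n - 1) + 1 if n > 0 else 0
    if side * side < n:
        raise ValueError("side is too small to hold every gene")
    padded = list(expression) + [0.0] * (side * side - n)
    return [padded[r * side:(r + 1) * side] for r in range(side)]


def top_genes(heatmap, gene_names, k=10):
    """Return the k gene names with the highest heat-map intensity.

    The heat-map is flattened in the same row-major order used for the
    embedding, and the padding cells at the tail are discarded.
    """
    flat = [value for row in heatmap for value in row][:len(gene_names)]
    ranked = sorted(zip(gene_names, flat), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy example with five hypothetical genes.
    genes = ["G1", "G2", "G3", "G4", "G5"]
    image = embed_as_image([1.0, 2.0, 3.0, 4.0, 5.0])
    print(image)
    # A made-up heat-map with the same shape as the image.
    heat = [[0.1, 0.9, 0.3], [0.7, 0.2, 0.0], [0.0, 0.0, 0.0]]
    print(top_genes(heat, genes, k=2))
```

Because the heat-map shares the image's layout, each pixel maps back to exactly one gene, which is what lets a classification-time significance score be read as a per-gene ranking.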
Index Terms
- Deep Learning Based Tumor Type Classification Using Gene Expression Data