research-article

Deep Learning Based Tumor Type Classification Using Gene Expression Data

Authors:
Boyu Lyu, Virginia Tech, Blacksburg, VA, USA
Anamul Haque, Virginia Tech, Blacksburg, VA, USA

BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, August 2018, Pages 89–96. https://doi.org/10.1145/3233547.3233588

Published: 15 August 2018

ABSTRACT

Differential analysis is the most significant part of RNA-Seq analysis. Conventional differential-analysis methods usually compare tumor samples with normal samples from the same tumor type. Such methods fail to differentiate between tumor types because they lack knowledge of the other tumor types. The Pan-Cancer Atlas provides abundant information on 33 prevalent tumor types, which can be used as prior knowledge to generate tumor-specific biomarkers. In this paper, we embedded the high-dimensional RNA-Seq data into 2-D images and used a convolutional neural network to classify the 33 tumor types. The final accuracy was 95.59%. Furthermore, based on the idea of Guided Grad-CAM, we generated a significance heat-map over all genes for each class. Through functional analysis of the genes with high intensities in the heat-maps, we validated that these top genes are related to tumor-specific pathways and that some of them are already used as biomarkers, which supports the effectiveness of our method. As far as we know, we are the first to apply a convolutional neural network to the Pan-Cancer Atlas for tumor type classification, and also the first to use each gene's contribution to the classification as a measure of its importance for identifying candidate biomarkers. Our experimental results show that the method performs well and could also be applied to other genomics data.
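The embedding step and the gene ranking can be illustrated with a minimal sketch. The abstract does not say how genes are ordered or laid out in the image, so the zero-padding and row-major reshape below are assumptions, and the function names `embed_expression` and `top_genes_from_heatmap` are hypothetical. The heat-map is taken as a given array; computing it with Guided Grad-CAM is not shown.

```python
import math
import numpy as np


def embed_expression(expr):
    """Zero-pad a 1-D expression vector and reshape it into a square 2-D image.

    Assumption: genes are placed in row-major order; the paper's actual
    ordering and image size are not given in the abstract.
    """
    expr = np.asarray(expr, dtype=np.float32)
    n_genes = expr.shape[0]
    # Smallest square side that holds every gene.
    side = math.isqrt(n_genes)
    if side * side < n_genes:
        side += 1
    padded = np.zeros(side * side, dtype=np.float32)
    padded[:n_genes] = expr
    return padded.reshape(side, side)


def top_genes_from_heatmap(heatmap, gene_names, k=5):
    """Return the k gene names with the highest heat-map intensity.

    Pixels beyond len(gene_names) are padding and are ignored.
    """
    scores = np.asarray(heatmap).reshape(-1)[: len(gene_names)]
    order = np.argsort(scores)[::-1][:k]
    return [gene_names[i] for i in order]


# Toy example: 10 genes fit into a 4 x 4 image with 6 padding pixels.
image = embed_expression(np.arange(10))
names = [f"g{i}" for i in range(10)]  # placeholder gene names

heat = np.zeros((4, 4))
heat[0, 3] = 2.0  # pixel of gene index 3
heat[2, 1] = 1.0  # pixel of gene index 9
heat[3, 3] = 5.0  # padding pixel, must not be reported as a gene
top = top_genes_from_heatmap(heat, names, k=2)
```

Because the pixel-to-gene mapping is a plain reshape, a pixel at row `r` and column `c` corresponds to gene index `r * side + c`, which is what lets heat-map intensities be read back as per-gene scores.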

Index Terms

Computing methodologies → Machine learning → Machine learning approaches → Neural networks

Published in
BCB '18: Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2018
727 pages
ISBN: 9781450357944
DOI: 10.1145/3233547
General Chairs: Amarda Shehu (George Mason University, USA), Cathy Wu (University of Delaware, USA)
Program Chairs: Christina Boucher (University of Florida, USA), Jing Li (Case Western Reserve University, USA), Hongfang Liu (Mayo Clinic, USA), Mihai Pop (University of Maryland, USA)
Copyright © 2018 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
convolutional neural network
deep learning
pan-cancer atlas
tumor type classification
Acceptance Rates
BCB '18 Paper Acceptance Rate: 46 of 148 submissions, 31%
Overall Acceptance Rate: 254 of 885 submissions, 29%
