MDI-GPU: accelerating integrative modelling for genomic-scale data using GP-GPU computing

Samuel A. Mason; Faiz Sayyid; Paul D.W. Kirk; Colin Starr; David L. Wild

doi:10.1515/sagmb-2015-0055

Published by De Gruyter February 24, 2016

From the journal Statistical Applications in Genetics and Molecular Biology

https://doi.org/10.1515/sagmb-2015-0055

Abstract

The integration of multi-dimensional datasets remains a key challenge in systems biology and genomic medicine. Modern high-throughput technologies generate a broad array of different data types, providing distinct – but often complementary – information. However, the large amount of data adds to the burden of any inference task. Flexible Bayesian methods may reduce the necessity for strong modelling assumptions, but can also increase the computational burden. We present an improved implementation of a Bayesian correlated clustering algorithm that permits integrated clustering to be routinely performed across multiple datasets, each with tens of thousands of items. By exploiting GPU-based computation, we are able to improve runtime performance of the algorithm by almost four orders of magnitude. This permits analysis across genomic-scale datasets, greatly expanding the range of applications over those originally possible. MDI is available here: http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/.

Keywords: Bayesian; clustering; GPU

Corresponding author: David L. Wild, Systems Biology Centre, University of Warwick, Coventry, CV4 7AL, UK, e-mail: D.L.Wild@warwick.ac.uk

Supplemental Material:

The online version of this article (DOI: 10.1515/sagmb-2015-0055) offers supplementary material, available to authorized users.

Published Online: 2016-02-24

Published in Print: 2016-03-01
