DOI: 10.1145/2506583.2506605
tutorial

Cloud4SNP: Distributed Analysis of SNP Microarray Data on the Cloud

Published: 22 September 2013

ABSTRACT

Pharmacogenomics studies the impact of patients' genetic variation on drug responses and searches for correlations between gene expression or Single Nucleotide Polymorphisms (SNPs) in a patient's genome and the toxicity or efficacy of a drug. SNP data, produced by microarray platforms, need to be preprocessed and analyzed in order to find correlations between the presence or absence of SNPs and the toxicity or efficacy of a drug. Because of the large number of samples and the high resolution of the instruments, the data to be analyzed can be very large and require high-performance computing. The paper presents the design and experimental evaluation of Cloud4SNP, a novel Cloud-based bioinformatics tool for the parallel preprocessing and statistical analysis of pharmacogenomics SNP microarray data. The experimental evaluation shows good speed-up and scalability. Moreover, running on a Cloud platform lets the tool meet, in an elastic way, the requirements of both small and very large pharmacogenomics studies.
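
As a rough illustration of the kind of analysis the abstract describes, the sketch below tests each SNP for association between its presence or absence and a binary drug outcome, and spreads the per-SNP tests across workers. It is not the Cloud4SNP implementation. The choice of Fisher's exact test, the function names (`fisher_exact_two_sided`, `analyze_snp`, `analyze_all`), the input layout, and the toy data are all assumptions made for illustration, and a local thread pool merely stands in for Cloud workers.

```python
from concurrent.futures import ThreadPoolExecutor
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def analyze_snp(item):
    """Build the presence/outcome contingency table for one SNP and test it."""
    snp_id, calls, outcome = item
    pairs = list(zip(calls, outcome))
    a = sum(1 for g, o in pairs if g and o)          # SNP present, outcome observed
    b = sum(1 for g, o in pairs if g and not o)      # SNP present, no outcome
    c = sum(1 for g, o in pairs if not g and o)      # SNP absent, outcome observed
    d = sum(1 for g, o in pairs if not g and not o)  # SNP absent, no outcome
    return snp_id, fisher_exact_two_sided(a, b, c, d)


def analyze_all(snp_calls, outcome, workers=4):
    """Run the per-SNP tests independently; each SNP is one unit of work."""
    items = [(snp_id, calls, outcome) for snp_id, calls in snp_calls.items()]
    # Assumption: a thread pool stands in for distributed Cloud workers here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(analyze_snp, items))


# Hypothetical toy data: six patients, True = toxicity observed.
outcome = [True, True, True, False, False, False]
snp_calls = {
    "snp_A": [True, True, True, False, False, False],  # tracks the outcome exactly
    "snp_B": [True, False, True, False, True, False],  # roughly unrelated
}
p_values = analyze_all(snp_calls, outcome)
```

Because each SNP is tested independently of the others, the work splits cleanly into partitions, which is the property that makes this kind of analysis a natural fit for elastic scaling.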

      • Published in

        BCB'13: Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
        September 2013
        987 pages
        ISBN:9781450324342
        DOI:10.1145/2506583

        Copyright © 2013 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Qualifiers

        • tutorial
        • Research
        • Refereed limited

        Acceptance Rates

        BCB'13 Paper Acceptance Rate: 43 of 148 submissions, 29%
        Overall Acceptance Rate: 254 of 885 submissions, 29%
