survey

Scientific Workflows: Moving Across Paradigms

Published: 12 December 2016

Abstract

Modern scientific collaborations have opened up the opportunity to solve complex problems that require both multidisciplinary expertise and large-scale computational experiments. These experiments typically consist of a sequence of processing steps that need to be executed on selected computing platforms. Execution poses a challenge, however, due to (1) the complexity and diversity of applications, (2) the diversity of analysis goals, (3) the heterogeneity of computing platforms, and (4) the volume and distribution of data.

A common strategy to make these in silico experiments more manageable is to model them as workflows and to use a workflow management system to organize their execution. This article looks at the overall challenge posed by a new order of scientific experiments and the systems they need to be run on, and examines how this challenge can be addressed by workflows and workflow management systems. It proposes a taxonomy of workflow management system (WMS) characteristics, including aspects previously overlooked. This frames a review of prevalent WMSs used by the scientific community, elucidates their evolution to handle the challenges arising with the emergence of the “fourth paradigm,” and identifies research needed to maintain progress in this area.

  113. James Myers, Margaret Hedstrom, Dharma Akmon, Sandy Payette, Beth A. Plale, Inna Kouper, Scott McCaulay, Robert McDonald, Isuru Suriarachchi, and others. 2015. Towards sustainable curation and preservation. In Proc. e-Science’15. 526--535. Google ScholarGoogle ScholarDigital LibraryDigital Library
  114. Michael L. Norman and Allan Snavely. 2010. Accelerating data-intensive science with Gordon and Dash. In Proc. TG’10. ACM, New York, NY, USA, Article 14, 7 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  115. Thomas Oinn, Matthew Addis, Justin Ferris, Darren Marvin, Martin Senger, Mark Greenwood, Tim Carver, Kevin Glover, Matthew Pocock, Anil Wipat, and Peter Li. 2004. Taverna: A tool for the composition and enactment of bioinformatics workflows. Bioinformatics 20, 17 (Nov. 2004), 3045--3054. Google ScholarGoogle ScholarDigital LibraryDigital Library
  116. Tom Oinn, Mark Greenwood, Matthew Addis, M. Nedim Alpdemir, Justin Ferris, Kevin Glover, Carole Goble, Antoon Goderis, Duncan Hull, and others. 2006. Taverna: Lessons in creating a workflow environment for the life sciences. Concurrency and Computation: Practice and Experience 18, 10 (2006), 1067--1100. Google ScholarGoogle ScholarDigital LibraryDigital Library
  117. Tom Oinn, Peter Li, Douglas B. Kell, Carole Goble, Antoon Goderis, Mark Greenwood, Duncan Hull, Robert Stevens, Daniele Turi, and Jun Zhao. 2007. Taverna/myGrid: Aligning a workflow system with the life sciences community. In Workflows for e-Science: Scientific Workflows for Grids, Ian J. Taylor, Ewa Deelman, Dennis B. Gannon, and Matthew Shields (Eds.). Springer London, 300--319.Google ScholarGoogle Scholar
  118. Ioan Raicu, Yong Zhao, Catalin Dumitrescu, Ian Foster, and Mike Wilde. 2007. Falkon: A fast and light-weight tasK executiON framework. In Proc. SC’07. ACM, New York, NY, USA, Article 43, 12 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  119. Christopher Rawlings. 2014. Big data in the agricultural and ecological sciences — a growing challenge. Keynote EGI Community Forum 2014. (May 2014).Google ScholarGoogle Scholar
  120. A. T. Ringler, M. T. Hagerty, J. Holland, A. Gonzales, L. S. Gee, J. D. Edwards, D. Wilson, and A. M. Baker. 2015. The data quality analyzer: A quality control program for seismic data. Computers 8 Geosciences 76 (2015), 96--111. Google ScholarGoogle ScholarDigital LibraryDigital Library
  121. David Rogers, Ian Harvey, Tram Truong Huu, Kieran Evans, Tristan Glatard, Ibrahim Kallel, Ian Taylor, Johan Montagnat, Andrew Jones, and Andrew Harrison. 2013. Bundle and pool architecture for multi-language, robust, scalable workflow executions. Journal of Grid Computing 11, 3 (2013), 457--480. Google ScholarGoogle ScholarDigital LibraryDigital Library
  122. John W. Romein, Jan David Mol, Rob V. van Nieuwpoort, and P. Chris Broekema. 2011. Processing LOFAR telescope data in real time on a blue Gene/P supercomputer. In General Assembly and Scientific Symposium, 2011 XXXth URSI. 1--4.Google ScholarGoogle Scholar
  123. Susanna-Assunta Sansone, Philippe Rocca-Serra, Dawn Field, Eamonn Maguire, Chris Taylor, Oliver Hofmann, Hong Fang, Steffen Neumann, Weida Tong, and others. 2012. Toward interoperable bioscience data. Nat. Genet. 44, 2 (02 2012), 121--126.Google ScholarGoogle Scholar
  124. Idafen Santana-Perez, Rafael Ferreira da Silva, Mats Rynge, Ewa Deelman, María S. Pérez-Hernández, and Oscar Corcho. 2016. Reproducibility of execution environments in computational science using Semantics and Clouds. Future Gener. Comput. Syst. 67 (2016), 354--367.Google ScholarGoogle ScholarCross RefCross Ref
  125. Matthew Shields. 2007. Control- versus data-driven workflows. In Workflows for e-Science: Scientific Workflows for Grids, Ian J. Taylor, Ewa Deelman, Dennis B. Gannon, and Matthew Shields (Eds.). Springer London, 167--173.Google ScholarGoogle Scholar
  126. Yogesh L. Simmhan, Roger Barga, Catharine van Ingen, Ed Lazowska, and Alex Szalay. 2009. Building the trident scientific workflow workbench for data management in the cloud. In Proc. 3rd International Conference on Advanced Engineering Computing and Applications in Sciences (ADVCOMP’09). 41--50. Google ScholarGoogle ScholarDigital LibraryDigital Library
  127. Aleksander Slominski. 2007. Adapting BPEL to scientific workflows. In Workflows for e-Science: Scientific Workflows for Grids, Ian J. Taylor, Ewa Deelman, Dennis B. Gannon, and Matthew Shields (Eds.). Springer London, 208--226.Google ScholarGoogle Scholar
  128. Alessandro Spinuso, Rosa Fligueira, Malcolm Atkinson, and Andre Gemuend. 2016. Visualisation methods for large provenance collections in data-intensive collaborative platforms. In Geophysical Research Abstracts - EGU General Assembly 2016, Vol. 18.Google ScholarGoogle Scholar
  129. Sudarshan Srinivasan, Gideon Juve, Rafael Ferreira da Silva, Karan Vahi, and Ewa Deelman. 2014. A cleanup algorithm for implementing storage constraints in scientific workflow executions. In Proc. WORKS’14. IEEE Press, 41--49. Google ScholarGoogle ScholarDigital LibraryDigital Library
  130. Tiberiu Stef-Praun, Benjamin Clifford, Ian Foster, Uri Hasson, Mihael Hategan, Steven L. Small, Michael Wilde, and Yong Zhao. 2007. Accelerating medical research using the swift workflow system. Studies in Health Technology and Informatics 126 (2007), 207--216.Google ScholarGoogle Scholar
  131. Michael Stonebraker, Jacek Becla, David J. DeWitt, Kian-Tat Lim, David Maier, Oliver Ratzesberger, and Stanley B. Zdonik. 2009. Requirements for science data bases and SciDB. In Proc. Biennial Conference on Innovative Data Systems Research (CIDR’09).Google ScholarGoogle Scholar
  132. Michael Stonebraker, Paul Brown, Donghui Zhang, and Jacek Becla. 2013. SciDB: A database management system for applications with complex analytics. Computing in Science 8 Engineering 15, 3 (2013), 54--62. Google ScholarGoogle ScholarDigital LibraryDigital Library
  133. Ian Taylor, Matthew Shields, Ian Wang, and Andrew Harrison. 2007a. The Triana workflow environment: Architecture and applications. In Workflows for e-Science: Scientific Workflows for Grids, Ian J. Taylor, Ewa Deelman, Dennis B. Gannon, and Matthew Shields (Eds.). Springer London, 320--339.Google ScholarGoogle Scholar
  134. Ian J. Taylor, Ewa Deelman, Dennis B. Gannon, and Matthew Shields. 2007b. Workflows for e-Science: Scientific workflows for grids. Springer London. Google ScholarGoogle ScholarDigital LibraryDigital Library
  135. Gabor Terstyánszky, Edward Michniak, Tamás Kiss, and Ákos Balaskó. 2014. Sharing science gateway artefacts through repositories. In Science Gateways for Distributed Computing Infrastructures: Development Framework and Exploitation by Scientific User Communities. Springer, Chapter 9, 123--135.Google ScholarGoogle Scholar
  136. Douglas Thain, Todd Tannenbaum, and Miron Livny. 2005. Distributed computing in practice: The Condor experience. Concurrency and Computation: Practice and Experience 17, 2-4 (2005), 323--356. Google ScholarGoogle ScholarDigital LibraryDigital Library
  137. Thomas D. Uram, Michael E. Papka, Mark Hereld, and Michael Wilde. 2011. A solution looking for lots of problems: generic portals for science infrastructure. In Proc. TG’11. ACM, New York, NY, USA, Article 44, 7 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  138. Wil M. P. van der Aalst and Arthur H. M. ter Hofstede. 2014. Workflow Patterns. http://www.workflowpatterns.com. (2014).Google ScholarGoogle Scholar
  139. Wil M. P. van der Aalst, Arthur H. M. ter Hofstede, B. Kiepuszewski, and A. P. Barros. 2003. Workflow Patterns. Distributed and Parallel Databases 14, 1 (July 2003), 5--51. Google ScholarGoogle ScholarDigital LibraryDigital Library
  140. Jens Vöckler, Gaurang Mehta, Yong Zhao, Ewa Deelman, and Michael Wilde. 2006. Kickstarting Remote Applications. In Second International Workshop on Grid Computing Environments.Google ScholarGoogle Scholar
  141. Gregor von Laszewski and Mike Hategan. 2005. Workflow Concepts of the Java CoG Kit. Journal of Grid Computing 3, 3 (Sept. 2005), 239--258.Google ScholarGoogle ScholarCross RefCross Ref
  142. Chip Walter. 2005. Kryder’s Law: The doubling of processor speed every 18 months is a snail’s pace compared with rising hard-disk capacity, and Mark Kryder plans to squeeze in even more bits. Scientific American (August 2005), 32--33.Google ScholarGoogle Scholar
  143. Hongbing Wang, Joshua Zhexue Huang, Yuzhong Qu, and Junyuan Xie. 2004. Web services: Problems and future directions. Web Semantics: Science, Services and Agents on the World Wide Web 1, 3 (April 2004), 309--320.Google ScholarGoogle ScholarCross RefCross Ref
  144. Marek Wieczorek, Andreas Hoheisel, and Radu Prodan. 2009. Towards a general model of the multi-criteria workflow scheduling on the grid. Future Gener. Comput. Syst. 25, 3 (March 2009), 237--256. Google ScholarGoogle ScholarDigital LibraryDigital Library
  145. Michael Wilde, Ian Foster, Kamil Iskra, Pete Beckman, Zhao Zhang, Allan Espinosa, Mihael Hategan, Ben Clifford, and Ioan Raicu. 2009. Parallel scripting for applications at the petascale and beyond. Computer 42, 11 (Nov. 2009), 50--60. Google ScholarGoogle ScholarDigital LibraryDigital Library
  146. Matthew Woitaszek, John M. Dennis, and Taleena R. Sine. 2011. Parallel high-resolution climate data analysis using swift. In Proc. ACM International Workshop on Many Task Computing on Grids and Supercomputers (MTAGS’11). ACM, New York, NY, USA, 5--14. Google ScholarGoogle ScholarDigital LibraryDigital Library
  147. Katherine Wolstencroft, Robert Haines, Donal Fellows, Alan Williams, David Withers, Stuart Owen, Stian Soiland-Reyes, Ian Dunlop, Aleksandra Nenadic, and others. 2013. The Taverna workflow suite: Designing and executing workflows of Web Services on the desktop, web or in the cloud. Nucleic Acids Research 41, W1 (2013), W557--W561.Google ScholarGoogle ScholarCross RefCross Ref
  148. Justin M. Wozniak, Timothy G. Armstrong, Ketan Maheshwari, Ewing L. Lusk, Daniel S. Katz, Michael Wilde, and Ian T. Foster. 2013a. Turbine: A distributed-memory dataflow engine for high performance many-task applications. Fundamenta Informaticae 128, 3 (01 2013), 337--366. Google ScholarGoogle ScholarDigital LibraryDigital Library
  149. Justin M. Wozniak, Timothy G. Armstrong, Michael Wilde, Daniel S. Katz, Ewing Lusk, and Ian T. Foster. 2013b. Swift/T: Large-scale application composition via distributed-memory dataflow processing. In Proc. IEEE/ACM CCGRID’13. 95--102.Google ScholarGoogle Scholar
  150. Wenjun Wu, Thomas Uram, Michael Wilde, Mark Hereld, and Michael E. Papka. 2010. Accelerating science gateway development with Web 2.0 and Swift. In Proc. TG’10. ACM, New York, NY, USA, Article 23, 7 pages. Google ScholarGoogle ScholarDigital LibraryDigital Library
  151. Youngik Yang, Jong Youl Choi, Chathura Herath, Suresh Marru, and Sun Kim. 2010. Biovlab:Bioinformatics data analysis using cloud computing and graphical workflow composers. In Cloud Computing and Software Services: Theory and Techniques, Syed A. Ahson and Mohammad Ilyas (Eds.). Number 309-327. CRC Press, Inc.Google ScholarGoogle Scholar
  152. Jia Yu and Rajkumar Buyya. 2005. A taxonomy of workflow management systems for grid computing. Journal of Grid Computing 3, 3--4 (Sept. 2005), 171--200.Google ScholarGoogle ScholarCross RefCross Ref
  153. Yong Zhao, Mihael Hategan, Ben Clifford, Ian Foster, Gregor von Laszewski, Veronika Nefedova, Ioan Raicu, Tiberiu Stef-Praun, and Michael Wilde. 2007. Swift: Fast, reliable, loosely coupled parallel computation. In Proc. IEEE SERVICES’07. IEEE Computer Society, 199--206.Google ScholarGoogle ScholarCross RefCross Ref
  154. Yong Zhao, Youfu Li, Ioan Raicu, Shiyong Lu, Wenhong Tian, and Heng Liu. 2015. Enabling scalable scientific workflow management in the Cloud. Future Gener. Comput. Syst. 46 (2015), 3--16. Google ScholarGoogle ScholarDigital LibraryDigital Library
  155. Zhiming Zhao, Paola Grosso, Jeroen van der Ham, Ralph Koning, and Cees de Laat. 2011. An agent based network resource planner for workflow applications. Multiagent and Grid Systems 7, 6 (2011), 187--202. Google ScholarGoogle ScholarDigital LibraryDigital Library
  156. Daniel Zinn, Quinn Hart, Timothy McPhillips, Bertram Ludäscher, Yogesh Simmhan, Michail Giakkoupis, and Viktor K. Prasanna. 2011. Towards reliable, performant workflows for streaming-applications on cloud platforms. In Proc. IEEE/ACM CCGRID’11. 235--244. Google ScholarGoogle ScholarDigital LibraryDigital Library

            • Published in

              ACM Computing Surveys  Volume 49, Issue 4
              December 2017
              666 pages
              ISSN:0360-0300
              EISSN:1557-7341
              DOI:10.1145/3022634
              • Editor: Sartaj Sahni

              Copyright © 2016 ACM

              © 2016 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

              Publisher

              Association for Computing Machinery

              New York, NY, United States

              Publication History

              • Published: 12 December 2016
              • Accepted: 1 September 2016
              • Revised: 1 May 2016
              • Received: 1 March 2015

              Qualifiers

              • survey
              • Research
              • Refereed
