DOI: 10.1145/2666158.2666174
research-article

Optimization of Data-intensive Flows: Is it Needed? Is it Solved?

Published: 7 November 2014

ABSTRACT

Modern data analysis increasingly employs data-intensive flows to process very large volumes of data. As data flows become more complex and operate in highly dynamic environments, we argue that automated cost-based optimization is needed, rather than reliance on efficient designs crafted by human experts. We further demonstrate that the current state of the art in flow optimization needs to be extended, and we propose a promising direction for optimizing flows at the logical level, specifically for deciding the sequence of flow tasks.

    • Published in

      DOLAP '14: Proceedings of the 17th International Workshop on Data Warehousing and OLAP
      November 2014
      110 pages
      ISBN: 9781450309998
      DOI: 10.1145/2666158

      Copyright © 2014 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      DOLAP '14 paper acceptance rate: 8 of 22 submissions, 36%. Overall acceptance rate: 29 of 79 submissions, 37%.
