DOI: 10.1145/3394486.3403255
research-article

Learning Based Distributed Tracking

Published: 20 August 2020

ABSTRACT

Inspired by the success of machine learning over the past decade, researchers have been exploring whether theoretical results can be improved by exploiting the data distribution. In this paper, we revisit a fundamental problem called Distributed Tracking (DT) under the assumption that the data follows a certain (known or unknown) distribution, and propose a number of data-dependent algorithms with improved theoretical bounds. Informally, in the DT problem, there is a coordinator and k players, where the coordinator holds a threshold N and each player has a counter. At each time stamp, at most one counter can be increased by one. The job of the coordinator is to capture the exact moment when the sum of all these k counters reaches N. The goal is to minimise the communication cost. While our first type of algorithms assumes that the concrete data distribution is known in advance, our second type of algorithms can learn the distribution on the fly. Both types of algorithms achieve a communication cost bounded by O(k log log N) with high probability, improving the state-of-the-art data-independent bound O(k log(N/k)). We further propose a number of implementation optimisation heuristics to improve both the efficiency and the robustness of the algorithms. Finally, we conduct extensive experiments on three real datasets and four synthetic datasets. The experimental results show that the communication cost of our algorithms is as low as 20% of that of the state-of-the-art algorithms.
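To make the problem statement concrete, the following is a minimal simulation sketch of the DT setting together with a simple round-based, data-independent baseline. It is not the learning-based algorithm proposed in the paper. The function name `track_threshold`, the representation of the stream as a list of player indices, and the message accounting (one message per signal, 3k per round for polling and re-broadcasting, k for the initial broadcast) are all assumptions made for illustration.

```python
import random


def track_threshold(stream, k, N):
    """Simulate a round-based baseline for Distributed Tracking.

    stream: iterable of player indices in [0, k); each item is one increment.
    Returns (t, messages), where t is the 1-based index of the increment at
    which the sum of all counters reaches N (None if it never does).
    Assumes k >= 1 and N >= 1.
    """
    counters = [0] * k            # true local counters held by the players
    remaining = N                 # N minus the sum known to the coordinator
    slack = remaining // (2 * k)  # per-player slack; 0 means direct reporting
    pending = [0] * k             # increments since the player's last signal
    signals = 0                   # signals received in the current round
    messages = k                  # assumed cost of broadcasting the first slack

    for t, player in enumerate(stream, start=1):
        counters[player] += 1
        if slack == 0:
            # Final phase: fewer than 2k increments are left, so every
            # increment is reported and the coordinator counts down exactly.
            messages += 1
            remaining -= 1
            if remaining == 0:
                return t, messages
            continue
        pending[player] += 1
        if pending[player] == slack:
            pending[player] = 0
            signals += 1
            messages += 1
            if signals == k:
                # End of round: the coordinator polls all players (k requests
                # and k replies) and broadcasts the new slack (k messages).
                messages += 3 * k
                remaining = N - sum(counters)
                slack = remaining // (2 * k)
                pending = [0] * k
                signals = 0
    return None, messages


if __name__ == "__main__":
    rng = random.Random(0)
    k, N = 8, 10_000
    stream = [rng.randrange(k) for _ in range(2 * N)]
    t, messages = track_threshold(stream, k, N)
    print(f"threshold reached at increment {t} using {messages} messages")
```

Within a round, fewer than 2k times the slack increments can occur before the k-th signal, so the sum cannot pass N unnoticed, and each round shrinks the remaining gap by a constant factor. Under these assumptions the sketch uses O(k) messages per round over O(log(N/k)) rounds, which is the kind of data-independent behaviour the abstract contrasts against.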

Supplemental Material

3394486.3403255.mp4 (mp4, 276.6 MB)


Published in

KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
August 2020
3664 pages
ISBN: 9781450379984
DOI: 10.1145/3394486

Copyright © 2020 ACM


Publisher: Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
