
Cost and Quality in Crowdsourcing Workflows

  • Conference paper
  • In: Application and Theory of Petri Nets and Concurrency (PETRI NETS 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12734)

Abstract

Crowdsourcing platforms provide tools to replicate and distribute micro-tasks (simple, independent work units) to crowds and to assemble the results. However, real-life problems are often complex: they require collecting, organizing, or transforming data under quality and cost constraints. This work considers dynamic realization policies for complex crowdsourcing tasks. Workflows provide a way to organize a complex task into phases and to guide its realization. The challenge is then to deploy a workflow on a crowd, i.e., to allocate workers to phases so that the overall workflow terminates with good accuracy of results and at a reasonable cost. Standard “static” allocation of work in crowdsourcing assigns a fixed number of workers to each micro-task and aggregates their results. We define new dynamic worker allocation techniques that consider progress in the workflow, the quality of synthesized data, and the remaining budget. Evaluation on a benchmark shows that dynamic approaches outperform static ones in terms of both cost and accuracy.
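To make the static-versus-dynamic contrast concrete, here is a minimal Python sketch. It is not the paper's model: the Worker class, its accuracy and cost fields, and the majority-vote agreement ratio used as a stopping criterion are all hypothetical assumptions for illustration. A static policy buys a fixed number of answers per micro-task; a dynamic one keeps buying answers only while the aggregated answer still looks uncertain and budget remains.

```python
# Illustrative sketch only -- not the authors' model. Worker, accuracy,
# cost, and the majority-vote confidence proxy are hypothetical.
import random
from collections import Counter
from dataclasses import dataclass

@dataclass
class Worker:
    accuracy: float    # probability of answering a binary task correctly
    cost: float = 1.0  # price paid per answer

    def answer(self, truth: bool) -> bool:
        return truth if random.random() < self.accuracy else not truth

def static_policy(truth, crowd, k=5):
    """Static allocation: always collect k answers, then majority-vote."""
    answers = [w.answer(truth) for w in crowd[:k]]
    label, _ = Counter(answers).most_common(1)[0]
    return label, sum(w.cost for w in crowd[:k])

def dynamic_policy(truth, crowd, budget=10.0, target=0.9):
    """Dynamic allocation: stop hiring as soon as the current majority
    looks confident enough, or when the budget runs out."""
    answers, spent, label = [], 0.0, None
    for w in crowd:
        if spent + w.cost > budget:
            break
        answers.append(w.answer(truth))
        spent += w.cost
        label, votes = Counter(answers).most_common(1)[0]
        if len(answers) >= 3 and votes / len(answers) >= target:
            break  # majority already convincing: save the remaining budget
    return label, spent

crowd = [Worker(accuracy=0.75) for _ in range(20)]
print(static_policy(True, crowd))   # fixed cost, fixed number of answers
print(dynamic_policy(True, crowd))  # cost adapts to observed agreement
```

In the paper's setting, the stopping decision additionally depends on the workflow's current phase and on an estimate of the quality of the synthesized data, rather than on this naive agreement ratio.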

Work supported by the ANR project Headwork.




Author information

Correspondence to Loïc Hélouët.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Hélouët, L., Miklos, Z., Singh, R. (2021). Cost and Quality in Crowdsourcing Workflows. In: Buchs, D., Carmona, J. (eds) Application and Theory of Petri Nets and Concurrency. PETRI NETS 2021. Lecture Notes in Computer Science, vol. 12734. Springer, Cham. https://doi.org/10.1007/978-3-030-76983-3_3

  • DOI: https://doi.org/10.1007/978-3-030-76983-3_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76982-6

  • Online ISBN: 978-3-030-76983-3

  • eBook Packages: Computer Science (R0)
