Abstract
Crowdsourcing platforms provide tools to replicate and distribute micro-tasks (simple, independent work units) to crowds and assemble the results. However, real-life problems are often complex: they require collecting, organizing, or transforming data under quality and cost constraints. This work considers dynamic realization policies for complex crowdsourcing tasks. Workflows provide ways to organize a complex task into phases and guide its realization. The challenge is then to deploy a workflow on a crowd, i.e., allocate workers to phases so that the overall workflow terminates, with good accuracy of results and at a reasonable cost. Standard "static" allocation of work in crowdsourcing assigns a fixed number of workers to each micro-task and then aggregates their results. We define new dynamic worker allocation techniques that consider progress in a workflow, quality of synthesized data, and remaining budget. Evaluation on a benchmark shows that dynamic approaches outperform static ones in terms of cost and accuracy.
Work supported by the ANR Headwork project.
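The contrast between static and dynamic allocation can be illustrated with a small simulation. The Python sketch below is not the paper's algorithm: it assumes a single binary micro-task, workers of identical known accuracy, and hypothetical helper names (simulate_worker, static_allocation, dynamic_allocation); the dynamic policies described in the paper additionally take workflow progress across phases and the global remaining budget into account.

```python
import random

def simulate_worker(truth, accuracy=0.7):
    """A simulated worker answers correctly with probability `accuracy`."""
    return truth if random.random() < accuracy else 1 - truth

def static_allocation(truth, k=5):
    """Static policy: always collect k answers, then take a majority vote."""
    answers = [simulate_worker(truth) for _ in range(k)]
    return int(sum(answers) > k / 2), k  # (aggregated label, cost in answers)

def dynamic_allocation(truth, confidence_target=0.9, max_budget=9, accuracy=0.7):
    """Dynamic policy: stop asking workers as soon as the estimated confidence
    in the current majority label reaches the target, or the budget is spent."""
    votes = []
    for cost in range(1, max_budget + 1):
        votes.append(simulate_worker(truth, accuracy))
        ones = sum(votes)
        zeros = len(votes) - ones
        # Posterior of the leading label under a uniform prior, assuming
        # independent workers of known, symmetric accuracy.
        lead, trail = max(ones, zeros), min(ones, zeros)
        ratio = (accuracy / (1 - accuracy)) ** (lead - trail)
        confidence = ratio / (ratio + 1)
        if confidence >= confidence_target:
            return int(ones > zeros), cost
    return int(sum(votes) > len(votes) / 2), max_budget

if __name__ == "__main__":
    random.seed(1)
    trials = 2000
    for name, policy in [("static", static_allocation), ("dynamic", dynamic_allocation)]:
        correct, spent = 0, 0
        for _ in range(trials):
            truth = random.randint(0, 1)
            label, cost = policy(truth)
            correct += (label == truth)
            spent += cost
        print(f"{name:7s} accuracy={correct / trials:.3f} avg cost={spent / trials:.2f}")
```

Under these simplifying assumptions, the dynamic policy stops early on easy (unanimous) tasks and spends its budget only where worker answers disagree, which is the intuition behind the cost and accuracy gains reported in the paper.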
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Hélouët, L., Miklos, Z., Singh, R. (2021). Cost and Quality in Crowdsourcing Workflows. In: Buchs, D., Carmona, J. (eds) Application and Theory of Petri Nets and Concurrency. PETRI NETS 2021. Lecture Notes in Computer Science, vol. 12734. Springer, Cham. https://doi.org/10.1007/978-3-030-76983-3_3
DOI: https://doi.org/10.1007/978-3-030-76983-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-76982-6
Online ISBN: 978-3-030-76983-3
eBook Packages: Computer Science (R0)