DOI: 10.1145/3308561.3353777
research-article

Automatic Generation and Evaluation of Usable and Secure Audio reCAPTCHA

Published: 24 October 2019

ABSTRACT

CAPTCHAs are challenge-response tests that differentiate humans from automated agents using tasks that are easy for humans but difficult for computers. The most common CAPTCHAs require humans to decipher characters from an image and are therefore unsuitable for visually impaired people. Audio CAPTCHAs, which require deciphering spoken digits or letters, were proposed as an alternative. However, current audio CAPTCHAs suffer from low usability and are insecure against Automatic Speech Recognition (ASR) attacks. In this work, we propose reCAPGen, a system that uses ASR to generate secure CAPTCHAs. We evaluated four audio CAPTCHA schemes with 60 sighted and 19 visually impaired participants. We found that our proposed Last Two Words scheme was the most usable, with a success rate of >78.2% and a low response time of <14.5s. Furthermore, solutions to our audio CAPTCHAs can be used to transcribe unknown words with >82% accuracy.
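The abstract names a Last Two Words scheme but does not say how a response is checked. As a rough illustration only, the sketch below assumes the user hears a spoken clip and types its final two words, and that the server compares them with a known transcript while tolerating small typos. The function names, the one-edit tolerance, and the use of Levenshtein distance are all assumptions made for this example, not details taken from the paper.

```python
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def check_last_two_words(transcript: str, response: str, max_edits: int = 1) -> bool:
    """Hypothetical check: accept if the response matches the transcript's
    last two words, allowing up to `max_edits` character edits per word
    (an assumed tolerance, not a value from the paper)."""
    expected = normalize(transcript)[-2:]
    given = normalize(response)
    if len(expected) < 2 or len(given) != 2:
        return False
    return all(edit_distance(e, g) <= max_edits for e, g in zip(expected, given))
```

For instance, with the transcript "The quick brown fox jumps over the lazy dog", the response "Lazy dog." would be accepted under these assumptions, while "quick fox" would be rejected. A real deployment would have to tune the tolerance against both usability and resistance to ASR attacks.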


Published in

ASSETS '19: Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility
October 2019
730 pages
ISBN: 9781450366762
DOI: 10.1145/3308561

Copyright © 2019 ACM


Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

ASSETS '19 Paper Acceptance Rate: 41 of 158 submissions, 26%
Overall Acceptance Rate: 436 of 1,556 submissions, 28%
