research-article

Automatic Generation and Evaluation of Usable and Secure Audio reCAPTCHA

Authors:
Mohit Jain, Microsoft Research, Bangalore, India
Rohun Tripathi, Amazon GO, Seattle, WA, USA
Ishita Bhansali, Infosys, Bangalore, India
Pratyush Kumar, Indian Institute of Technology Madras, Chennai, India

ASSETS '19: Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility, October 2019, Pages 355–366. https://doi.org/10.1145/3308561.3353777

Published: 24 October 2019

ABSTRACT

CAPTCHAs are challenge-response tests that differentiate humans from automated agents using tasks that are easy for humans but difficult for computers. The most common CAPTCHAs require humans to decipher characters from an image and are therefore unsuitable for visually impaired people. Audio CAPTCHAs, which require deciphering spoken digits or letters, were proposed as an alternative. However, current audio CAPTCHAs suffer from low usability and are insecure against Automatic Speech Recognition (ASR) attacks. In this work, we propose reCAPGen, a system that uses ASR for generating secure CAPTCHAs. We evaluated four audio CAPTCHA schemes with 60 sighted and 19 visually impaired participants. We found that our proposed Last Two Words scheme was the most usable, with a success rate of >78.2% and a low response time of <14.5s. Furthermore, solving our audio CAPTCHAs yields transcriptions of unknown words with >82% accuracy.

Index Terms

1. Human-centered computing
  1. Accessibility

Published in
ASSETS '19: Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility
October 2019
730 pages
ISBN:9781450366762
DOI:10.1145/3308561
General Chair:
Jeffrey P. Bigham, Carnegie Mellon University, USA
Program Chairs:
Shiri Azenkot, Cornell Tech, USA
Shaun K. Kane, University of Colorado Boulder, USA
Copyright © 2019 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
blind
captcha
evaluation
mturk
visually impaired
Acceptance Rates
ASSETS '19 Paper Acceptance Rate: 41 of 158 submissions, 26%
Overall Acceptance Rate: 436 of 1,556 submissions, 28%