ABSTRACT
CAPTCHAs are challenge-response tests that differentiate humans from automated agents using tasks that are easy for humans but difficult for computers. The most common CAPTCHAs require users to decipher characters from an image, which makes them unsuitable for visually impaired people. Audio CAPTCHAs, which require deciphering spoken digits or letters, were proposed as an alternative. However, current audio CAPTCHAs suffer from low usability and are insecure against Automatic Speech Recognition (ASR) attacks. In this work, we propose reCAPGen, a system that uses ASR to generate secure CAPTCHAs. We evaluated four audio CAPTCHA schemes with 60 sighted and 19 visually impaired participants. We found that our proposed Last Two Words scheme was the most usable, with a success rate of >78.2% and a low response time of <14.5 s. Furthermore, the responses users give when solving our audio CAPTCHAs can be used to transcribe unknown words with >82% accuracy.
Index Terms
- Automatic Generation and Evaluation of Usable and Secure Audio reCAPTCHA