ABSTRACT
The lack of access to visual information like text labels, icons, and colors can cause frustration and decrease independence for blind people. Current access technology uses automatic approaches to address some problems in this space, but the technology is error-prone, limited in scope, and quite expensive. In this paper, we introduce VizWiz, a talking application for mobile phones that offers a new alternative for answering visual questions in nearly real time: asking multiple people on the web. To support answering questions quickly, we introduce quikTurkit, a general approach for intelligently recruiting human workers in advance so that workers are available when new questions arrive. A field deployment with 11 blind participants illustrates that blind people can effectively use VizWiz to cheaply answer questions in their everyday lives, highlighting issues that automatic approaches will need to address to be useful. Finally, we illustrate the potential of using VizWiz as part of the participatory design of advanced tools by using it to build and evaluate VizWiz::LocateIt, an interactive mobile tool that helps blind people solve general visual search problems.
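The core idea behind quikTurkit, as described above, is to recruit crowd workers before a question arrives so that a warm pool is already waiting when one does. The minimal sketch below illustrates that scheduling idea only; the class and method names (`WorkerPool`, `workers_to_recruit`, `ask`) are hypothetical and do not reflect the paper's actual implementation or the Mechanical Turk API.

```python
import collections


class WorkerPool:
    """Toy sketch of pre-recruitment: keep enough workers 'warm' that a
    new question can usually be answered immediately. Illustrative only."""

    def __init__(self, target_pool_size=3):
        self.target_pool_size = target_pool_size
        self.idle_workers = collections.deque()       # pre-recruited, waiting
        self.pending_questions = collections.deque()  # questions with no worker yet

    def workers_to_recruit(self):
        # Post recruitment tasks whenever the warm pool falls short of the
        # target plus any backlog of unanswered questions.
        shortfall = (self.target_pool_size + len(self.pending_questions)
                     - len(self.idle_workers))
        return max(0, shortfall)

    def worker_arrives(self, worker_id):
        self.idle_workers.append(worker_id)

    def ask(self, question):
        # If a pre-recruited worker is waiting, pair them with the question
        # right away; otherwise queue the question until one arrives.
        if self.idle_workers:
            return self.idle_workers.popleft(), question
        self.pending_questions.append(question)
        return None, question
```

In this simplified model, latency hinges on `workers_to_recruit` being polled often enough that the pool rarely empties; the paper's actual system additionally keeps recruited workers engaged (for example, on older questions) until a new one arrives.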