DOI: 10.1145/1866029.1866080
research-article

VizWiz: nearly real-time answers to visual questions

Published: 03 October 2010

ABSTRACT

The lack of access to visual information like text labels, icons, and colors can cause frustration and decrease independence for blind people. Current access technology uses automatic approaches to address some problems in this space, but the technology is error-prone, limited in scope, and quite expensive. In this paper, we introduce VizWiz, a talking application for mobile phones that offers a new alternative for answering visual questions in nearly real time: asking multiple people on the web. To support answering questions quickly, we introduce quikTurkit, a general approach for intelligently recruiting human workers in advance so that workers are available when new questions arrive. A field deployment with 11 blind participants illustrates that blind people can effectively use VizWiz to cheaply answer questions in their everyday lives, highlighting issues that automatic approaches will need to address to be useful. Finally, we illustrate the potential of using VizWiz as part of the participatory design of advanced tools by using it to build and evaluate VizWiz::LocateIt, an interactive mobile tool that helps blind people solve general visual search problems.


    • Published in

      UIST '10: Proceedings of the 23rd annual ACM symposium on User interface software and technology
      October 2010
      476 pages
      ISBN: 9781450302715
      DOI: 10.1145/1866029

      Copyright © 2010 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate: 842 of 3,967 submissions, 21%
