VTQuest: a voice-based multimodal web-based software system for maps and directions

ABSTRACT
Finding one's way around a large university campus can be difficult. We developed VTQuest (http://sunfish.cs.vt.edu/VTQuestV), a web-based software system that addresses this problem for the campus of Virginia Tech (http://www.vt.edu/). VTQuest enables (a) multimodal interaction with voice, mouse, and keyboard; (b) browsing the campus map; (c) locating a building by name, abbreviation, category, or within a specified distance on the campus map; (d) locating a room on the floor plan of a building; and (e) obtaining walking directions from one building to another. VTQuest provides these capabilities for 103 buildings, with floor plans for most of them. VTQuest is engineered on the Java 2 Platform, Enterprise Edition (J2EE) using Scalable Vector Graphics (SVG) and Speech Application Language Tags (SALT). SVG enables zooming into the maps without loss of image quality. The voice interface offers a variety of features, including an extensive grammar and out-of-turn interaction.
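To illustrate why SVG suits zoomable maps, a minimal sketch of a vector building footprint follows; the building name, coordinates, and colors are hypothetical and are not taken from VTQuest:

```xml
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 400 300">
  <!-- A building footprint drawn as vector geometry: shrinking the
       viewBox (e.g., to "120 80 140 90") zooms in on the building,
       and the shapes re-render crisply at any scale, with none of
       the pixelation a raster map image would show. -->
  <rect x="120" y="80" width="140" height="90"
        fill="#c8b18b" stroke="#5c4a1e" stroke-width="2"/>
  <text x="190" y="130" text-anchor="middle" font-size="14">
    Example Hall
  </text>
</svg>
```

Because zooming only changes the viewport transform rather than resampling pixels, a single SVG campus map can serve every zoom level.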