Comparison of voice acquisition methodologies in speech research

Vogel, Adam P.; Maruff, Paul

doi:10.3758/BRM.40.4.982

Published: November 2008

Behavior Research Methods, Volume 40, pages 982–987 (2008)

Abstract

The use of voice acoustic techniques has the potential to extend beyond work devoted purely to speech or vocal pathology. For this to occur, however, researchers and clinicians will require acquisition technologies that provide fast, accurate, and cost-effective methods for recording data. Therefore, the present study aimed to compare industry-standard techniques for acquiring high-quality acoustic signals (e.g., hard drive and solid-state recorder) with widely available and easy-to-use, computer-based (standard laptop) data-acquisition methods. Speech samples were simultaneously acquired from 15 healthy controls using all three methods and were analyzed using identical analysis techniques. Data from all three acquisition methods were directly compared using a variety of acoustic correlates. The results suggested that selected acoustic measures (e.g., f0, noise-to-harmonic ratio, number of pauses) were accurately obtained using all three methods; however, minimum recording standards were required for widely used measures of perturbation.
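The contrast drawn in the abstract between robust measures such as f0 and recording-sensitive perturbation measures can be illustrated with a minimal sketch. It is not the authors' analysis pipeline. It assumes a commonly used definition of local jitter (mean absolute difference between consecutive glottal periods, divided by the mean period), and it uses a hypothetical Gaussian timing error as a stand-in for degradation introduced by a recording chain. The function names and the 40-microsecond noise level are illustrative assumptions.

```python
import random
import statistics


def mean_f0(periods):
    """Mean fundamental frequency in Hz from glottal periods given in seconds."""
    return 1.0 / statistics.mean(periods)


def local_jitter_percent(periods):
    """Local jitter (assumed definition): mean absolute difference between
    consecutive periods, relative to the mean period, as a percentage."""
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    return 100.0 * statistics.mean(diffs) / statistics.mean(periods)


def add_timing_noise(periods, sd, seed=0):
    """Hypothetical recording degradation: add Gaussian error (seconds)
    to each measured period. Seeded so the result is reproducible."""
    rng = random.Random(seed)
    return [p + rng.gauss(0.0, sd) for p in periods]


# A perfectly periodic 125 Hz voice: 200 cycles of 8 ms each.
clean = [0.008] * 200
# The same signal with an assumed 40-microsecond timing error per cycle.
noisy = add_timing_noise(clean, sd=0.00004)

if __name__ == "__main__":
    for label, periods in (("clean", clean), ("noisy", noisy)):
        print(label, round(mean_f0(periods), 2), "Hz,",
              round(local_jitter_percent(periods), 3), "% jitter")
```

Under these assumptions the mean f0 barely moves when timing noise is added, because the errors average out over many cycles, whereas jitter is built from cycle-to-cycle differences and therefore grows directly with the noise. This is one plausible reading of why perturbation measures would need stricter recording standards than f0.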

Author information

Authors and Affiliations

University of Melbourne, Level 7/21 Victoria Street, 3000, Melbourne, VIC, Australia
Adam P. Vogel & Paul Maruff

Corresponding author

Correspondence to Adam P. Vogel.

About this article

Cite this article

Vogel, A.P., Maruff, P. Comparison of voice acquisition methodologies in speech research. Behavior Research Methods 40, 982–987 (2008). https://doi.org/10.3758/BRM.40.4.982

Received: 23 December 2007
Accepted: 12 March 2008
Issue Date: November 2008
DOI: https://doi.org/10.3758/BRM.40.4.982
