Abstract
The use of voice acoustic techniques has the potential to extend beyond work devoted purely to speech or vocal pathology. For this to occur, however, researchers and clinicians will require acquisition technologies that provide fast, accurate, and cost-effective methods for recording data. Therefore, the present study aimed to compare industry-standard techniques for acquiring high-quality acoustic signals (e.g., hard drive and solid-state recorder) with widely available and easy-to-use, computer-based (standard laptop) data-acquisition methods. Speech samples were simultaneously acquired from 15 healthy controls using all three methods and were analyzed using identical analysis techniques. Data from all three acquisition methods were directly compared using a variety of acoustic correlates. The results suggested that selected acoustic measures (e.g., f0, noise-to-harmonic ratio, number of pauses) were accurately obtained using all three methods; however, minimum recording standards were required for widely used measures of perturbation.
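As an illustration of why perturbation measures are sensitive to recording quality, the sketch below computes local jitter, one commonly used perturbation measure, from a sequence of glottal period lengths. It is a minimal sketch and not the analysis used in the study: the function name `local_jitter`, the device labels, and the period values are all hypothetical, and the formula assumed is the standard one (mean absolute difference between consecutive periods divided by the mean period).

```python
from statistics import mean


def local_jitter(periods):
    """Return local jitter as a ratio.

    Assumed definition: mean absolute difference between consecutive
    period lengths, divided by the mean period length.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)


# Hypothetical period lengths (in seconds) for one sustained vowel,
# as they might be extracted from three simultaneous recordings.
# These numbers are made up for illustration, not taken from the study.
recordings = {
    "hard_drive": [0.0100, 0.0101, 0.0099, 0.0100],
    "solid_state": [0.0100, 0.0101, 0.0099, 0.0100],
    "laptop": [0.0100, 0.0103, 0.0098, 0.0101],
}

for device, periods in recordings.items():
    print(f"{device}: jitter = {local_jitter(periods) * 100:.2f}%")
```

Because the measure depends on cycle-to-cycle differences of a fraction of a millisecond, small timing errors introduced by the recording chain can change it considerably, whereas an average such as f0 is far less affected.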
Cite this article
Vogel, A.P., Maruff, P. Comparison of voice acquisition methodologies in speech research. Behavior Research Methods 40, 982–987 (2008). https://doi.org/10.3758/BRM.40.4.982
DOI: https://doi.org/10.3758/BRM.40.4.982