ISCA Archive SpeechProsody 2016

The Effects of mp3 Compression on Acoustic Measurements of Fundamental Frequency and Pitch Range

Robert Fuchs, Olga Maxwell

Recordings for acoustic research should ideally be made in a lossless format. However, in some cases pre-existing data may be available only in a lossy format such as mp3, which raises the question of how far this compromises the accuracy of acoustic measurements. To answer this question, we compressed 10 recordings of read speech at different compression rates (16-320 kbps) and reconverted them to wav in order to examine the effect of compression on the commonly used suprasegmental measures of fundamental frequency (f0), pitch range and level. Results suggest that at compression rates between 56 and 320 kbps, measures of f0 and most measures of pitch range and level remain reliable, with mean errors below 2% and often lower. The skewness of the distribution of f0 measurements, however, shows much greater measurement errors, with mean errors of 6.9%-7.6% at compression rates between 96 kbps and 320 kbps, and 44.8% at 16 kbps. We conclude that mp3-compressed recordings can be subjected to the acoustic measurements tested here. Nevertheless, the indeterminacy added by mp3 compression needs to be taken into account when interpreting measurements.
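
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes the f0 tracks of the original wav file and of the reconverted mp3 file have already been extracted with some pitch tracker and are available as lists of values in Hz. The function names (`f0_summary`, `compare`), the simple max-minus-min `pitch_span` measure, and the population skewness formula are illustrative assumptions, and the sample values are made-up toy numbers, not data from the paper.

```python
def mean(xs):
    return sum(xs) / len(xs)

def skewness(xs):
    """Population skewness m3 / m2**1.5 (an assumed definition;
    the paper may use a different estimator)."""
    m = mean(xs)
    m2 = mean([(x - m) ** 2 for x in xs])
    m3 = mean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5

def percent_error(reference, measured):
    """Absolute error of `measured` relative to `reference`, in percent.
    Undefined when the reference value is zero."""
    return abs(measured - reference) / abs(reference) * 100.0

def f0_summary(f0_hz):
    """Hypothetical summary of one f0 track (values in Hz)."""
    return {
        "mean_f0": mean(f0_hz),
        "pitch_span": max(f0_hz) - min(f0_hz),  # simplified range measure
        "skewness": skewness(f0_hz),
    }

def compare(f0_wav, f0_mp3):
    """Percent error of each summary measure, taking the wav track as reference."""
    ref, test = f0_summary(f0_wav), f0_summary(f0_mp3)
    return {name: percent_error(ref[name], test[name]) for name in ref}

if __name__ == "__main__":
    # Toy f0 values (Hz), invented purely to exercise the functions.
    f0_wav = [100.0, 110.0, 120.0, 130.0, 180.0]
    f0_mp3 = [100.0, 110.0, 120.0, 130.0, 190.0]
    for name, err in compare(f0_wav, f0_mp3).items():
        print(f"{name}: {err:.2f}% error")
```

Because skewness depends on the third power of deviations from the mean, a single mistracked frame moves it far more than it moves the mean, which is consistent with the larger errors the abstract reports for that measure.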


doi: 10.21437/SpeechProsody.2016-107

Cite as: Fuchs, R., Maxwell, O. (2016) The Effects of mp3 Compression on Acoustic Measurements of Fundamental Frequency and Pitch Range. Proc. Speech Prosody 2016, 523-527, doi: 10.21437/SpeechProsody.2016-107

@inproceedings{fuchs16b_speechprosody,
  author={Robert Fuchs and Olga Maxwell},
  title={{The Effects of mp3 Compression on Acoustic Measurements of Fundamental Frequency and Pitch Range}},
  year=2016,
  booktitle={Proc. Speech Prosody 2016},
  pages={523--527},
  doi={10.21437/SpeechProsody.2016-107}
}