Automatic telephone handset identification by sparse representation of random spectral features

Authors:
Yannis Panagakis
Aristotle University of Thessaloniki, Thessaloniki, Greece

Constantine Kotropoulos
Aristotle University of Thessaloniki, Thessaloniki, Greece

MM&Sec '12: Proceedings of the on Multimedia and security, September 2012, Pages 91–96, https://doi.org/10.1145/2361407.2361422

Published: 06 September 2012

ABSTRACT

Speech signals convey information not only about the speaker's identity and the spoken language, but also about the acquisition device used to record them. It is therefore reasonable to identify the acquisition device by analyzing the recorded speech signal. To this end, random spectral features (RSFs) are proposed as an intrinsic fingerprint suitable for device identification. The RSFs are extracted from each speech signal by first averaging its spectrogram along the time axis and then projecting the resulting mean spectrogram onto a Gaussian random matrix of compatible dimensions. By applying a sparse-representation-based classifier to the device RSFs, a state-of-the-art identification accuracy of 95.55% is obtained on a set of 8 telephone handsets from the Lincoln-Labs Handset Database (LLHDB).
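
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the frame length, hop size, window, projection dimension, or l1 solver, so the values below (256-sample Hann frames, a 32-dimensional projection, an ISTA lasso solver standing in for l1 minimization) are assumptions chosen for illustration. The two "devices" are hypothetical FIR filters applied to white noise, not LLHDB recordings.

```python
import numpy as np

def mean_spectrum(signal, frame_len=256, hop=128):
    """Average the magnitude spectrogram of `signal` along the time axis."""
    window = np.hanning(frame_len)  # assumed window; not given in the abstract
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s:s + frame_len] * window for s in starts])
    spectrogram = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_bins)
    return spectrogram.mean(axis=0)                    # (n_bins,)

def random_spectral_features(signal, projection):
    """Project the time-averaged spectrum onto a Gaussian random matrix."""
    return projection @ mean_spectrum(signal)

def src_predict(dictionary, labels, y, lam=0.01, n_iter=500):
    """Sparse-representation classification.

    The sparse code is found with ISTA applied to a lasso objective, an
    assumed stand-in for whichever l1 solver the paper uses.
    """
    D = dictionary / np.linalg.norm(dictionary, axis=0, keepdims=True)
    y = y / np.linalg.norm(y)
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / largest singular value squared
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        x = x - step * (D.T @ (D @ x - y))                        # gradient step
        x = np.sign(x) * np.maximum(np.abs(x) - lam * step, 0.0)  # soft threshold
    # Assign the class whose training atoms reconstruct y with least residual.
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

rng = np.random.default_rng(0)
projection = rng.standard_normal((32, 129))  # 129 = 256 // 2 + 1 frequency bins

# Hypothetical stand-ins for two handsets: low-pass and high-pass FIR filters.
device_filters = {0: [1.0, 0.9], 1: [1.0, -0.9]}

def record(device, n_samples=16000):
    """Simulate one recording: white noise shaped by the device's filter."""
    noise = rng.standard_normal(n_samples)
    return np.convolve(noise, device_filters[device], mode="same")

# Five training recordings per device form the columns of the dictionary.
labels = np.array([d for d in device_filters for _ in range(5)])
dictionary = np.column_stack(
    [random_spectral_features(record(int(d)), projection) for d in labels])

rsf = random_spectral_features(record(0), projection)
pred0 = src_predict(dictionary, labels, rsf)
pred1 = src_predict(dictionary, labels,
                    random_spectral_features(record(1), projection))
print(pred0, pred1)
```

The same projection matrix is reused for training and test signals, so all RSFs lie in one common low-dimensional space. The classifier then compares class-wise reconstruction residuals rather than distances to individual training samples.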

Index Terms

1. Hardware
  1. Communication hardware, interfaces and storage
    1. Signal processing systems

Published in
MM&Sec '12: Proceedings of the on Multimedia and security
September 2012
184 pages
ISBN: 9781450314176
DOI: 10.1145/2361407
General Chair:
Chang-Tsun Li, University of Warwick, UK
Program Chairs:
Jana Dittmann, Otto-von-Guericke University, Germany
Stefan Katzenbeisser, Technische Universität Darmstadt, Germany
Scott Carver, SUNY Binghamton, USA
Copyright © 2012 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
digital speech forensics
random features
sparse representation
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 128 of 318 submissions, 40%