DOI: 10.1145/2361407.2361422
research-article

Automatic telephone handset identification by sparse representation of random spectral features

Published: 06 September 2012

ABSTRACT

Speech signals convey information not only about the speaker's identity and the spoken language, but also about the acquisition device used to record them. It is therefore reasonable to identify the acquisition device by analyzing the recorded speech signal. To this end, random spectral features (RSFs) are proposed as an intrinsic fingerprint suitable for device identification. The RSFs are extracted from each speech signal by first averaging its spectrogram along the time axis and then projecting the resulting mean spectrogram onto a Gaussian random matrix of compatible dimensions. Applying a sparse-representation-based classifier to the device RSFs yields a state-of-the-art identification accuracy of 95.55% on a set of 8 telephone handsets from the Lincoln-Labs Handset Database (LLHDB).
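The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the frame length, hop size, Hann window, use of a magnitude spectrogram, projection dimension, and the ISTA solver for the l1-regularised problem are all assumptions made here for illustration, since the abstract does not specify them. The function names (`mean_spectrum`, `random_spectral_features`, `ista_l1`, `src_classify`) are likewise hypothetical.

```python
import numpy as np

def mean_spectrum(signal, frame_len=256, hop=128):
    """Average the magnitude spectrogram of `signal` along the time axis.
    Frame length, hop and window are illustrative assumptions."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectrogram = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, frame_len // 2 + 1)
    return spectrogram.mean(axis=0)

def random_spectral_features(signal, projection):
    """Project the mean spectrum onto a fixed Gaussian random matrix."""
    return projection @ mean_spectrum(signal)

def ista_l1(D, y, lam=0.01, n_iter=500):
    """Minimise 0.5 * ||y - D x||^2 + lam * ||x||_1 with ISTA
    (an assumed solver; any l1 solver could be substituted)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / squared spectral norm
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - step * (D.T @ (D @ x - y))                      # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0)  # soft threshold
    return x

def src_classify(D, labels, y, lam=0.01):
    """Sparse-representation classification: code `y` over the training
    dictionary `D` (one column per training sample) and return the class
    whose coefficients reconstruct `y` with the smallest residual."""
    labels = np.asarray(labels)
    D = D / np.linalg.norm(D, axis=0, keepdims=True)
    y = y / np.linalg.norm(y)
    x = ista_l1(D, y, lam)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Toy usage: a fixed Gaussian projection shared by all recordings.
rng = np.random.default_rng(0)
projection = rng.standard_normal((32, 129))  # 129 = 256 // 2 + 1 frequency bins
tone = np.sin(2 * np.pi * 16 * np.arange(1024) / 256)
rsf = random_spectral_features(tone, projection)

# Toy dictionary: two training columns per class, in a 3-dimensional feature space.
D_toy = np.array([[1.0, 0.9, 0.0, 0.0],
                  [0.0, 0.1, 1.0, 0.9],
                  [0.0, 0.0, 0.0, 0.1]])
predicted = src_classify(D_toy, [0, 0, 1, 1], np.array([1.0, 0.05, 0.0]))
```

In a real experiment the dictionary columns would be the RSFs of training recordings from each handset, and the same `projection` matrix would have to be reused for training and test recordings so that the features remain comparable.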


    • Published in

      MM&Sec '12: Proceedings of the on Multimedia and security
      September 2012
      184 pages
      ISBN:9781450314176
      DOI:10.1145/2361407

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 128 of 318 submissions, 40%
