Abstract
The higher computational efficiency of time difference of arrival (TDOA) based sound source localization makes it a preferred choice over steered response power (SRP) methods in real-time applications. However, unlike SRP, its extension to multiple source localization (MSL) is not straightforward. The challenges include accurate feature extraction in unfavourable acoustic conditions, the association ambiguity involved in mapping the extracted features to their corresponding sources, and the complexity of solving the hyperbolic delay equations to estimate the source coordinates. Moreover, the dominant source and early reverberation make the delays associated with the weaker sources harder to detect. Hence, this paper proposes an efficient three-step method for localizing multiple sources from delay estimates. In step 1, the search space is partitioned into cubic subvolumes, and the delay bounds associated with each subvolume are computed. The subvolumes are then regrouped so that those whose TDOA bounds are enclosed by the same delay interval are clustered together. In step 2, first the delay segments, and then each subvolume contained in the corresponding delay segment, are traced to check whether the estimated delay hyperbola passes through them. Each traced subvolume is updated with a weight that measures the likelihood of a source lying in it. The accumulated weights form the delay density map over the search space. In the final step, the localization is refined within the selected subvolumes using conventional SRP (C-SRP). The proposed approach is validated through experiments under different acoustic conditions on synthesized data and on recordings from the SMARD database and the Audio Visual 16.3 corpus.
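The first two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names are hypothetical, the speed of sound is an assumed constant, and the per-subvolume delay bound is a conservative Lipschitz bound chosen for simplicity rather than the bound derived in the paper. The grouping of subvolumes into delay segments and the C-SRP refinement are left out, and every traced subvolume receives a unit weight.

```python
import itertools
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value


def tdoa(point, mic_a, mic_b):
    """TDOA in seconds at `point` for the microphone pair (mic_a, mic_b)."""
    return (math.dist(point, mic_a) - math.dist(point, mic_b)) / SPEED_OF_SOUND


def delay_bounds(center, half_side, mic_a, mic_b):
    """Conservative TDOA interval over a cube (assumption, not the paper's bound).

    The TDOA gradient has norm at most 2/c, and no point of the cube lies
    farther than half_side * sqrt(3) from its center.
    """
    slack = 2.0 * half_side * math.sqrt(3.0) / SPEED_OF_SOUND
    mid = tdoa(center, mic_a, mic_b)
    return mid - slack, mid + slack


def cube_centers(lower, upper, side):
    """Centers of cubic subvolumes of edge `side` tiling the box [lower, upper]."""
    axes = []
    for lo, hi in zip(lower, upper):
        count = max(1, round((hi - lo) / side))
        axes.append([lo + (i + 0.5) * side for i in range(count)])
    return list(itertools.product(*axes))


def delay_density_map(centers, side, mics, delays_per_pair):
    """Add a unit weight to each cube whose delay interval contains an estimate.

    `delays_per_pair` maps a microphone index pair (a, b) to the list of
    delays estimated for that pair, one per detected source.
    """
    density = [0.0] * len(centers)
    for (a, b), delays in delays_per_pair.items():
        for idx, center in enumerate(centers):
            lo, hi = delay_bounds(center, side / 2.0, mics[a], mics[b])
            density[idx] += sum(1.0 for d in delays if lo <= d <= hi)
    return density
```

Because the bound is conservative, the subvolume that contains a source is traced by every microphone pair whose delay estimate for that source is accurate, so it accumulates at least one unit of weight per pair. Peaks of the map then indicate the subvolumes to pass on to a finer search.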
Cite this article
Boora, R., Dhull, S.K. A TDOA-based multiple source localization using delay density maps. Sādhanā 45, 204 (2020). https://doi.org/10.1007/s12046-020-01453-8
DOI: https://doi.org/10.1007/s12046-020-01453-8