In-network outlier detection in wireless sensor networks

  • Regular Paper
Knowledge and Information Systems

Abstract

To address the problem of unsupervised outlier detection in wireless sensor networks, we develop an approach that (1) is flexible with respect to the outlier definition, (2) computes the result in-network to reduce both bandwidth and energy consumption, (3) uses only single-hop communication, thus permitting very simple node failure detection and message reliability assurance mechanisms (e.g., carrier-sense), and (4) seamlessly accommodates dynamic updates to data. We examine performance by simulation, using real sensor data streams. Our results demonstrate that our approach is accurate and imposes reasonable communication and power consumption demands.
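As a minimal illustration of what flexibility "with respect to the outlier definition" could look like in code, the sketch below treats the outlier definition as a pluggable ranking function and reports the top-n points under it. This is not the authors' in-network algorithm: it runs centrally on a single list of readings, and the names `nn_distance`, `avg_knn_distance`, and `top_outliers`, as well as the two distance-based definitions, are assumptions chosen for illustration.

```python
from math import dist
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
# A ranking function scores a point against a dataset; higher means more outlying.
Ranking = Callable[[Point, Sequence[Point]], float]


def nn_distance(x: Point, data: Sequence[Point]) -> float:
    """Distance from x to its nearest neighbor, excluding points equal to x."""
    others = [p for p in data if p != x]
    return min(dist(x, p) for p in others) if others else 0.0


def avg_knn_distance(k: int) -> Ranking:
    """Build a ranking that averages the distances to the k nearest neighbors."""
    def rank(x: Point, data: Sequence[Point]) -> float:
        nearest = sorted(dist(x, p) for p in data if p != x)[:k]
        return sum(nearest) / len(nearest) if nearest else 0.0
    return rank


def top_outliers(data: Sequence[Point], rank: Ranking, n: int) -> List[Point]:
    """Return the n points with the highest score under the given ranking."""
    return sorted(data, key=lambda x: rank(x, data), reverse=True)[:n]


# Hypothetical one-dimensional readings (e.g. temperatures); values are made up.
readings = [(20.1,), (20.3,), (19.9,), (20.0,), (35.0,)]
print(top_outliers(readings, nn_distance, 1))
print(top_outliers(readings, avg_knn_distance(2), 1))
```

Swapping the ranking function changes the outlier definition without touching `top_outliers`. A distributed version would additionally need each node to exchange data with its single-hop neighbors, which this sketch does not attempt.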

Author information

Corresponding author

Correspondence to Chris Giannella.

About this article

Cite this article

Branch, J.W., Giannella, C., Szymanski, B. et al. In-network outlier detection in wireless sensor networks. Knowl Inf Syst 34, 23–54 (2013). https://doi.org/10.1007/s10115-011-0474-5

  • DOI: https://doi.org/10.1007/s10115-011-0474-5
