Optimizing semantic LSTM for spam detection

  • Original Research
International Journal of Information Technology

Abstract

Spam classification is an active research topic in natural language processing, particularly as Internet use for social networking grows. This growth has been accompanied by a rise in spam, sent by spammers seeking commercial or non-commercial advantage. In this paper, we apply deep learning, a rapidly evolving family of techniques, to the problem. Specifically, we use Long Short-Term Memory (LSTM), a variant of the Recurrent Neural Network (RNN), for spam classification. Unlike traditional classifiers, which rely on hand-crafted features, LSTM can learn abstract features. Before classification, the text is converted into semantic word vectors with the help of word2vec, WordNet and ConceptNet. The classification results are compared with those of benchmark classifiers such as SVM, Naïve Bayes, ANN, k-NN and Random Forest. Two corpora are used for the comparison: the SMS Spam Collection dataset and a Twitter dataset. The results are evaluated using metrics such as accuracy and F-measure. The evaluation shows that LSTM outperforms traditional machine learning methods for spam detection by a considerable margin.
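The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the F-measure is the balanced F1 score (the harmonic mean of precision and recall) computed with spam as the positive class; the abstract does not state which variant the authors used. The function name `accuracy_and_f_measure` and the toy labels are hypothetical and are not taken from the paper.

```python
def accuracy_and_f_measure(y_true, y_pred, positive="spam"):
    """Return (accuracy, F1) for two equal-length label sequences.

    Assumption: F-measure means balanced F1 with `positive` as the
    positive class. The paper may use a different variant.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, f1


# Toy labels invented for illustration only (not results from the paper).
y_true = ["spam", "ham", "spam", "ham", "ham", "spam", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "ham", "spam", "spam", "ham", "ham"]

acc, f1 = accuracy_and_f_measure(y_true, y_pred)
print(f"accuracy={acc:.3f} F-measure={f1:.3f}")
# prints: accuracy=0.750 F-measure=0.667
```

On imbalanced corpora, where legitimate messages far outnumber spam, accuracy alone can look high even when few spam messages are caught, which is one reason to report the F-measure alongside it.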

Author information

Corresponding author

Correspondence to Gauri Jain.

Cite this article

Jain, G., Sharma, M. & Agarwal, B. Optimizing semantic LSTM for spam detection. Int. j. inf. tecnol. 11, 239–250 (2019). https://doi.org/10.1007/s41870-018-0157-5

  • DOI: https://doi.org/10.1007/s41870-018-0157-5
