Evolution of Semantic Similarity—A Survey

Authors:
Dhivya Chandrasekaran

Lakehead University, Thunder Bay, Ontario

Vijay Mago

Lakehead University, Thunder Bay, Ontario


ACM Computing Surveys, Volume 54, Issue 2, Article No. 41, pp. 1–37. https://doi.org/10.1145/3440755

Published: 18 February 2021

Abstract

Estimating the semantic similarity between text data is one of the challenging and open research problems in the field of Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for determining semantic similarity measures. To address this issue, various semantic similarity methods have been proposed over the years. This survey article traces the evolution of such methods beginning from traditional NLP techniques such as kernel-based methods to the most recent research work on transformer-based models, categorizing them based on their underlying principles as knowledge-based, corpus-based, deep neural network–based methods, and hybrid methods. Discussing the strengths and weaknesses of each method, this survey provides a comprehensive view of existing systems in place for new researchers to experiment and develop innovative ideas to address the issue of semantic similarity.

Zhiguo Wang, Haitao Mi, and Abraham Ittycheriah. 2016. Sentence similarity learning by lexical decomposition and composition. In Proceedings of the 26th International Conference on Computational Linguistics (COLING’16). 1340--1349. arXiv:1602.07019.Google Scholar
Zhibiao Wu and Martha Palmer. 1994. Verbs semantics and lexical selection. In Proceedings of the 32nd Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 133--138.Google ScholarDigital Library
Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 5753--5763.Google Scholar
G. Zhu and C. A. Iglesias. 2017. Computing semantic similarity of concepts in knowledge graphs. IEEE Trans. Knowl. Data Eng. 29, 1 (Jan. 2017), 72--85. DOI:https://doi.org/10.1109/TKDE.2016.2610428Google ScholarDigital Library
Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. 2013. Bilingual word embeddings for phrase-based machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1393--1398.Google Scholar

Published in
ACM Computing Surveys Volume 54, Issue 2
March 2022
800 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/3450359
Editor:
Albert Zomaya
University of Sydney, Australia
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 18 February 2021
- Accepted: 1 November 2020
- Revised: 1 September 2020
- Received: 1 April 2020
Author Tags
Semantic similarity
corpus-based methods
knowledge-based methods
linguistics
supervised and unsupervised methods
word embeddings
Qualifiers
- research-article
- Research
- Refereed