Abstract
Estimating the semantic similarity between texts is a challenging open research problem in Natural Language Processing (NLP). The versatility of natural language makes it difficult to define rule-based methods for measuring semantic similarity. To address this issue, various semantic similarity methods have been proposed over the years. This survey traces the evolution of such methods, from traditional NLP techniques such as kernel-based methods to the most recent research on transformer-based models, and categorizes them by their underlying principles as knowledge-based, corpus-based, deep neural network–based, and hybrid methods. By discussing the strengths and weaknesses of each method, this survey gives new researchers a comprehensive view of existing systems, so that they can experiment and develop innovative ideas to address the issue of semantic similarity.
- Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca, and Aitor Soroa. 2009. A study on similarity and relatedness using distributional and WordNet-based approaches. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Citeseer, 19.Google ScholarCross Ref
- Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Inigo Lopez-Gazpio, Montse Maritxalar, Rada Mihalcea, et al. 2015. Semeval-2015 task 2: Semantic textual similarity, English, Spanish, and pilot on interpretability. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 252--263.Google ScholarCross Ref
- Eneko Agirre, Carmen Banea, Claire Cardie, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Weiwei Guo, Rada Mihalcea, German Rigau, and Janyce Wiebe. 2014. Semeval-2014 task 10: Multilingual semantic textual similarity. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval’14). 81--91.Google ScholarCross Ref
- Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez Agirre, Rada Mihalcea, German Rigau Claramunt, and Janyce Wiebe. 2016. Semeval-2016 task 1: Semantic textual similarity, monolingual and cross-lingual evaluation. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval’16). ACL (Association for Computational Linguistics), 497--511.Google ScholarCross Ref
- Eneko Agirre, Daniel Cer, Mona Diab, and Aitor Gonzalez-Agirre. 2012. Semeval-2012 task 6: A pilot on semantic textual similarity. In * SEM 2012: The First Joint Conference on Lexical and Computational Semantics--Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the 6th International Workshop on Semantic Evaluation (SemEval’12). 385--393.Google Scholar
- Eneko Agirre, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, and Weiwei Guo. 2013. * SEM 2013 shared task: Semantic textual similarity. In Proceedings of the 2nd Joint Conference on Lexical and Computational Semantics (* SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity. 32--43.Google Scholar
- Alexander M. Rush, Sumit Chopra, and Jason Weston. 2015. A neural attention model for abstractive sentence. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 5, 3 (2015), 379--389.Google ScholarCross Ref
- Berna Altinel and Murat Can Ganiz. 2018. Semantic text classification: A survey of past and recent advances. Inf. Proc. Manag. 54, 6 (2018), 1129--1153. DOI:https://doi.org/10.1016/j.ipm.2018.08.001Google ScholarCross Ref
- Samir Amir, Adrian Tanasescu, and Djamel A. Zighed. 2017. Sentence similarity based on semantic kernels for intelligent text retrieval. J. Intell. Inf. Syst. 48, 3 (2017), 675--689.Google ScholarDigital Library
- Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the 3rd International Conference on Learning Representations (ICLR’15).Google Scholar
- Satanjeev Banerjee and Ted Pedersen. 2003. Extended gloss overlaps as a measure of semantic relatedness. In Proceedings of the International Joint Conference on Artificial Intelligence, Vol. 3. 805--810.Google Scholar
- Daniel Bär, Chris Biemann, Iryna Gurevych, and Torsten Zesch. 2012. UKP: Computing semantic textual similarity by combining multiple content similarity measures. In * SEM 2012: The First Joint Conference on Lexical and Computational Semantics--Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the 6th International Workshop on Semantic Evaluation (SemEval’12). 435--440.Google Scholar
- Marco Baroni, Silvia Bernardini, Adriano Ferraresi, and Eros Zanchetta. 2009. The wacky wide web: A collection of very large linguistically processed web-crawled corpora. Lang. Res. Eval. 43, 3 (2009), 209--226.Google ScholarCross Ref
- Marco Baroni, Georgiana Dinu, and Germán Kruszewski. 2014. Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proceedings of the 52nd Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 238--247.Google ScholarCross Ref
- Iz Beltagy, Kyle Lo, and Arman Cohan. 2019. SciBERT: A pretrained language model for scientific text. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP’19). 3606--3611.Google ScholarCross Ref
- Fabio Benedetti, Domenico Beneventano, Sonia Bergamaschi, and Giovanni Simonini. 2019. Computing inter-document similarity with context semantic analysis. Inf. Syst. 80 (2019), 136--147. DOI:https://doi.org/10.1016/j.is.2018.02.009Google ScholarCross Ref
- Christian Bizer, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, and Sebastian Hellmann. 2009. DBpedia-a crystallization point for the web of data. J. Web Seman. 7, 3 (2009), 154--165.Google ScholarDigital Library
- Piotr Bojanowski, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2017. Enriching word vectors with subword information. Trans. Assoc. Comput. Ling. 5 (2017), 135--146.Google ScholarCross Ref
- Antoine Bordes, Sumit Chopra, and Jason Weston. 2014. Question answering with subgraph embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 615--620.Google ScholarCross Ref
- Jose Camacho-Collados and Mohammad Taher Pilehvar. 2018. From word to sense embeddings: A survey on vector representations of meaning. J. Artif. Intell. Res. 63 (2018), 743--788.Google ScholarDigital Library
- José Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2015. Nasari: A novel approach to a semantically aware representation of items. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 567--577.Google ScholarCross Ref
- José Camacho-Collados, Mohammad Taher Pilehvar, and Roberto Navigli. 2016. Nasari: Integrating explicit knowledge and corpus statistics for a multilingual representation of concepts and entities. Artif. Intell. 240 (2016), 36--64. DOI:https://doi.org/10.1016/j.artint.2016.07.005Google ScholarCross Ref
- Nicola Cancedda, Eric Gaussier, Cyril Goutte, and Jean-Michel Renders. 2003. Word-sequence kernels. J. Mach. Learn. Res. 3, Feb. (2003), 1059--1082.Google Scholar
- Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, and Lucia Specia. 2017. SemEval-2017 Task 1: Semantic textual similarity multilingual and crosslingual focused evaluation. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval’17). 1--14.Google ScholarCross Ref
- Rudi L. Cilibrasi and Paul M. B. Vitanyi. 2007. The Google similarity distance. IEEE Trans. Knowl. Data Eng. 19, 3 (2007), 370--383.Google ScholarDigital Library
- Michael Collins and Nigel Duffy. 2002. Convolution kernels for natural language. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 625--632.Google Scholar
- Michael Collins and Nigel Duffy. 2002. New ranking algorithms for parsing and tagging: Kernels over discrete structures, and the voted perceptron. In Proceedings of the 40th Meeting of the Association for Computational Linguistics. 263--270.Google Scholar
- Danilo Croce, Simone Filice, Giuseppe Castellucci, and Roberto Basili. 2017. Deep learning in semantic kernel spaces. In Proceedings of the 55th Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 345--354.Google ScholarCross Ref
- Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 4171--4186.Google Scholar
- Lev Finkelstein, Evgeniy Gabrilovich, Yossi Matias, Ehud Rivlin, Zach Solan, Gadi Wolfman, and Eytan Ruppin. 2001. Placing search in context: The concept revisited. In Proceedings of the 10th International Conference on World Wide Web. 406--414.Google ScholarDigital Library
- Evgeniy Gabrilovich, Shaul Markovitch, et al. 2007. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of the International Joint Conference on Artificial Intelligence, Vol. 7. 1606--1611.Google Scholar
- Juri Ganitkevitch, Benjamin Van Durme, and Chris Callison-Burch. 2013. PPDB: The paraphrase database. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 758--764.Google Scholar
- Jian-Bo Gao, Bao-Wen Zhang, and Xiao-Hua Chen. 2015. A WordNet-based semantic similarity measurement combining edge-counting and information content theory. Eng. Applic. Artif. Intell. 39 (2015), 80--88. DOI:https://doi.org/10.1016/j.engappai.2014.11.009Google ScholarCross Ref
- Daniela Gerz, Ivan Vulić, Felix Hill, Roi Reichart, and Anna Korhonen. 2016. SimVerb-3500: A large-scale evaluation set of verb similarity. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2173--2182.Google ScholarCross Ref
- Goran Glavaš, Marc Franco-Salvador, Simone P. Ponzetto, and Paolo Rosso. 2018. A resource-light method for cross-lingual semantic textual similarity. Knowl.-based Syst. 143 (2018), 1--9. DOI:https://doi.org/10.1016/j.knosys.2017.11.041Google Scholar
- James Gorman and James R. Curran. 2006. Scaling distributional similarity to large corpora. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Meeting of the Association for Computational Linguistics. 361--368.Google Scholar
- Mohamed Ali Hadj Taieb, Torsten Zesch, and Mohamed Ben Aouicha. 2019. A survey of semantic relatedness evaluation datasets and procedures. Artif. Intell. Rev. (23 Dec. 2019). DOI:https://doi.org/10.1007/s10462-019-09796-3Google Scholar
- Basma Hassan, Samir E. Abdelrahman, Reem Bahgat, and Ibrahim Farag. 2019. UESTS: An unsupervised ensemble semantic textual similarity method. IEEE Access 7 (2019), 85462--85482.Google ScholarCross Ref
- Hua He and Jimmy Lin. 2016. Pairwise word interaction modeling with deep neural networks for semantic similarity measurement. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 937--948. DOI:https://doi.org/10.18653/v1/N16-1108Google ScholarCross Ref
- Felix Hill, Roi Reichart, and Anna Korhonen. 2015. Simlex-999: Evaluating semantic models with (genuine) similarity estimation. Comput. Ling. 41, 4 (2015), 665--695.Google ScholarDigital Library
- Johannes Hoffart, Fabian M. Suchanek, Klaus Berberich, and Gerhard Weikum. 2013. YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194 (2013), 28--61.Google ScholarDigital Library
- Harneet Kaur Janda, Atish Pawar, Shan Du, and Vijay Mago. 2019. Syntactic, semantic and sentiment analysis: The joint effect on automated essay evaluation. IEEE Access 7 (2019), 108486--108503.Google ScholarCross Ref
- Jay J. Jiang and David W. Conrath. 1997. Semantic similarity based on corpus statistics and lexical taxonomy. In Proceedings of the 10th International Conference on Research on Computational Linguistics. 19--33.Google Scholar
- Yuncheng Jiang, Wen Bai, Xiaopei Zhang, and Jiaojiao Hu. 2017. Wikipedia-based information content and semantic similarity computation. Inf. Proc. Manag. 53, 1 (2017), 248--265. DOI:https://doi.org/10.1016/j.ipm.2016.09.001Google ScholarDigital Library
- Yuncheng Jiang, Xiaopei Zhang, Yong Tang, and Ruihua Nie. 2015. Feature-based approaches to semantic similarity assessment of concepts using Wikipedia. Inf. Proc. Manag. 51, 3 (2015), 215--234.Google ScholarDigital Library
- Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, and Qun Liu. 2019. TinyBERT: Distilling BERT for natural language understanding. Arxiv Preprint Arxiv:1909.10351 (2019).Google Scholar
- Tomoyuki Kajiwara and Mamoru Komachi. 2016. Building a monolingual parallel corpus for text simplification using sentence similarity based on alignment between word embeddings. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers (COLING’16). 1147--1158.Google Scholar
- Sun Kim, Nicolas Fiorini, W. John Wilbur, and Zhiyong Lu. 2017. Bridging the gap: Incorporating a semantic similarity measure for effectively mapping PubMed queries to documents. J. Biomed. Inf. 75 (2017), 122--127.Google ScholarDigital Library
- Yoon Kim. 2014. Convolutional neural networks for sentence classification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1746--1751.Google ScholarCross Ref
- Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, and Radu Soricut. 2019. ALBERT: A lite BERT for self-supervised learning of language representations. In Proceedings of the International Conference on Learning Representations.Google Scholar
- Thomas K. Landauer and Susan T. Dumais. 1997. A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychol. Rev. 104, 2 (1997), 211.Google ScholarCross Ref
- Thomas K. Landauer, Peter W. Foltz, and Darrell Laham. 1998. An introduction to latent semantic analysis. Discour. Proc. 25, 2--3 (1998), 259--284.Google ScholarCross Ref
- Juan J. Lastra-Díaz and Ana García-Serrano. 2015. A new family of information content models with an experimental survey on WordNet. Knowl.-based Syst. 89 (2015), 509--526.Google Scholar
- Juan J. Lastra-Díaz, Ana García-Serrano, Montserrat Batet, Miriam Fernández, and Fernando Chirigati. 2017. HESML: A scalable ontology-based semantic similarity measures library with a set of reproducible experiments and a replication dataset. Inf. Syst. 66 (2017), 97--118.Google ScholarDigital Library
- Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana García-Serrano, Mohamed Ben Aouicha, and Eneko Agirre. 2019. A reproducible survey on word embeddings and ontology-based methods for word similarity: Linear combinations outperform the state of the art. Eng. Applic. Artif. Intell. 85 (2019), 645--665. DOI:https://doi.org/10.1016/j.engappai.2019.07.010Google ScholarCross Ref
- Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In Proceedings of the International Conference on Machine Learning. 1188--1196.Google ScholarDigital Library
- Yuquan Le, Zhi-Jie Wang, Zhe Quan, Jiawei He, and Bin Yao. 2018. ACV-tree: A new method for sentence similarity modeling. In Proceedings of the International Joint Conference on Artificial Intelligence. 4137--4143.Google ScholarCross Ref
- Ming Che Lee. 2011. A novel sentence similarity measure for semantic-based expert systems. Exp. Syst. Applic. 38, 5 (2011), 6392--6399.Google ScholarDigital Library
- Omer Levy and Yoav Goldberg. 2014. Dependency-based word embeddings. In Proceedings of the 52nd Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 302--308.Google ScholarCross Ref
- Omer Levy and Yoav Goldberg. 2014. Neural word embedding as implicit matrix factorization. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 2177--2185.Google Scholar
- Peipei Li, Haixun Wang, Kenny Q. Zhu, Zhongyuan Wang, and Xindong Wu. 2013. Computing term similarity by large probabilistic isA knowledge. In Proceedings of the 22nd ACM International Conference on Information 8 Knowledge Management. 1401--1410.Google ScholarDigital Library
- Yuhua Li, Zuhair A. Bandar, and David McLean. 2003. An approach for measuring semantic similarity between words using multiple information sources. IEEE Trans. Knowl. Data Eng. 15, 4 (2003), 871--882.Google ScholarDigital Library
- Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett. 2006. Sentence similarity based on semantic nets and corpus statistics. IEEE Trans. Knowl. Data Eng. 18, 8 (2006), 1138--1150.Google ScholarDigital Library
- Dekang Lin et al. 1998. An information-theoretic definition of similarity. In Proceedings of the International Conference on Machine Learning (ICML’98). 296--304.Google Scholar
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. 2019. RoBERTa: A robustly optimized BERT pretraining approach. Arxiv Preprint Arxiv:1907.11692 (2019).Google Scholar
- I. Lopez-Gazpio, M. Maritxalar, A. Gonzalez-Agirre, G. Rigau, L. Uria, and E. Agirre. 2017. Interpretable semantic textual similarity: Finding and explaining differences between sentences. Knowl.-based Syst. 119 (2017), 186--199. DOI:https://doi.org/10.1016/j.knosys.2016.12.013Google Scholar
- I. Lopez-Gazpio, M. Maritxalar, M. Lapata, and E. Agirre. 2019. Word n-gram attention models for sentence similarity and inference. Exp. Syst. Applic. 132 (2019), 1--11. DOI:https://doi.org/10.1016/j.eswa.2019.04.054Google ScholarCross Ref
- Kevin Lund and Curt Burgess. 1996. Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Meth. Instrum. Comput. 28, 2 (1996), 203--208.Google ScholarCross Ref
- M. Marelli, S. Menini, M. Baroni, L. Bentivogli, R. Bernardi, and R. Zamparelli. 2014. A SICK cure for the evaluation of compositional distributional semantic models. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), Reykjavik, Iceland, 216--223. http://www.lrec-conf.org/proceedings/lrec2014/pdf/363_Paper.pdf.Google Scholar
- Bryan McCann, James Bradbury, Caiming Xiong, and Richard Socher. 2017. Learned in translation: Contextualized word vectors. In Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran 8 Associates Inc., 6297--6308.Google Scholar
- Bridget T. McInnes, Ying Liu, Ted Pedersen, Genevieve B. Melton, and Serguei V. Pakhomov. 2013. UMLS: Similarity: Measuring the relatedness and similarity of biomedical concepts. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 28.Google Scholar
- Christopher Meek, Yang Yi, and Yih Wen-tau. 2018. WIKIQA: A challenge dataset for open-domain question answering. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing September 2015. 2013--2018. https://doi.org/10.18653/v1/D15-1237Google Scholar
- Oren Melamud, Jacob Goldberger, and Ido Dagan. 2016. context2vec: Learning generic context embedding with bidirectional LSTM. In Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning. 51--61.Google ScholarCross Ref
- Rada Mihalcea and Andras Csomai. 2007. Wikify! Linking documents to encyclopedic knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management. 233--242.Google Scholar
- Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. Arxiv Preprint Arxiv:1301.3781 (2013).Google Scholar
- Tomáš Mikolov, Wen-tau Yih, and Geoffrey Zweig. 2013. Linguistic regularities in continuous space word representations. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 746--751.Google Scholar
- George A. Miller. 1995. WordNet: A lexical database for English. Commun. ACM 38, 11 (1995), 39--41.Google ScholarDigital Library
- George A. Miller and Walter G. Charles. 1991. Contextual correlates of semantic similarity. Lang. Cog. Proc. 6, 1 (1991), 1--28.Google ScholarCross Ref
- Andriy Mnih and Koray Kavukcuoglu. 2013. Learning word embeddings efficiently with noise-contrastive estimation. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 2265--2273.Google Scholar
- Muhidin Mohamed and Mourad Oussalah. 2019. SRL-ESA-TextSum: A text summarization approach based on semantic role labeling and explicit semantic analysis. Inf. Proc. Manag. 56, 4 (2019), 1356--1372.Google ScholarCross Ref
- Saif M. Mohammad and Graeme Hirst. 2012. Distributional measures of semantic distance: A survey. Arxiv Preprint Arxiv:1203.1858 (2012).Google Scholar
- Alessandro Moschitti. 2006. Efficient convolution kernels for dependency and constituent syntactic trees. In Proceedings of the European Conference on Machine Learning. Springer, 318--329.Google ScholarDigital Library
- Alessandro Moschitti. 2008. Kernel methods, syntax and semantics for relational text categorization. In Proceedings of the 17th ACM Conference on Information and Knowledge Management. 253--262.Google ScholarDigital Library
- Alessandro Moschitti, Daniele Pighin, and Roberto Basili. 2008. Tree kernels for semantic role labeling. Comput. Ling. 34, 2 (2008), 193--224.Google ScholarDigital Library
- Alessandro Moschitti and Silvia Quarteroni. 2008. Kernels on linguistic structures for answer extraction. In Proceedings of the Conference of the Association for Computational Linguistics: Human Language Technologies, Short Papers. 113--116.Google ScholarDigital Library
- Alessandro Moschitti, Silvia Quarteroni, Roberto Basili, and Suresh Manandhar. 2007. Exploiting syntactic and shallow semantic kernels for question answer classification. In Proceedings of the 45th Meeting of the Association of Computational Linguistics. 776--783.Google Scholar
- Alessandro Moschitti and Fabio Massimo Zanzotto. 2007. Fast and effective kernels for relational learning from texts. In Proceedings of the 24th International Conference on Machine Learning. 649--656.Google ScholarDigital Library
- Roberto Navigli and Simone Paolo Ponzetto. 2012. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network. Artif. Intell. 193 (2012).Google Scholar
- Douglas L. Nelson, Cathy L. McEvoy, and Thomas A. Schreiber. 2004. The University of South Florida free association, rhyme, and word fragment norms. Behav. Res. Meth. Instrum. Comput. 36, 3 (2004), 402--407.Google ScholarCross Ref
- Joakim Nivre. 2006. Inductive Dependency Parsing. Springer.Google Scholar
- Matteo Pagliardini, Prakhar Gupta, and Martin Jaggi. 2018. Unsupervised learning of sentence embeddings using compositional n-gram features. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). 528--540.Google ScholarCross Ref
- Ankur Parikh, Oscar Täckström, Dipanjan Das, and Jakob Uszkoreit. 2016. A decomposable attention model for natural language inference. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2249--2255.Google ScholarCross Ref
- A. Pawar and V. Mago. 2019. Challenging the boundaries of unsupervised learning for semantic similarity. IEEE Access 7 (2019), 16291--16308.Google ScholarCross Ref
- Ted Pedersen, Serguei V. S. Pakhomov, Siddharth Patwardhan, and Christopher G. Chute. 2007. Measures of semantic similarity and relatedness in the biomedical domain. J. Biomed. Inf. 40, 3 (2007), 288--299.Google ScholarDigital Library
- Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’14). 1532--1543.Google Scholar
- Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. Deep contextualized word representations. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2227--2237.Google ScholarCross Ref
- Mohammad Taher Pilehvar and Jose Camacho-Collados. 2019. WiC: the word-in-context dataset for evaluating context-sensitive meaning representations. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). 1267--1273.Google Scholar
- Mohammad Taher Pilehvar, David Jurgens, and Roberto Navigli. 2013. Align, disambiguate and walk: A unified approach for measuring semantic similarity. In Proceedings of the 51st Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 1341--1351.Google Scholar
- Mohammad Taher Pilehvar and Roberto Navigli. 2015. From senses to texts: An all-in-one graph-based approach for measuring semantic similarity. Artif. Intell. 228 (2015), 95--128. DOI:https://doi.org/10.1016/j.artint.2015.07.005.Google ScholarDigital Library
- Rong Qu, Yongyi Fang, Wen Bai, and Yuncheng Jiang. 2018. Computing semantic similarity based on novel models of semantic representation using Wikipedia. Inf. Proc. Manag. 54, 6 (2018), 1002--1021. DOI:https://doi.org/10.1016/j.ipm.2018.07.002.Google ScholarCross Ref
- Z. Quan, Z. Wang, Y. Le, B. Yao, K. Li, and J. Yin. 2019. An efficient framework for sentence similarity modeling. IEEE/ACM Trans. Aud. Speech Lang. Proc. 27, 4 (Apr. 2019), 853--865. DOI:https://doi.org/10.1109/TASLP.2019.2899494.Google Scholar
- Roy Rada, Hafedh Mili, Ellen Bicknell, and Maria Blettner. 1989. Development and application of a metric on semantic nets. IEEE Trans. Syst. Man Cyber. 19, 1 (1989), 17--30.Google ScholarCross Ref
- Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2019. Exploring the limits of transfer learning with a unified text-to-text transformer. Arxiv Preprint Arxiv:1910.10683 (2019).Google Scholar
- Philip Resnik. 1995. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence. 448--453.Google ScholarDigital Library
- M. Andrea Rodríguez and Max J. Egenhofer. 2003. Determining semantic similarity among entity classes from different ontologies. IEEE Trans. Knowl. Data Eng. 15, 2 (2003), 442--456.Google ScholarDigital Library
- Terry Ruas, William Grosky, and Akiko Aizawa. 2019. Multi-sense embeddings through a word sense disambiguation process. Exp. Syst. Applic. 136 (2019), 288--303. DOI:https://doi.org/10.1016/j.eswa.2019.06.026Google ScholarCross Ref
- Herbert Rubenstein and John B. Goodenough. 1965. Contextual correlates of synonymy. Commun. ACM 8, 10 (1965), 627--633.Google ScholarDigital Library
- David Sánchez, Montserrat Batet, and David Isern. 2011. Ontology-based information content computation. Knowl.-based Syst. 24, 2 (2011), 297--303.Google Scholar
- Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2019. DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. Arxiv Preprint Arxiv:1910.01108 (2019).Google Scholar
- Frane Šarić, Goran Glavaš, Mladen Karan, Jan Šnajder, and Bojana Dalbelo Bašić. 2012. Takelab: Systems for measuring semantic text similarity. In * SEM 2012: The 1st Joint Conference on Lexical and Computational Semantics--Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the 6th International Workshop on Semantic Evaluation (SemEval’12). 441--448.Google Scholar
- Tobias Schnabel, Igor Labutov, David Mimno, and Thorsten Joachims. 2015. Evaluation methods for unsupervised word embeddings. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 298--307.Google ScholarCross Ref
- Aliaksei Severyn and Alessandro Moschitti. 2012. Structural relationships for large-scale learning of answer re-ranking. In Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval. 741--750.Google ScholarDigital Library
- Aliaksei Severyn, Massimo Nicosia, and Alessandro Moschitti. 2013. Learning semantic textual similarity with structural representations. In Proceedings of the 51st Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 714--718.Google Scholar
- Yang Shao. 2017. HCTI at SemEval-2017 task 1: Use convolutional neural network to evaluate semantic textual similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval’17). 130--133.Google ScholarCross Ref
- John Shawe-Taylor, Nello Cristianini et al. 2004. Kernel Methods for Pattern Analysis. Cambridge University Press.Google Scholar
- Carina Silberer and Mirella Lapata. 2014. Learning grounded meaning representations with autoencoders. In Proceedings of the 52nd Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 721--732.Google ScholarCross Ref
- Roberta A. Sinoara, Jose Camacho-Collados, Rafael G. Rossi, Roberto Navigli, and Solange O. Rezende. 2019. Knowledge-enhanced document embeddings for text classification. Knowl.-based Syst. 163 (2019), 955--971. DOI:https://doi.org/10.1016/j.knosys.2018.10.026Google Scholar
- Gizem Soǧancıoǧlu, Hakime Öztürk, and Arzucan Özgür. 2017. BIOSSES: A semantic sentence similarity estimation system for the biomedical domain. Bioinformatics 33, 14 (07 2017), i49--i58. arXiv: Retrieved from https://academic.oup.com/bioinformatics/article-pdf/33/14/i49/2515 7316/btx238.pdf.Google Scholar
- Md Arafat Sultan, Steven Bethard, and Tamara Sumner. 2014. DLS@ CU: Sentence similarity from word alignment. In Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval’14). 241--246.Google ScholarCross Ref
- Md Arafat Sultan, Steven Bethard, and Tamara Sumner. 2015. DLS@ CU: Sentence similarity from word alignment and semantic vector composition. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval’15). 148--153.Google ScholarCross Ref
- Yu Sun, Shuohuan Wang, Yu-Kun Li, Shikun Feng, Hao Tian, Hua Wu, and Haifeng Wang. 2020. ERNIE 2.0: A continual pre-training framework for language understanding. In Proceedings of the AAAI Conference on Artificial Intelligence. 8968--8975.Google ScholarCross Ref
- David Sánchez and Montserrat Batet. 2013. A semantic similarity method based on information content exploiting multiple ontologies. Exp. Syst. Applic. 40, 4 (2013), 1393--1399. DOI:https://doi.org/10.1016/j.eswa.2012.08.049Google ScholarDigital Library
- David Sánchez, Montserrat Batet, David Isern, and Aida Valls. 2012. Ontology-based semantic similarity: A new feature-based approach. Exp. Syst. Applic. 39, 9 (2012), 7718--7728. DOI:https://doi.org/10.1016/j.eswa.2012.01.082Google ScholarDigital Library
- Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In Proceedings of the 53rd Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 1556--1566.
- Junfeng Tian, Zhiheng Zhou, Man Lan, and Yuanbin Wu. 2017. ECNU at SemEval-2017 task 1: Leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval’17). 191--197.
- Nguyen Huy Tien, Nguyen Minh Le, Yamasaki Tomohiro, and Izuha Tatsuya. 2019. Sentence modeling via multiple word embeddings and multi-level comparison for semantic textual similarity. Inf. Proc. Manag. 56, 6 (2019), 102090. DOI:https://doi.org/10.1016/j.ipm.2019.102090
- Julien Tissier, Christophe Gravier, and Amaury Habrard. 2017. Dict2vec: Learning word embeddings using lexical dictionaries. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP’17). 254--263.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Proceedings of the International Conference on Advances in Neural Information Processing Systems.
- Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. 2007. What is the jeopardy model? A quasi-synchronous grammar for QA. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL’07). 22--32.
- Zhiguo Wang, Haitao Mi, and Abraham Ittycheriah. 2016. Sentence similarity learning by lexical decomposition and composition. In Proceedings of the 26th International Conference on Computational Linguistics (COLING’16). 1340--1349. arXiv:1602.07019.
- Zhibiao Wu and Martha Palmer. 1994. Verbs semantics and lexical selection. In Proceedings of the 32nd Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 133--138.
- Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Russ R. Salakhutdinov, and Quoc V. Le. 2019. XLNet: Generalized autoregressive pretraining for language understanding. In Proceedings of the International Conference on Advances in Neural Information Processing Systems. 5753--5763.
- G. Zhu and C. A. Iglesias. 2017. Computing semantic similarity of concepts in knowledge graphs. IEEE Trans. Knowl. Data Eng. 29, 1 (Jan. 2017), 72--85. DOI:https://doi.org/10.1109/TKDE.2016.2610428
- Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. 2013. Bilingual word embeddings for phrase-based machine translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 1393--1398.
Index Terms
- Evolution of Semantic Similarity—A Survey