How Similar Is It? Towards Personalized Similarity Measures in Ontologies

  • Conference paper
Wirtschaftsinformatik 2005

Abstract

Finding a good similarity assessment algorithm for use in ontologies is central to techniques such as retrieval, matchmaking, clustering, data mining, ontology translation, automatic database schema matching, and simple object comparison. This paper assembles a catalogue of ontology-based similarity measures, which are experimentally compared with a “similarity gold standard” obtained by surveying 50 human subjects. Results show that human and algorithmic similarity predictions varied substantially, but could be grouped into cohesive clusters. To address this variance, we present a personalized similarity assessment procedure, which uses a machine learning component to predict a subject’s cluster membership and thereby provides an excellent prediction of the gold standard. We conclude by hypothesizing ontology-dependent similarity measures.
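To make the notion of an ontology-based similarity measure concrete, the following is a minimal Python sketch of one well-known taxonomy-based measure, the Wu-Palmer similarity. The abstract does not say which measures the catalogue contains, so this is an illustration only and not the authors' method. The toy taxonomy, the concept names, and the function names are all hypothetical, and depth is counted in nodes with the root at depth 1 (one common convention among several).

```python
# Hypothetical toy is-a taxonomy: each concept maps to its parent.
# The root ("entity") has no parent.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}


def path_to_root(concept):
    """Return the list of concepts from `concept` up to the root, inclusive."""
    path = [concept]
    while PARENT[path[-1]] is not None:
        path.append(PARENT[path[-1]])
    return path


def depth(concept):
    """Depth counted in nodes, so the root has depth 1 (an assumed convention)."""
    return len(path_to_root(concept))


def lowest_common_subsumer(a, b):
    """Return the deepest concept that is an ancestor of (or equal to) both a and b."""
    ancestors_a = set(path_to_root(a))
    # Walking upward from b, the first node shared with a's path is the deepest one.
    for node in path_to_root(b):
        if node in ancestors_a:
            return node
    raise ValueError("concepts do not share a root")


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)), in (0, 1]."""
    lcs = lowest_common_subsumer(a, b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "car"), ("dog", "dog")]:
        print(pair, round(wu_palmer(*pair), 3))
```

In this toy taxonomy, "dog" and "cat" share the deeper ancestor "animal" and therefore score higher than "dog" and "car", whose only common ancestor is the root. A measure of this kind yields one score per concept pair, which is the type of algorithmic prediction that could then be compared against human similarity ratings.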



Copyright information

© 2005 Physica-Verlag Heidelberg

About this paper

Cite this paper

Bernstein, A., Kaufmann, E., Bürki, C., Klein, M. (2005). How Similar Is It? Towards Personalized Similarity Measures in Ontologies. In: Ferstl, O.K., Sinz, E.J., Eckert, S., Isselhorst, T. (eds) Wirtschaftsinformatik 2005. Physica, Heidelberg. https://doi.org/10.1007/3-7908-1624-8_71
