Abstract
Agile business often requires identifying similar objects (firms, providers, end users, products) across an older business domain and a newer one. Data-driven tools for aggregating similar resources are now widely used in Business Intelligence applications, and a large majority of them rely on Machine Learning techniques based on similarity metrics. However effective such tools are, the mathematics behind them does not lend itself to human-readable explanations of their results, leaving a manager who uses them in a "take it or leave it" dilemma. To increase trust in such tools, we propose and implement a general method to explain the similarity of a given group of RDF resources. Our tool is based on the theory of Least Common Subsumers (LCS) and can be applied to any domain requiring the comparison of RDF resources, including business organizations. Given a set of RDF resources found to be similar by data-driven tools, we first compute the LCS of the resources, which is a generic RDF resource describing the features shared by the group recursively, i.e., at any depth in feature paths. We then translate the LCS into plain English. Being agnostic to the aggregation criteria, our implementation can be pipelined with any other aggregation tool. To demonstrate this, we cascade an implementation of our method after (i) the comparison of contracting processes in Public Procurement (using TheyBuyForYou), and (ii) the comparison and clustering of drugs (using k-Means) in DrugBank. For both applications, we present a fairly readable description of the commonalities of the cluster given as input.
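The two steps named in the abstract (computing shared features at any depth, then rendering them in English) can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified stand-in that handles only two resources, requires predicates to match exactly, and bounds the recursion by an explicit depth, whereas a full LCS also generalizes over predicates and is defined over RDF graphs with blank nodes. The toy triples and the names `TRIPLES`, `common_description` and `verbalize` are hypothetical.

```python
from itertools import product

# Toy RDF-like graph as (subject, predicate, object) triples.
# All identifiers here are made up for illustration.
TRIPLES = {
    ("drugA", "type", "Drug"),
    ("drugA", "target", "t1"),
    ("t1", "locatedIn", "membrane"),
    ("drugB", "type", "Drug"),
    ("drugB", "target", "t2"),
    ("t2", "locatedIn", "membrane"),
}


def outgoing(graph, node):
    """Return the (predicate, object) pairs that describe `node`."""
    return {(p, o) for s, p, o in graph if s == node}


def common_description(graph, a, b, depth):
    """Collect features shared by `a` and `b`, following paths up to `depth`.

    Each shared feature is a (predicate, value) pair. The value is either a
    shared constant, or a nested frozenset standing for an anonymous resource
    that itself has shared features. Simplification: predicates must be equal.
    """
    if depth == 0:
        return frozenset()
    shared = set()
    for (p1, o1), (p2, o2) in product(outgoing(graph, a), outgoing(graph, b)):
        if p1 != p2:
            continue
        if o1 == o2:
            shared.add((p1, o1))
        else:
            # Objects differ: describe what the two objects have in common.
            nested = common_description(graph, o1, o2, depth - 1)
            if nested:
                shared.add((p1, nested))
    return frozenset(shared)


def verbalize(description):
    """Render a shared description as a rough English-like phrase."""
    def sort_key(pair):
        pred, value = pair
        return (pred, isinstance(value, frozenset), str(value))

    parts = []
    for pred, value in sorted(description, key=sort_key):
        if isinstance(value, frozenset):
            parts.append(f"{pred} something that {verbalize(value)}")
        else:
            parts.append(f"{pred} {value}")
    return " and ".join(parts)


shared = common_description(TRIPLES, "drugA", "drugB", depth=2)
print("Both resources:", verbalize(shared))
```

With `depth=1` the sketch only reports directly shared values; raising the depth lets it notice that the two distinct targets are themselves located in the same place, which is the kind of commonality "at any depth in feature paths" that the abstract refers to.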
Acknowledgements
Projects Regione Lazio-DTC/“SanLo” (CUP F85F21001090003) and MISE (FSC 2014–2020)/“BARIUM5G” (CUP D94I20000160002) partially supported this work.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Colucci, S., Donini, F.M., Iurilli, N., Di Sciascio, E. (2022). A Business Intelligence Tool for Explaining Similarity. In: Babkin, E., Barjis, J., Malyzhenkov, P., Merunka, V. (eds) Model-Driven Organizational and Business Agility. MOBA 2022. Lecture Notes in Business Information Processing, vol 457. Springer, Cham. https://doi.org/10.1007/978-3-031-17728-6_5
DOI: https://doi.org/10.1007/978-3-031-17728-6_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17727-9
Online ISBN: 978-3-031-17728-6
eBook Packages: Computer Science, Computer Science (R0)