A Business Intelligence Tool for Explaining Similarity

Colucci, Simona; Donini, Francesco M.; Iurilli, Nicola; Di Sciascio, Eugenio

doi:10.1007/978-3-031-17728-6_5

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 457)

Included in the following conference series:

International Workshop on Model-Driven Organizational and Business Agility

Abstract

Agile business often requires identifying similar objects (firms, providers, end users, products) between an older business domain and a newer one. Data-driven tools for aggregating similar resources are now widely used in Business Intelligence applications, and a large majority of them involve Machine Learning techniques based on similarity metrics. However effective they are, the mathematics underlying such tools does not lend itself to human-readable explanations of their results, leaving a manager who uses them in a “take it as is”-or-not dilemma. To increase trust in such tools, we propose and implement a general method to explain the similarity of a given group of RDF resources. Our tool is based on the theory of Least Common Subsumers (LCS) and can be applied to every domain requiring the comparison of RDF resources, including business organizations. Given a set of RDF resources found to be similar by data-driven tools, we first compute the LCS of the resources, which is a generic RDF resource describing the features shared by the group recursively, i.e., at any depth in feature paths. Subsequently, we translate the LCS into plain English. Being agnostic to the aggregation criteria, our implementation can be pipelined with any other aggregation tool. To prove this, we apply an implementation of our method downstream of (i) the comparison of contracting processes in Public Procurement (using TheyBuyForYou), and (ii) the comparison and clustering of drugs (using k-Means) in DrugBank. For both applications, we present a fairly readable description of the commonalities of the cluster given as input.
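
The two steps named in the abstract (computing a description of shared features at any depth, then rendering it in English) can be illustrated with a small sketch. This is not the authors' algorithm or tool: it is a deliberately simplified stand-in that treats an RDF graph as a plain set of (subject, predicate, object) tuples, and every name in it (`GRAPH`, `outgoing`, `common_features`, `describe`, the toy drug data) is hypothetical. The actual LCS definition for RDF is more general than this approximation.

```python
from itertools import product

# Toy graph as (subject, predicate, object) tuples; all names are made up.
GRAPH = {
    ("drugA", "type", "Drug"),
    ("drugB", "type", "Drug"),
    ("drugA", "targets", "protein1"),
    ("drugB", "targets", "protein2"),
    ("protein1", "locatedIn", "membrane"),
    ("protein2", "locatedIn", "membrane"),
    ("drugA", "state", "solid"),
    ("drugB", "state", "liquid"),
}


def outgoing(graph, node):
    """Return {predicate: set of objects} for the triples leaving `node`."""
    out = {}
    for s, p, o in graph:
        if s == node:
            out.setdefault(p, set()).add(o)
    return out


def common_features(graph, nodes, depth=2):
    """Simplified shared-feature description (an assumption, not the paper's LCS).

    Returns a list of (predicate, value) pairs, where value is either an
    object shared by all nodes or a nested list describing an anonymous
    object that generalizes differing objects.
    """
    if depth == 0 or not nodes:
        return []
    outs = [outgoing(graph, n) for n in nodes]
    shared_preds = set.intersection(*(set(o) for o in outs))
    features = []
    for pred in sorted(shared_preds):
        shared_objs = set.intersection(*(o[pred] for o in outs))
        for obj in sorted(shared_objs):
            features.append((pred, obj))
        if not shared_objs:
            # Objects differ: pick one object per resource and recurse,
            # keeping the result only if the objects share something.
            for combo in product(*(sorted(o[pred]) for o in outs)):
                nested = common_features(graph, list(combo), depth - 1)
                if nested:
                    features.append((pred, nested))
    return features


def describe(features):
    """Render the nested feature list as a rough English-like phrase."""
    parts = []
    for pred, value in features:
        if isinstance(value, str):
            parts.append(f"{pred} {value}")
        else:
            parts.append(f"{pred} something that ({describe(value)})")
    return " and ".join(parts)


shared = common_features(GRAPH, ["drugA", "drugB"])
print(describe(shared))
# prints: targets something that (locatedIn membrane) and type Drug
```

In this toy input the two drugs have different targets, yet the sketch still reports a commonality one level deeper (both targets are located in the membrane), which is the kind of recursive shared feature the abstract refers to. The `state` predicate is dropped because its differing values share nothing further.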

Acknowledgements

Projects Regione Lazio-DTC/“SanLo” (CUP F85F21001090003) and MISE (FSC 2014–2020)/“BARIUM5G” (CUP D94I20000160002) partially supported this work.

Author information

Authors and Affiliations

Politecnico di Bari, Bari, Italy
Simona Colucci, Nicola Iurilli & Eugenio Di Sciascio
Università degli Studi della Tuscia, Viterbo, Italy
Francesco M. Donini

Corresponding author

Correspondence to Simona Colucci.

Editor information

Editors and Affiliations

National Research University, Higher School of Economics, Nizhny Novgorod, Russia
Eduard Babkin
San Jose State University, San Jose, CA, USA
Joseph Barjis
National Research University, Higher School of Economics, Nizhny Novgorod, Russia
Pavel Malyzhenkov
Czech Technical University in Prague, Prague, Czech Republic
Vojtěch Merunka

About this paper

Cite this paper

Colucci, S., Donini, F.M., Iurilli, N., Di Sciascio, E. (2022). A Business Intelligence Tool for Explaining Similarity. In: Babkin, E., Barjis, J., Malyzhenkov, P., Merunka, V. (eds) Model-Driven Organizational and Business Agility. MOBA 2022. Lecture Notes in Business Information Processing, vol 457. Springer, Cham. https://doi.org/10.1007/978-3-031-17728-6_5

DOI: https://doi.org/10.1007/978-3-031-17728-6_5
Published: 01 October 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17727-9
Online ISBN: 978-3-031-17728-6
eBook Packages: Computer Science, Computer Science (R0)
