
An Evaluation of Graph Databases and Object-Graph Mappers in CIDOC CRM-Compliant Digital Archives

Published: 16 September 2022

Abstract

The Portuguese General Directorate for Book, Archives and Libraries (DGLAB) has selected CIDOC CRM as the basis for its next-generation digital archive management software. Given the ontological foundations of the Conceptual Reference Model (CRM), a graph database or a triplestore was seen as the best candidate to represent a CRM-based data model for the new software. We therefore compared several of these databases on their maturity, features, performance in standard tasks and, most importantly, the Object-Graph Mappers (OGMs) available to interact with each database in an object-oriented way. Our conclusions are drawn not only from a systematic review of related work but also from an experimental scenario. For the experiment, we designed a simple CRM-compliant graph to test the ability of each OGM/database combination to tackle the so-called “diamond problem” in Object-Oriented Programming (OOP), so that property instances follow domain and range constraints.

Our results show that (1) ontological consistency enforcement is much harder to achieve in graph databases and triplestores than in a relational database, making them better suited to an analytical than to a transactional role; (2) OGMs are still rather immature solutions; and (3) neomodel, an OGM for the Neo4j graph database, is the most mature solution in the study, as it satisfies all requirements, although it also has the lowest performance.
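To make the “diamond problem” and the domain and range constraints mentioned in the abstract more concrete, the following is a minimal sketch in plain Python. It is not the authors' test graph and does not use neomodel or any database. The class names loosely follow CIDOC CRM naming (the hierarchy is simplified, with intermediate classes omitted), and the `Property` helper and its `link` method are hypothetical names introduced only for this illustration.

```python
# Illustrative sketch only: a diamond-shaped class hierarchy plus a
# domain/range check. Names are assumptions, not the paper's code.


class E70Thing:
    """Common ancestor at the top of the diamond."""

    def __init__(self, name):
        self.name = name


class E18PhysicalThing(E70Thing):
    """Left branch of the diamond."""


class E71HumanMadeThing(E70Thing):
    """Right branch of the diamond."""


class E24PhysicalHumanMadeThing(E18PhysicalThing, E71HumanMadeThing):
    """Bottom of the diamond: inherits from both branches."""


class Property:
    """Hypothetical helper that enforces domain and range on a link."""

    def __init__(self, label, domain, range_):
        self.label = label
        self.domain = domain
        self.range_ = range_

    def link(self, source, target):
        # isinstance follows the full inheritance chain, so an E24
        # instance is accepted wherever an E18 or an E71 is required.
        if not isinstance(source, self.domain):
            raise TypeError(f"{self.label}: invalid domain {type(source).__name__}")
        if not isinstance(target, self.range_):
            raise TypeError(f"{self.label}: invalid range {type(target).__name__}")
        return (source.name, self.label, target.name)


# Property whose domain and range are both physical things.
P46 = Property("P46_is_composed_of", E18PhysicalThing, E18PhysicalThing)
```

In this sketch, an instance of `E24PhysicalHumanMadeThing` can appear on either side of `P46` because it inherits from `E18PhysicalThing`, while an instance that is only an `E71HumanMadeThing` is rejected with a `TypeError`. An OGM would need to preserve this multiple-inheritance behaviour when mapping classes to database labels for the same check to hold at the persistence layer.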


        • Published in

          Journal on Computing and Cultural Heritage, Volume 15, Issue 3
          September 2022
          402 pages
          ISSN: 1556-4673
          EISSN: 1556-4711
          DOI: 10.1145/3544006


          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          • Published: 16 September 2022
          • Online AM: 18 February 2022
          • Accepted: 12 September 2021
          • Revised: 14 May 2021
          • Received: 15 November 2020