A Framework to Benchmark NoSQL Data Stores for Large-Scale Model Persistence

  • Conference paper
Model-Driven Engineering Languages and Systems (MODELS 2014)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8767)

Abstract

We present a framework and methodology for benchmarking NoSQL stores for large-scale model persistence. NoSQL technologies potentially improve the performance of some applications and provide schema-less data structures, so they are particularly suited to persisting large and heterogeneous models. Recent studies consider only a narrow set of NoSQL stores for large-scale modelling. Benchmarking many technologies requires substantial effort because each store provides a different interface. Our experiments compare a broad range of NoSQL stores in terms of processor time and disc space used. The framework and methodology are evaluated through a case study that involves persisting large reverse-engineered models of open source projects. The results give tool engineers and practitioners a basis for selecting a store to persist large models.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Shah, S.M., Wei, R., Kolovos, D.S., Rose, L.M., Paige, R.F., Barmpis, K. (2014). A Framework to Benchmark NoSQL Data Stores for Large-Scale Model Persistence. In: Dingel, J., Schulte, W., Ramos, I., Abrahão, S., Insfran, E. (eds) Model-Driven Engineering Languages and Systems. MODELS 2014. Lecture Notes in Computer Science, vol 8767. Springer, Cham. https://doi.org/10.1007/978-3-319-11653-2_36

  • DOI: https://doi.org/10.1007/978-3-319-11653-2_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11652-5

  • Online ISBN: 978-3-319-11653-2

  • eBook Packages: Computer Science, Computer Science (R0)
