Abstract
We present a framework and methodology to benchmark NoSQL stores for large scale model persistence. NoSQL technologies potentially improve performance of some applications and provide schema-less data-structures, so are particularly suited to persisting large and heterogeneous models. Recent studies consider only a narrow set of NoSQL stores for large scale modelling. Benchmarking many technologies requires substantial effort due to the disparate interface each store provides. Our experiments compare a broad range of NoSQL stores in terms of processor time and disc space used. The framework and methodology is evaluated through a case study that involves persisting large reverse-engineered models of open source projects. The results give tool engineers and practitioners a basis for selecting a store to persist large models.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Steinberg, D., Budinsky, F., Merks, E., Paternostro, M.: EMF: Eclipse modeling framework. Pearson Education (2008)
Kolovos, D.S., Rose, L.M., Matragkas, N., Paige, R.F., Guerra, E., Cuadrado, J.S., De Lara, J., Ráth, I., Varró, D., Tisi, M., Cabot, J.: A Research Roadmap Towards Achieving Scalability in Model Driven Engineering. In: Proceedings of the Workshop on Scalability in Model Driven Engineering, BigMDE 2013, pp. 2:1–2:10. ACM, New York (2013)
Barmpis, K., Kolovos, D.S.: Evaluation of Contemporary Graph Databases for Efficient Persistence of Large-Scale Models. Journal of Object Technology (to appear, 2014)
Espinazo Pagán, J., Sánchez Cuadrado, J., GarcÃa Molina, J.: Morsa: A Scalable Approach for Persisting and Accessing Large Models. In: Whittle, J., Clark, T., Kühne, T. (eds.) MODELS 2011. LNCS, vol. 6981, pp. 77–92. Springer, Heidelberg (2011)
Fitzpatrick, B.: Distributed caching with memcached. Linux Journal 2004(124), 5 (2004)
DeCandia, G., Hastorun, D., Jampani, M., Kakulapati, G., Lakshman, A., Pilchin, A., Sivasubramanian, S., Vosshall, P., Vogels, W.: Dynamo: Amazon’s highly available key-value store. SIGOPS Oper. Syst. Rev. 41(6), 205–220 (2007)
Fink, B.: Distributed computation on dynamo-style distributed storage: Riak pipe. In: Hoffman, T., Hughes, J. (eds.) Erlang Workshop, pp. 43–50. ACM (2012)
Fuchs, A.: Accumulo–Extensions to Google’s Bigtable Design (2012)
Auradkar, A., Botev, C., Das, S., De Maagd, D., Feinberg, A., Ganti, P., Gao, L., Ghosh, B., Gopalakrishna, K., Harris, B., Koshy, J., Krawez, K., Kreps, J., Lu, S., Nagaraj, S., Narkhede, N., Pachev, S., Perisic, I., Qiao, L., Quiggle, T., Rao, J., Schulman, B., Sebastian, A., Seeliger, O., Silberstein, A., Shkolnik, B., Soman, C., Sumbaly, R., Surlaker, K., Topiwala, S., Tran, C., Varadarajan, B., Westerman, J., White, Z., Zhang, D., Zhang, J.: Data Infrastructure at LinkedIn. In: 2012 IEEE 28th International Conference on Data Engineering (ICDE), pp. 1370–1381 (April 2012)
Chodorow, K., Dirolf, M.: MongoDB - The Definitive Guide: Powerful and Scalable Data Storage. O’Reilly (2010)
Brown, M.C.: Getting Started with CouchDB - Extreme Scalability at Your Fingertips. O’Reilly (2012)
ArangoDB, https://www.arangodb.org
Chang, F., Dean, J., Ghemawat, S., Hsieh, W.C., Wallach, D.A., Burrows, M., Chandra, T., Fikes, A., Gruber, R.E.: Bigtable: A distributed storage system for structured data. In: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2006 (2006)
Lakshman, A., Malik, P.: Cassandra: A decentralized structured storage system. Operating Systems Review 44(2), 35–40 (2010)
George, L.: HBase: The Definitive Guide, 1st edn. O’Reilly Media (2011)
Webber, J.: A programmatic introduction to Neo4j. In: Leavens, G.T. (ed.) SPLASH, pp. 217–218. ACM (2012)
OrientDB, http://www.orientechnologies.com/orientdb .
TitanDB, http://thinkaurelius.github.io/titan
Kuhlmann, M., Hamann, L., Gogolla, M., Büttner, F.: A benchmark for OCL engine accuracy, determinateness, and efficiency. Software and System Modeling 11(2), 165–182 (2012)
Bergmann, G., Ujhelyi, Z., Ráth, I., Varró, D.: A Graph Query Language for EMF Models. In: Cabot, J., Visser, E. (eds.) ICMT 2011. LNCS, vol. 6707, pp. 167–182. Springer, Heidelberg (2011)
Varró, G., Schürr, A., Varró, D.: Benchmarking for Graph Transformation. In: VL/HCC, pp. 79–88 (2005)
Barmpis, K., Kolovos, D.S.: Comparative Analysis of Data Persistence Technologies for Large-Scale Models. In: XM@MoDELS (2012)
(CDO): Connected Data Objects, http://www.eclipse.org/cdo/documentation/index.php
Paige, R.F., Kolovos, D.S., Rose, L.M., Drivalos, N., Polack, F.A.C.: The Design of a Conceptual Framework and Technical Infrastructure for Model Management Language Engineering. In: Proc. 14th IEEE International Conference on Engineering of Complex Computer Systems, Potsdam, Germany (2009)
MongoEMF, https://github.com/BryanHunt/mongo-emf
Neo4EMF, http://neo4emf.com
MySQL: http://www.mysql.com/.
ObjectivityDB, http://www.objectivity.com/products/objectivitydb
Scheidgen, M., Zubow, A., Fischer, J., Kolbe, T.H.: Automated and transparent model fragmentation for persisting large models. In: France, R.B., Kazmeier, J., Breu, R., Atkinson, C. (eds.) MODELS 2012. LNCS, vol. 7590, pp. 102–118. Springer, Heidelberg (2012)
Barmpis, K., Kolovos, D.: Hawk: Towards a scalable model indexing architecture. In: Proceedings of the Workshop on Scalability in Model Driven Engineering, BigMDE 2013, pp. 6:1–6:9. ACM, New York (2013)
Cooper, B.F., Silberstein, A., Tam, E., Ramakrishnan, R., Sears, R.: Benchmarking cloud serving systems with YCSB. In: Proceedings of the 1st ACM Symposium on Cloud Computing, pp. 143–154. ACM (2010)
Bruneliere, H., Cabot, J., Jouault, F., Madiot, F.: MoDisco: A generic and extensible framework for model driven reverse engineering. In: Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, pp. 173–174. ACM (2010)
Ait-Ameur, Y., Besnard, F., Girard, P., Pierra, G., Potier, J.C.: Formal specification and metaprogramming in the EXPRESS language. In: Intern. Conference on Software Engineering and Knowledge Engineering SEKE, vol. 95, pp. 181–189 (1995)
TinkerPop: Blueprints, https://github.com/tinkerpop/blueprints/wiki
Broekstra, J., Kampman, A., van Harmelen, F.: Sesame: A generic architecture for storing and querying rdf and rdf schema. In: Horrocks, I., Hendler, J. (eds.) ISWC 2002. LNCS, vol. 2342, pp. 54–68. Springer, Heidelberg (2002)
AccumuloDB, https://accumulo.apache.org/
FoundationDB, https://foundationdb.com/
Seltzer, M.: Oracle nosql database. Oracle White Paper (2011)
Brewer, E.A.: Towards robust distributed systems. In: PODC (2000)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Shah, S.M., Wei, R., Kolovos, D.S., Rose, L.M., Paige, R.F., Barmpis, K. (2014). A Framework to Benchmark NoSQL Data Stores for Large-Scale Model Persistence. In: Dingel, J., Schulte, W., Ramos, I., Abrahão, S., Insfran, E. (eds) Model-Driven Engineering Languages and Systems. MODELS 2014. Lecture Notes in Computer Science, vol 8767. Springer, Cham. https://doi.org/10.1007/978-3-319-11653-2_36
Download citation
DOI: https://doi.org/10.1007/978-3-319-11653-2_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11652-5
Online ISBN: 978-3-319-11653-2
eBook Packages: Computer ScienceComputer Science (R0)