Multidimensional Information System Metadata Description Using the “Data Vault” Methodology

Fomin, Maxim

doi:10.1007/978-3-031-30648-8_2

Maxim Fomin ORCID: orcid.org/0000-0002-7924-9743

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1748)

Included in the following conference series:

International Conference on Distributed Computer and Communication Networks

Abstract

The task of determining the metadata of a multidimensional information system amounts to describing the parameters of the cells that hold information about the facts included in the multidimensional data cube. Classification schemes can be used to construct the metadata. Each classification scheme corresponds to a certain structural component of the observed phenomenon. Within a classification scheme the cell parameters are presented in hierarchical form, and they are combined in the metadata when several classification schemes are connected. To construct the hierarchy of elements of a classification scheme, it is necessary to identify groups of members that are semantically connected with groups of members of other dimensions. The Cartesian product can then be applied to these groups of members, which forms clusters of member combinations in the metadata. The complete metadata structure is obtained by combining all clusters. When the number of aspects of analysis is large, the multidimensional data cube has specific properties related to sparsity. The use of classification schemes makes it possible to identify the parts of the metadata that correspond to individual structural components of the observed phenomenon. If the multidimensional data cube is constructed in the process of automated data collection, the “Data Vault” methodology can be used to describe the metadata. This methodology makes it possible to reflect the relationships between business objects in the metadata.
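
As an illustration of the cluster idea described in the abstract, the following minimal Python sketch builds clusters as Cartesian products of member groups and combines them into the set of admissible cells. It is not taken from the paper: the dimensions (`product`, `region`), the member names, and the helper functions `cluster_cells`, `metadata_cells`, and `full_cube_size` are all hypothetical and chosen only to make the construction concrete.

```python
from itertools import product

# Hypothetical dimensions of the cube (an assumption for illustration).
DIMENSIONS = ("product", "region")

# Each cluster pairs semantically connected groups of members,
# one group per dimension. All member names are invented.
clusters = [
    {"product": ["laptop", "phone"], "region": ["EU", "US"]},
    {"product": ["tractor"], "region": ["EU", "ASIA"]},
]


def cluster_cells(cluster):
    """Cartesian product of the member groups of a single cluster."""
    return set(product(*(cluster[d] for d in DIMENSIONS)))


def metadata_cells(clusters):
    """Union of all clusters: the admissible member combinations."""
    cells = set()
    for cluster in clusters:
        cells |= cluster_cells(cluster)
    return cells


def full_cube_size(clusters):
    """Cell count of the full cube over all distinct members per dimension."""
    size = 1
    for d in DIMENSIONS:
        members = {m for cluster in clusters for m in cluster[d]}
        size *= len(members)
    return size


cells = metadata_cells(clusters)
print(len(cells), "admissible cells out of", full_cube_size(clusters))
```

In this toy setting the combined clusters cover only part of the full cube, which is one way to picture the sparsity the abstract mentions: a combination such as `("tractor", "US")` belongs to no cluster and therefore has no cell in the metadata.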

This paper has been supported by the RUDN University Strategic Academic Leadership Program.


Author information

Authors and Affiliations

Peoples’ Friendship University of Russia (RUDN University), Miklukho-Maklaya st. 6, Moscow, 117198, Russian Federation
Maxim Fomin

Corresponding author

Correspondence to Maxim Fomin.

Editor information

Editors and Affiliations

V. A. Trapeznikov Institute of Control Sciences of RAS, Moscow, Russia
Vladimir M. Vishnevskiy
RUDN University, Moscow, Russia
Konstantin E. Samouylov
V. A. Trapeznikov Institute of Control Sciences of RAS, Moscow, Russia
Dmitry V. Kozyrev

About this paper

Cite this paper

Fomin, M. (2023). Multidimensional Information System Metadata Description Using the “Data Vault” Methodology. In: Vishnevskiy, V.M., Samouylov, K.E., Kozyrev, D.V. (eds) Distributed Computer and Communication Networks. DCCN 2022. Communications in Computer and Information Science, vol 1748. Springer, Cham. https://doi.org/10.1007/978-3-031-30648-8_2

DOI: https://doi.org/10.1007/978-3-031-30648-8_2
Published: 01 May 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30647-1
Online ISBN: 978-3-031-30648-8
eBook Packages: Computer Science, Computer Science (R0)
