Abstract
Determining the metadata of a multidimensional information system amounts to describing the parameters of the cells that hold information about the facts included in the multidimensional data cube. Classification schemes can be used to construct the metadata. Each classification scheme corresponds to a certain structural component of the observed phenomenon. A classification scheme presents the cell parameters in hierarchical form, and the parameters are combined in the metadata when several classification schemes are connected. To construct the hierarchy of elements of a classification scheme, it is necessary to identify groups of members that are semantically connected with groups of members of other dimensions. The Cartesian product can then be applied to these groups of members, which forms clusters of member combinations in the metadata. The complete metadata structure is obtained by combining all clusters. When the number of aspects of analysis is large, the multidimensional data cube has specific properties related to sparsity. The use of classification schemes makes it possible to identify the parts of the metadata that correspond to individual structural components of the observed phenomenon. If the multidimensional data cube is constructed in the course of automated data collection, the “Data Vault” methodology can be used to describe the metadata. This methodology makes it possible to reflect the relationships between business objects in the metadata.
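The cluster construction outlined in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the dimension names, the member groups, and the helper functions `cluster_cells` and `metadata_cells` are hypothetical and chosen only for illustration. Each cluster assigns one group of members to every dimension, the Cartesian product of those groups gives the member combinations of the cluster, and the metadata is taken as the union of all clusters. The last lines compare the number of described cells with the size of the full cube to hint at the sparsity mentioned above.

```python
from itertools import product

# Hypothetical dimensions of the cube; the names are assumptions, not taken from the paper.
DIMENSIONS = ("product", "region", "measure")


def cluster_cells(groups):
    """Return the member combinations of one cluster.

    `groups` maps each dimension to the group of its members that are
    semantically connected within this cluster. The combinations are the
    Cartesian product of these groups, in the order given by DIMENSIONS.
    """
    return set(product(*(groups[d] for d in DIMENSIONS)))


def metadata_cells(clusters):
    """Combine all clusters into one set of admissible cells."""
    cells = set()
    for groups in clusters:
        cells |= cluster_cells(groups)
    return cells


# Two illustrative clusters with made-up members.
clusters = [
    {
        "product": ["laptop", "phone"],
        "region": ["EU", "US"],
        "measure": ["units_sold", "revenue"],
    },
    {
        "product": ["support_plan"],
        "region": ["EU"],
        "measure": ["active_contracts"],
    },
]

cells = metadata_cells(clusters)

# Size of the full cube: product of the member counts of all dimensions.
full_cube_size = 1
for d in DIMENSIONS:
    full_cube_size *= len({m for groups in clusters for m in groups[d]})

fill_ratio = len(cells) / full_cube_size
print(len(cells), full_cube_size, fill_ratio)
```

In this toy setting the clusters describe only part of the full cube, because combinations such as a laptop with the `active_contracts` measure have no meaning and never appear in the metadata.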
This paper has been supported by the RUDN University Strategic Academic Leadership Program.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Fomin, M. (2023). Multidimensional Information System Metadata Description Using the “Data Vault” Methodology. In: Vishnevskiy, V.M., Samouylov, K.E., Kozyrev, D.V. (eds) Distributed Computer and Communication Networks. DCCN 2022. Communications in Computer and Information Science, vol 1748. Springer, Cham. https://doi.org/10.1007/978-3-031-30648-8_2
DOI: https://doi.org/10.1007/978-3-031-30648-8_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30647-1
Online ISBN: 978-3-031-30648-8
eBook Packages: Computer Science, Computer Science (R0)