ABSTRACT
When automated data collection feeds a data warehouse while data volumes grow and the business model of the enterprise becomes more complex, controlling the data model of the information system becomes a priority task. This article discusses a method for developing a metadata repository, focusing on the metadata that describes business objects and the relationships between them. The choice of the "Data vault" methodology means that the data warehouse is built within an information system following the classical design approach with a three-level data presentation architecture: a data preparation area (or online data warehouse), the data warehouse itself, and thematic data marts. The proposed approach organizes data storage within the data warehouse using a metadata repository based on the multidimensional organization principle. The metadata repository is responsible for the data collection process, the data storage process, and the presentation of data for analysis. It is presented as a metamodel that is semantically related to the domain of the system, is easily rebuilt when the business model of the domain changes, and allows data marts to be created with a multidimensional data model structure based on the relational star schema. This makes it possible to organize human-computer interaction when describing the metamodel using mainly knowledge about the structure of the subject area. The metamodel is described in the language of first-order predicate calculus, which makes it possible to control the metamodel in a declarative programming style using the "Prolog" language. The key point in the structure of the information system is the transition from the "Data vault" model to a multidimensional data representation model, based on associative rules of dependence between information objects.
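To make the transition from a "Data vault" description to a star-shaped data mart more concrete, the following is a minimal Python sketch. It is not the authors' method: the article describes the metamodel in first-order predicate calculus and controls it with Prolog, whereas this sketch assumes a simplified and commonly used mapping in which a link becomes a fact table and each hub it connects becomes a dimension. All class names, the `derive_star` function, and the sample data are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hub:
    """A business object identified by a business key (hypothetical model)."""
    name: str
    business_key: str


@dataclass(frozen=True)
class Link:
    """A relationship between two or more hubs."""
    name: str
    hubs: tuple[str, ...]


@dataclass(frozen=True)
class Satellite:
    """Descriptive attributes attached to a hub or a link."""
    name: str
    parent: str
    attributes: tuple[str, ...]


def derive_star(link: Link, hubs: list[Hub], satellites: list[Satellite]) -> dict:
    """Describe a star schema for one link.

    Assumption: the link maps to the fact table, each connected hub maps
    to a dimension, and satellite attributes follow their parent.
    """
    hub_by_name = {h.name: h for h in hubs}

    def attrs_of(parent: str) -> list[str]:
        # Collect the attributes of every satellite attached to `parent`.
        return [a for s in satellites if s.parent == parent for a in s.attributes]

    dimensions = {
        f"dim_{h}": [hub_by_name[h].business_key] + attrs_of(h)
        for h in link.hubs
    }
    fact = {
        "name": f"fact_{link.name}",
        "foreign_keys": [f"dim_{h}" for h in link.hubs],
        "measures": attrs_of(link.name),
    }
    return {"fact": fact, "dimensions": dimensions}


# Hypothetical sample metadata for a sales domain.
hubs = [Hub("customer", "customer_no"), Hub("product", "sku")]
link = Link("sale", ("customer", "product"))
satellites = [
    Satellite("sat_customer", "customer", ("name", "region")),
    Satellite("sat_product", "product", ("category",)),
    Satellite("sat_sale", "sale", ("quantity", "amount")),
]

star = derive_star(link, hubs, satellites)
print(star["fact"])
print(star["dimensions"])
```

Because the star description is derived from the metadata, a change in the business model would only require editing the hub, link, and satellite records. This is the kind of flexibility the abstract attributes to the metamodel, shown here only in a much simplified form.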
Index Terms
- Multidimensional Information Systems Metadata Repository Development with a Data Warehouse Structure Using "Data Vault" Methodology