DOI: 10.1145/3373722.3373777
research-article

Multidimensional Information Systems Metadata Repository Development with a Data Warehouse Structure Using "Data Vault" Methodology

Published: 22 January 2020

ABSTRACT

When automated data collection into a data warehouse is organized under growing data volumes and an increasingly complex enterprise business model, control of the information system's data model becomes a priority task. This article discusses a method for developing a metadata repository, focusing on the metadata that describes business objects and the relationships between them. Choosing "Data Vault" means that the data warehouse is built within the information system according to the classical design approach with a three-level data presentation architecture: a data preparation area (or online data warehouse), the data warehouse itself, and thematic data marts. The proposed approach organizes data storage within the data warehouse through a metadata repository based on the multidimensional organization principle. The metadata repository governs the data collection process, the data storage process, and the presentation of data for analysis. It takes the form of a metamodel that is semantically related to the domain of the system, is easily rebuilt when the business model of the domain changes, and allows data marts to be created with a multidimensional data model structure based on the star relational schema. As a result, human-computer interaction during metamodel description can rely mainly on knowledge of the structure of the subject area. The metamodel is described in the language of first-order predicate calculus, which makes it possible to manage the metamodel in a declarative programming style, using the Prolog language. The key element of the information system's structure is the method of transition from the "Data Vault" model to a multidimensional data representation model, based on associative rules of dependence between information objects.
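
To make the idea of a declaratively described metamodel more concrete, the following is a minimal sketch in Python rather than Prolog. It is not the authors' metamodel or their transition rules, which the abstract does not spell out. The hub, link, and satellite names (`customer`, `product`, `sale`, and so on) and the mapping rule "a link becomes a fact table, its hubs become dimensions" are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical metadata facts, laid out the way Prolog facts might be:
# hub(Name), link(Name, Hubs), satellite(Name, Parent, Attributes).
# None of these names come from the paper.
HUBS = {"customer", "product"}
LINKS = {"sale": ("customer", "product")}
SATELLITES = {
    "customer_details": ("customer", ("name", "city")),
    "product_details": ("product", ("title", "category")),
    "sale_amounts": ("sale", ("quantity", "price")),
}


@dataclass(frozen=True)
class StarSchema:
    fact_table: str
    measures: tuple
    dimensions: dict  # dimension table name -> tuple of attribute names


def attributes_of(parent):
    """Collect the attributes of every satellite attached to a hub or link."""
    attrs = []
    for _name, (owner, columns) in sorted(SATELLITES.items()):
        if owner == parent:
            attrs.extend(columns)
    return tuple(attrs)


def derive_star(link_name):
    """Assumed rule: a link becomes a fact table, its hubs become dimensions."""
    hubs = LINKS[link_name]
    unknown = [h for h in hubs if h not in HUBS]
    if unknown:
        raise ValueError(f"link {link_name!r} references unknown hubs: {unknown}")
    return StarSchema(
        fact_table=f"fact_{link_name}",
        measures=attributes_of(link_name),
        dimensions={f"dim_{h}": attributes_of(h) for h in hubs},
    )


star = derive_star("sale")
print(star.fact_table, star.measures, star.dimensions)
```

Because the schema is derived from the metadata facts alone, adding a hub or satellite to the dictionaries changes the resulting data mart structure without touching the derivation rule, which is the property the abstract attributes to a metamodel that is "easily rebuilt" when the business model changes.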

    • Published in

      ACM Other conferences
      CSIS'2019: Proceedings of the XI International Scientific Conference Communicative Strategies of the Information Society
      October 2019
      176 pages
      ISBN:9781450376709
      DOI:10.1145/3373722

      Copyright © 2019 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • research-article
      • Research
      • Refereed limited

      Acceptance Rates

      CSIS'2019 Paper Acceptance Rate: 30 of 50 submissions, 60%
      Overall Acceptance Rate: 30 of 50 submissions, 60%