Authors: Lisa Ehrlinger 1,2; Alexander Gindlhumer 1; Lisa-Marie Huber 1 and Wolfram Wöß 1

Affiliations: 1 Johannes Kepler University Linz, Altenberger Straße 69, 4040 Linz, Austria; 2 Software Competence Center Hagenberg GmbH, Softwarepark 32a, 4232 Hagenberg, Austria

Keyword(s): Data Quality Monitoring, Data Profiling, Automation, Knowledge Graphs, Heterogeneous Data, Sensor Data.

Abstract: High data quality (e.g., completeness, accuracy, non-redundancy) is essential to ensure the trustworthiness of AI applications. In such applications, huge amounts of data are integrated from different heterogeneous sources, and complete, global domain knowledge is often not available. This scenario has a number of negative effects: in particular, it is difficult to monitor data quality centrally, and manual data curation is not feasible. To overcome these problems, we developed DQ-MeeRKat, a data quality tool that implements a new method to automate data quality monitoring. DQ-MeeRKat uses a knowledge graph to represent a global, homogenized view of local data sources. This knowledge graph is annotated with reference data profiles, which serve as a quasi-gold-standard to automatically verify the quality of modified data. We evaluated DQ-MeeRKat on six real-world data streams with qualitative feedback from the data owners. In contrast to existing data quality tools, DQ-MeeRKat does not require domain experts to define rules, but can be fully automated.
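
The idea of checking modified data against a reference data profile can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the class `ReferenceProfile`, the functions `build_profile` and `conforms`, the choice of statistics (minimum, maximum, average, ratio of missing values), and the tolerance rule are all hypothetical, and DQ-MeeRKat's actual profiles and conformance checks may differ.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ReferenceProfile:
    """Hypothetical profile of one numeric attribute (not the paper's schema)."""
    minimum: float
    maximum: float
    average: float
    null_ratio: float


def build_profile(values):
    """Compute a profile from a batch of values; None counts as missing."""
    present = [v for v in values if v is not None]
    return ReferenceProfile(
        minimum=min(present),
        maximum=max(present),
        average=mean(present),
        null_ratio=(len(values) - len(present)) / len(values),
    )


def conforms(reference, current, tolerance=0.1):
    """Compare a current profile with the reference, statistic by statistic.

    Assumed rule: minimum, maximum, and average may deviate by at most
    `tolerance` times the reference value range; the null ratio may deviate
    by at most `tolerance` in absolute terms.
    """
    allowed = tolerance * (reference.maximum - reference.minimum)
    return {
        "minimum": abs(current.minimum - reference.minimum) <= allowed,
        "maximum": abs(current.maximum - reference.maximum) <= allowed,
        "average": abs(current.average - reference.average) <= allowed,
        "null_ratio": abs(current.null_ratio - reference.null_ratio) <= tolerance,
    }


# Usage: the reference profile is computed once from data assumed to be of
# good quality, then each new batch is profiled and compared against it.
reference = build_profile([20.0, 21.0, 22.0, 23.0])
new_batch = build_profile([20.0, None, 35.0, None])
report = conforms(reference, new_batch)
```

In this sketch no hand-written rule is involved: the reference profile is derived from the data itself, which mirrors the abstract's point that domain experts do not need to define rules.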

CC BY-NC-ND 4.0

Paper citation in several formats:
Ehrlinger, L.; Gindlhumer, A.; Huber, L. and Wöß, W. (2021). DQ-MeeRKat: Automating Data Quality Monitoring with a Reference-Data-Profile-Annotated Knowledge Graph. In Proceedings of the 10th International Conference on Data Science, Technology and Applications - DATA; ISBN 978-989-758-521-0; ISSN 2184-285X, SciTePress, pages 215-222. DOI: 10.5220/0010546202150222

@conference{data21,
author={Lisa Ehrlinger and Alexander Gindlhumer and Lisa{-}Marie Huber and Wolfram Wöß},
title={DQ-MeeRKat: Automating Data Quality Monitoring with a Reference-Data-Profile-Annotated Knowledge Graph},
booktitle={Proceedings of the 10th International Conference on Data Science, Technology and Applications - DATA},
year={2021},
pages={215-222},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010546202150222},
isbn={978-989-758-521-0},
issn={2184-285X},
}

TY - CONF
JO - Proceedings of the 10th International Conference on Data Science, Technology and Applications - DATA
TI - DQ-MeeRKat: Automating Data Quality Monitoring with a Reference-Data-Profile-Annotated Knowledge Graph
SN - 978-989-758-521-0
IS - 2184-285X
AU - Ehrlinger, L.
AU - Gindlhumer, A.
AU - Huber, L.
AU - Wöß, W.
PY - 2021
SP - 215
EP - 222
DO - 10.5220/0010546202150222
PB - SciTePress