ABSTRACT
The quality of data and metadata affects the interoperability of collections and the quality of all processing. Our metadata quality metric helps administrators of harvested collections detect and correct the weaknesses of their metadata, and helps harvesters locate the collections with the most problematic metadata and prompt their administrators to improve it. We extended and applied an adaptive quantitative metadata quality metric and a tool that implements it. For controlled values, the metric considers the distribution of the values; for free-text values, it considers the length of the description. We also consider additional information in the OAI-PMH XML responses, such as XML attributes, that is not normally mapped to metadata elements but still carries metadata information. We used the tool to make quality observations, to examine collections for patterns and irregularities, and to produce appropriate advice for the collection administrators. Some of these observations are presented here. We compared the reported quality over a 3-year period to get a general quantitative and qualitative sense of the diversity in the record descriptions and of the changes in their quality during their lifetime. We verified the assumption that quality increases over time: usually by a small amount in every collection, and substantially in a small number of collections. In addition, the lower-quality collections are the ones that stop responding and vanish.
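The abstract does not state the formulas behind the metric, so the following Python sketch is only an illustration of the two ideas it names: scoring controlled values by their value distribution and free-text values by their length. The function names, the use of normalized Shannon entropy, and the `target_length` cap are assumptions made for this example, not the paper's actual definitions.

```python
import math
from collections import Counter


def distribution_score(values):
    """Score a controlled-vocabulary element by how evenly its values are spread.

    Assumption: normalized Shannon entropy in [0, 1] stands in for the
    paper's unspecified "value distribution" component.
    """
    counts = Counter(values)
    if len(counts) < 2:
        # Zero or one distinct value carries no distinguishing information.
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(len(counts))


def length_score(text, target_length=100):
    """Score a free-text element by its length, capped at target_length.

    Assumption: target_length is a hypothetical tuning parameter; the
    abstract only says that description length is considered.
    """
    return min(len(text.strip()), target_length) / target_length


if __name__ == "__main__":
    # Hypothetical values of a controlled element across four harvested records.
    print(distribution_score(["en", "en", "el", "el"]))
    # Hypothetical free-text description.
    print(length_score("A short description of the resource."))
```

Under these assumptions, an element that repeats a single value in every record scores 0, while longer descriptions score higher up to the cap. An adaptive metric would presumably weight such per-element scores differently per collection, which this sketch does not attempt.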
Index Terms
- Rating quality in metadata harvesting