Retrieval of bibliographic records using Apache Lucene

Branko Milosavljević (Faculty of Technical Sciences, Novi Sad, Serbia)
Danijela Boberić (Faculty of Sciences, Novi Sad, Serbia)
and
Dušan Surla (Faculty of Sciences, Novi Sad, Serbia)

The Electronic Library

ISSN: 0264-0473

Article publication date: 10 August 2010

Abstract

Purpose

The aim of the research is to model and implement a software component for the retrieval of bibliographic records using the Apache Lucene retrieval engine.

Design/methodology/approach

An object‐oriented methodology is used for modeling and implementing the bibliographic record retrieval engine. Modeling is carried out in a CASE tool that supports the Unified Modeling Language (UML 2.0), while the implementation uses the Java programming language and open source components.

Findings

The result is a software component for the retrieval of bibliographic records that is independent of the bibliographic format used in cataloging. Search types can be configured flexibly without changing the software implementation.

Research limitations/implications

One constraint of the system concerns searching linking entry fields. The UNIMARC format defines fields that link the item being cataloged to another bibliographic item; these fields may contain other fields, which can be termed secondary fields. In the proposed solution, secondary fields are treated like all other fields, so no information is kept about whether a search term belongs to a secondary or a regular field.
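
The limitation can be illustrated with a minimal Java sketch. It is not the BISIS implementation: the `Field` class, the `flatten` method and the sample values are hypothetical, and the record model is heavily simplified. It only shows how a subfield embedded in a linking entry field (here UNIMARC field 461) ends up under the same key as the same subfield in a regular field once the embedding is ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class SecondaryFieldSketch {

    /** Simplified field (hypothetical model): a tag, its subfields, and embedded secondary fields. */
    static class Field {
        final String tag;
        final List<String[]> subfields = new ArrayList<>(); // each entry is {code, value}
        final List<Field> secondary = new ArrayList<>();

        Field(String tag) { this.tag = tag; }

        Field sub(String code, String value) {
            subfields.add(new String[] {code, value});
            return this;
        }

        Field embed(Field inner) {
            secondary.add(inner);
            return this;
        }
    }

    /** Collects "tag+code=value" entries, descending into embedded fields without marking them. */
    static void flatten(Field field, List<String> out) {
        for (String[] sf : field.subfields) {
            out.add(field.tag + sf[0] + "=" + sf[1]);
        }
        for (Field inner : field.secondary) {
            flatten(inner, out); // the fact that 'inner' was embedded is lost here
        }
    }

    public static void main(String[] args) {
        Field title = new Field("200").sub("a", "Main title");
        Field link = new Field("461").embed(new Field("200").sub("a", "Linked title"));

        List<String> entries = new ArrayList<>();
        flatten(title, entries);
        flatten(link, entries);
        System.out.println(entries); // both titles appear under the key 200a
    }
}
```

In this sketch a search on the title subfield would match both values, with nothing to tell the regular field from the secondary one.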

Practical implications

The proposed solution is integrated into version 4 of the BISIS library information system, which is in use at university, public and special libraries. This version improves system performance and the flexibility of the indexing process, while enabling librarians to perform sophisticated and effective retrieval of bibliographic records.

Originality/value

The contribution of this work is the design of a customizable record retrieval component. It is configured by an XML document that specifies mapping rules between subfields of the bibliographic record format and search types, so new mapping rules can be added without additional programming. Particular attention has been paid to indexing subfields that contain punctuation marks with special semantic meaning for librarians, and to transliteration between Cyrillic and Latin scripts. The originality of the work also lies in its use of the Apache Lucene search engine, which facilitates building highly flexible and efficient retrieval systems.
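
A configuration-driven mapping of this kind might look roughly like the following Java sketch. The XML vocabulary (`mappings`, `searchType`, `subfield`), the search type names `AU` and `TI`, and the class and method names are all assumptions made for illustration; the abstract does not give the actual document format used in BISIS. The sketch uses only the JDK's built-in XML parser and leaves out the Lucene indexing step.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class MappingConfigSketch {

    // Hypothetical mapping document; element and attribute names are invented for this example.
    static final String SAMPLE_XML =
        "<mappings>"
      + "<searchType name=\"AU\"><subfield code=\"700a\"/><subfield code=\"701a\"/></searchType>"
      + "<searchType name=\"TI\"><subfield code=\"200a\"/></searchType>"
      + "</mappings>";

    /** Parses the mapping document into: subfield code -> search types it should be indexed under. */
    static Map<String, List<String>> loadMappings(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, List<String>> result = new LinkedHashMap<>();
        NodeList types = doc.getElementsByTagName("searchType");
        for (int i = 0; i < types.getLength(); i++) {
            Element type = (Element) types.item(i);
            String name = type.getAttribute("name");
            NodeList subfields = type.getElementsByTagName("subfield");
            for (int j = 0; j < subfields.getLength(); j++) {
                String code = ((Element) subfields.item(j)).getAttribute("code");
                result.computeIfAbsent(code, k -> new ArrayList<>()).add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // An indexer could consult this map for each subfield to decide which index fields to fill.
        System.out.println(loadMappings(SAMPLE_XML));
    }
}
```

Under these assumptions, supporting a new search type means adding another `searchType` element to the document; the Java code stays unchanged.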

Citation

Milosavljević, B., Boberić, D. and Surla, D. (2010), "Retrieval of bibliographic records using Apache Lucene", The Electronic Library, Vol. 28 No. 4, pp. 525-539. https://doi.org/10.1108/02640471011065355

Publisher

Emerald Group Publishing Limited

Copyright © 2010, Emerald Group Publishing Limited
