Browsing Digital Collections with Reconfigurable Faceted Thesauri

  • Conference paper
Complexity in Information Systems Development

Abstract

Faceted thesauri group classification terms into hierarchically arranged facets. They enable faceted browsing, a well-known browsing technique that makes it possible to narrow down digital collections by recursively adding filtering terms from the facet hierarchy. In this paper we develop an approach to faceted browsing in live collections, in which not only the contents but also the thesauri can be constantly reorganized. For this purpose we start by introducing a digital collection model based on faceted thesauri, in which users can freely rearrange the hierarchical organization of facets. We then analyze how to react efficiently to thesaurus reconfigurations by representing all the possible ways of browsing a collection with a finite state machine called a navigation automaton. Since, in the worst case, the number of states in a navigation automaton can grow exponentially with the size of the collection, we propose two indexing strategies that avoid this exponential worst-case complexity: one based on inverted indexes, and another inspired by hierarchical clustering, which makes use of so-called navigation dendrograms. Experimental results obtained with Clavy, a system for managing digital collections with reconfigurable structures in digital humanities and educational settings, provide evidence that the navigation dendrogram organization outperforms the one based on inverted indexes.
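As an informal illustration of the ideas summarized above, the following Python sketch builds an inverted index over a toy collection, narrows the collection by evaluating a conjunctive query through set intersection, and enumerates the distinct resource sets reachable by adding terms, which play the role of navigation automaton states. The collection, the term names, and the function names are hypothetical and chosen only for this example; the sketch also assumes that empty result sets are not kept as states, and it is not the implementation used in Clavy.

```python
# Hypothetical toy collection: resource id -> set of annotating terms.
COLLECTION = {
    "r1": {"ceramic", "peru"},
    "r2": {"ceramic", "mexico"},
    "r3": {"textile", "peru"},
}


def build_inverted_index(collection):
    """Map each term to the set of resources annotated with it."""
    index = {}
    for resource, terms in collection.items():
        for term in terms:
            index.setdefault(term, set()).add(resource)
    return index


def narrow(index, all_resources, selected_terms):
    """Evaluate a conjunctive query: keep resources annotated by every term."""
    result = set(all_resources)
    for term in selected_terms:
        result &= index.get(term, set())
    return result


def navigation_states(collection):
    """Enumerate the distinct non-empty resource sets reachable by adding terms.

    Each distinct set stands for one state; adding a term that annotates
    every resource in the current set leads back to the same state.
    """
    index = build_inverted_index(collection)
    start = frozenset(collection)
    states = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for annotated in index.values():
            target = frozenset(state & annotated)
            # Assumption: empty result sets are not treated as states.
            if target and target not in states:
                states.add(target)
                frontier.append(target)
    return states


index = build_inverted_index(COLLECTION)
selected = narrow(index, COLLECTION, ["ceramic", "peru"])
states = navigation_states(COLLECTION)
```

Enumerating states explicitly, as `navigation_states` does, is what becomes infeasible for large collections, since the number of distinct resource sets can grow exponentially; `narrow` instead computes only the state that the user is currently visiting.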

A prior version of this paper has been published in the ISD2016 Proceedings (http://aisel.aisnet.org/isd2014/proceedings2016).

Notes

  1. Notice that S and S′ can be the same, namely when all the resources in R are annotated by t.

  2. Notice that, as indicated earlier, frameworks like Solr [7] support faceted browsing in a straightforward and efficient manner by identifying paths in the thesaurus with terms. In our context, however, these features are useless: thesauri can be reconfigured at any time, which invalidates this solution. We are therefore confined to explicitly evaluating conjunctive queries in each interaction state.

  3. clavy.fdi.ucm.es/Clavy/.

  4. oda-fec.org/ucm-chasqui.

  5. By feasible we mean avoiding cycles in the resulting thesaurus.

  6. repositorios.fdi.ucm.es/mnemosine/.

  7. repositorios.fdi.ucm.es/CIBERIA.

  8. repositorios.fdi.ucm.es/Tropos/.

  9. oda-fec.org/nata.
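The feasibility condition in note 5 (a rearrangement must not introduce cycles in the thesaurus) can be sketched as follows. This is a minimal Python illustration that assumes the thesaurus is stored as a child-to-parent map; the term names and the functions `creates_cycle` and `move_term` are hypothetical and are not taken from Clavy.

```python
# Hypothetical thesaurus as a child -> parent map; facet roots map to None.
THESAURUS = {
    "material": None,
    "ceramic": "material",
    "painted ceramic": "ceramic",
    "place": None,
    "peru": "place",
}


def creates_cycle(parent, term, new_parent):
    """Return True if placing `term` under `new_parent` would create a cycle.

    A cycle appears exactly when `term` is `new_parent` itself or one of
    its ancestors, so we walk upwards from `new_parent` to its root.
    """
    node = new_parent
    while node is not None:
        if node == term:
            return True
        node = parent.get(node)
    return False


def move_term(parent, term, new_parent):
    """Re-parent `term`, rejecting rearrangements that are not feasible."""
    if creates_cycle(parent, term, new_parent):
        raise ValueError(f"moving {term!r} under {new_parent!r} creates a cycle")
    parent[term] = new_parent


# Feasible: 'ceramic' (with its subtree) moves under the 'place' facet.
move_term(THESAURUS, "ceramic", "place")
```

The check walks a single ancestor chain, so its cost is proportional to the depth of the hierarchy rather than to the number of terms.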

References

  1. Berchtold, S., Böhm, C., Keim, D.-A., Kriegel, H.-P., Xiaowei, X.: Optimal multidimensional query processing using tree striping. In: Proceedings of the 2nd International Conference on Data Warehousing and Knowledge Discovery, pp. 244–257. Springer, London, UK (2000)

  2. Chengkai, L., Ning, Y., Senjuti, B.-R., Lekhendro, L., Gautam, D.: Facetedpedia: dynamic generation of query-dependent faceted interfaces for Wikipedia. In: Proceedings of the 19th International World Wide Web Conference, pp. 651–660. ACM, Raleigh, NC, USA (2010)

  3. Chodorow, K.: MongoDB: the definitive guide. O’Reilly (2013)

  4. Cigarrán-Recuero, J., Gayoso-Cabada, J., Rodríguez-Artacho, M., Romero-López, D., Sarasa-Cabezuelo, A., Sierra, J.-L.: Assessing semantic annotation activities with formal concept analysis. Expert Syst. Appl. 44(11), 5495–5508 (2014)

  5. Culpepper, J.-S., Moffat, A.: Efficient set intersection for inverted indexing. ACM Trans. Inf. Syst. 29(1), article 1 (2010)

  6. Godin, R., Saunders, G.: Lattice model of browsable data space. Inf. Sci. 40(2), 89–116 (1986)

  7. Grainger, T., Potter, T.: Solr in Action. Manning Publications (2014)

  8. Greene, G.-J., Dunaiski, M., Fischer, B.: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction. In: Proceedings of Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis, pp. 10–22. CEUR, Stellenbosch, South Africa (2015)

  9. Greene, G.-J., Fischer, B.: Interactive tag cloud visualization of software version control repositories. In: Proceedings of the 3rd IEEE Working Conference on Software Visualization, pp. 56–65. IEEE, Raleigh, NC, USA (2015)

  10. Greene, G.-J.: A generic framework for concept-based exploration of semi-structured software engineering data. In: Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, pp. 894–897. ACM, Lincoln, Nebraska, USA (2015)

  11. Hildebrand, M., van Ossenbruggen, J., Hardman, L.: /facet: a browser for heterogeneous semantic web repositories. In: Proceedings of the 5th International Semantic Web Conference, pp. 272–285. Springer, Athens, GA, USA (2006)

  12. Huang, J.-W., Chen, K.-Y., Chen, Y.-C., Yang, K.-N., Hwang, S., Huang, W.-C.: A novel spatial tag cloud using multi-level clustering. J. Inf. Sci. Eng. 30, 687–700 (2014)

  13. Jain, A.-K., Murty, M.-N., Flynn, P.-J.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)

  14. Kriegel, H.-P.: Performance comparison of index structures for multi-key retrieval. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 186–196. ACM, Boston, MA (1984)

  15. Kuznetsov, S.: On computing the size of a lattice and related decision problems. Order 18(4), 313–321 (2001)

  16. Li, R., Shenghua, B., Fei, B., Su, Z., Yu, Y.: Towards effective browsing of large scale social annotations. In: Proceedings of the 16th International World Wide Web Conference, pp. 943–952. ACM, Banff, Alberta, Canada (2007)

  17. McCandless, M., Hatcher, E., Gospodnetic, O.: Lucene in Action, 2nd edn. Manning Publications (2010)

  18. Perugini, S.: Supporting multiple paths to objects in information hierarchies: faceted classification, faceted search, and symbolic links. Inf. Proc. Manag. 46(1), 22–43 (2010)

  19. Radelaar, J., Boor, A.-J., Vandic, D., van Dam, J.-W., Fasinca, F.: Improving search and exploration in tag spaces using automated tag clustering. J. Web Eng. 13(3–4), 277–301 (2014)

  20. Sarmah, A.-K., Hazarika, S.-M., Sinha, S.-K.: Formal concept analysis: current trends and directions. Artif. Intell. Rev. 44(1), 47–86 (2015)

  21. Schraefel, M.-C., Smith, D.-A., Owens, A., Russell, A., Harris, C., Wilson, M.: The evolving mSpace platform: leveraging the semantic web on the trail of the memex. In: Proceedings of the 16th Conference on Hypertext, pp. 174–183. ACM, Salzburg, Austria (2005)

  22. Schraefel, M.-C., Wilson, M., Russell, A., Smith, D.-A.: mSpace: improving information access to multimedia domains with multimodal exploratory search. Commun. ACM 49(4), 47–49 (2006)

  23. Sierra, J.-L., Fernández-Valmayor, A., Guinea, M., Hernanz, H.: From research resources to learning objects: process model and virtualization experiences. Education. Tech. Soc. 9(3), 56–68 (2006)

  24. Sierra, J.-L., Fernández-Valmayor, A.: Tagging learning objects with evolving metadata schemas. In: Proceedings of the 8th IEEE International Conference on Advanced Learning Technologies, pp. 829–833. IEEE, Santander, Spain (2008)

  25. Smith, D.-A., Owens, A., Schraefel, M.-C., Sinclair, P., Max, P.-A., Wilson, A., Russell, A., Martinez, K., Lewis, P.: Challenges in supporting faceted semantic browsing of multimedia collections. In: Proceedings of the 2nd International Conference on Semantics and Digital Media Technologies, pp. 280–283. Springer, Genoa, Italy (2007)

  26. Tunkelang, D.: Faceted Search. Morgan & Claypool Publishers (2009)

  27. Uddin, M.-N., Janecek, P.: The implementation of faceted classification in web site searching and browsing. Online Inf. Rev. 31(2), 218–233 (2007)

  28. Way, T., Eklund, P.: Social tagging for digital libraries using formal concept analysis. In: Proceedings of the 17th International Conference on Concept Lattices and their Applications, pp. 139–150. Sevilla, Spain (2010)

  29. Wei, B., Liu, J., Zheng, Q.: A survey of faceted search. J. Web Eng. 12(1–2), 41–64 (2013)

  30. Yee, K.-P., Swearingen, K., Li, K., Hearst, M.: Faceted metadata for image search and browsing. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 401–408. ACM, Fort Lauderdale, Florida, USA (2003)

  31. Yitzhak, O.-B., Golbandj, N., Har’El, N., et al.: Beyond basic faceted search. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, pp. 33–44. ACM, Stanford, CA, USA (2008)

  32. Zhang, Z., Li, W., Gurrin, C., Smeaton, A.-F.: Faceted navigation for browsing large video collection. In: Proceedings of the 22nd International Conference on Multimedia Modelling, pp. 412–417. Springer, Miami, USA (2016)

  33. Zobel, J., Moffat, A.: Inverted files for text search engines. ACM Comput. Surv. 33(2), article 6 (2006)

Acknowledgements

This work has been supported by the BBVA Foundation (research grant HUM14_251) and by the Spanish Ministry of Economy and Competitiveness (research grant TIN2014-52010-R). The Chasqui repository was created and is maintained by Prof. Mercedes Guinea (currently a researcher at “El Caño” Foundation). The thesaurus used as an example in this work is adapted from Chasqui’s cataloguing schema. Chasqui’s original software infrastructure was developed by Alfredo Fernández-Valmayor (currently also at “El Caño” Foundation).

Author information

Corresponding author

Correspondence to José-Luis Sierra.

Copyright information

© 2017 Springer International Publishing Switzerland

About this paper

Cite this paper

Gayoso-Cabada, J., Rodríguez-Cerezo, D., Sierra, JL. (2017). Browsing Digital Collections with Reconfigurable Faceted Thesauri. In: Goluchowski, J., Pankowska, M., Linger, H., Barry, C., Lang, M., Schneider, C. (eds) Complexity in Information Systems Development. Lecture Notes in Information Systems and Organisation, vol 22. Springer, Cham. https://doi.org/10.1007/978-3-319-52593-8_5
