Abstract
Faceted thesauri group classification terms into hierarchically arranged facets. They enable faceted browsing, a well-known browsing technique that makes it possible to narrow down digital collections by recursively adding filtering terms from the facet hierarchy. In this paper we develop an approach to achieve faceted browsing in live collections, in which not only the contents but also the thesauri can be constantly reorganized. For this purpose we start by introducing a faceted thesauri-based digital collection model in which users can freely rearrange the hierarchical organization of facets. Then we analyze how to react efficiently to thesauri reconfigurations by representing all the possible ways of browsing a collection with a finite state machine called a navigation automaton. Since, in the worst case, the number of states in a navigation automaton can grow exponentially with the size of the collection, we propose two indexing strategies to avoid this exponential worst-case complexity: one based on inverted indexes, and another inspired by hierarchical clustering, which makes use of so-called navigation dendrograms. Experimental results concerning Clavy, a system for managing digital collections with reconfigurable structures in digital humanities and educational settings, provide evidence that the navigation dendrogram organization outperforms the inverted index-based one.
A prior version of this paper has been published in the ISD2016 Proceedings (http://aisel.aisnet.org/isd2014/proceedings2016).
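The inverted index strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not Clavy's implementation: the function names, the flat treatment of terms (the facet hierarchy is deliberately left out), and the sample resources are all assumptions made for illustration. Each browsing state is computed by explicitly evaluating a conjunctive query, that is, by intersecting the resource sets of the selected terms.

```python
from collections import defaultdict


def build_inverted_index(annotations):
    """Map each term to the set of resource ids annotated with it."""
    index = defaultdict(set)
    for resource, terms in annotations.items():
        for term in terms:
            index[term].add(resource)
    return index


def browse(index, all_resources, selected_terms):
    """Evaluate the conjunctive query given by the selected terms."""
    result = set(all_resources)
    # Intersect the smallest sets first to keep intermediate results small.
    for term in sorted(selected_terms, key=lambda t: len(index.get(t, ()))):
        result &= index.get(term, set())
    return result


def refinements(annotations, current, selected_terms):
    """Count, per unselected term, the resources of the current state it annotates."""
    counts = defaultdict(int)
    for resource in current:
        for term in annotations[resource]:
            if term not in selected_terms:
                counts[term] += 1
    return dict(counts)


# Hypothetical collection: resource id -> set of annotating terms.
annotations = {
    "r1": {"ceramic", "inca"},
    "r2": {"ceramic", "nazca"},
    "r3": {"textile", "inca"},
}
index = build_inverted_index(annotations)

state = browse(index, annotations, {"ceramic"})
print(sorted(state))                                        # ['r1', 'r2']
print(sorted(browse(index, annotations, {"ceramic", "inca"})))  # ['r1']
print(refinements(annotations, state, {"ceramic"}))
```

Because the index maps terms to resources and knows nothing about where a term sits in the thesaurus, rearranging the facet hierarchy would not require rebuilding it in this sketch; the cost is paid instead at query time, when the intersections are recomputed for every interaction state.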
Notes
- 1. Notice that S and S′ can be the same, namely when all the resources in R are annotated by t.
- 2. Notice that, as indicated earlier, frameworks like Solr [7] support faceted browsing in a straightforward and efficient manner by identifying paths in the thesaurus with terms. In our context, however, these features are useless: thesauri can be reconfigured at any time, which invalidates this solution. We are therefore confined to explicitly evaluating conjunctive queries in each interaction state.
- 3. clavy.fdi.ucm.es/Clavy/.
- 4. oda-fec.org/ucm-chasqui.
- 5. By feasible we mean avoiding cycles in the resulting thesaurus.
- 6. repositorios.fdi.ucm.es/mnemosine/.
- 7. repositorios.fdi.ucm.es/CIBERIA.
- 8. repositorios.fdi.ucm.es/Tropos/.
- 9. oda-fec.org/nata.
References
Berchtold, S., Böhm, C., Keim, D.-A., Kriegel, H.-P., Xiaowei, X.: Optimal multidimensional query processing using tree striping. In: Proceedings of the 2nd International Conference on Data Warehousing and Knowledge Discovery, pp. 244–257. Springer, London, UK (2000)
Chengkai, L., Ning, Y., Senjuti, B.-R., Lekhendro, L., Gautam, D.: Facetedpedia: dynamic generation of query-dependent faceted interfaces for Wikipedia. In: Proceedings of the 19th International World Wide Web Conference, pp. 651–660. ACM, Raleigh, NC, USA (2010)
Chodorow, K.: MongoDB: the definitive guide. O’Reilly (2013)
Cigarrán-Recuero, J., Gayoso-Cabada, J., Rodríguez-Artacho, M., Romero-López, D., Sarasa-Cabezuelo, A., Sierra, J.-L.: Assessing semantic annotation activities with formal concept analysis. Expert Syst. Appl. 44(11), 5495–5508 (2014)
Culpepper, J.-S., Moffat, A.: Efficient set intersection for inverted indexing. ACM Trans. Inf. Syst. 29(1), article 1 (2010)
Godin, R., Saunders, G.: Lattice model of browsable data space. Inf. Sci. 40(2), 89–116 (1986)
Grainger, T., Potter, T.: Solr in Action. Manning Publications (2014)
Greene, G.-J., Dunaiski, M., Fischer, B.: Browsing publication data using tag clouds over concept lattices constructed by key-phrase extraction. In: Proceedings of Russian and South African Workshop on Knowledge Discovery Techniques Based on Formal Concept Analysis, pp. 10–22. CEUR, Stellenbosch, South Africa (2015)
Greene, G.-J., Fischer, B.: Interactive tag cloud visualization of software version control repositories. In: Proceedings of the 3rd IEEE Working Conference on Software Visualization, pp. 56–65. IEEE, Raleigh, NC, USA (2015)
Greene, G.-J.: A Generic framework for concept-based exploration of semi-structured software engineering data. In: Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, pp. 894–897. ACM, Lincoln, Nebraska, USA (2015)
Hildebrand, M., van Ossenbruggen, J., Hardman, L.: /facet: a browser for heterogeneous semantic web repositories. In: Proceedings of the 5th International Semantic Web Conference, pp. 272–285. Springer, Athens, GA, USA (2006)
Huang, J.-W., Chen, K.-Y., Chen, Y.-C., Yang, K.-N., Hwang, S., Huang, W.-C.: A novel spatial tag cloud using multi-level clustering. J. Inf. Sci. Eng. 30, 687–700 (2014)
Jain, A.-K., Murty, M.-N., Flynn, P.-J.: Data clustering: a review. ACM Comput. Surv. 31(3), 264–323 (1999)
Kriegel, H.-P.: Performance comparison of index structures for multi-key retrieval. In: Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 186–196. ACM, Boston, MA (1984)
Kuznetsov, S.: On computing the size of a lattice and related decision problems. Order 18(4), 313–321 (2001)
Li, R., Shenghua, B., Fei, B., Su, Z., Yu, Y.: Towards effective browsing of large scale social annotations. In: Proceedings of 16th International World Wide Web Conference, pp. 943–952. ACM, Banff, Alberta, Canada (2007)
McCandless, M., Hatcher, E., Gospodnetic, O.: Lucene in Action, 2nd edn. Manning Publications (2010)
Perugini, S.: Supporting multiple paths to objects in information hierarchies: faceted classification, faceted search, and symbolic links. Inf. Proc. Manag. 46(1), 22–43 (2010)
Radelaar, J., Boor, A.-J., Vandic, D., van Dam, J.-W., Fasinca, F.: Improving search and exploration in tag spaces using automated tag clustering. J. Web Eng. 13(3–4), 277–301 (2014)
Sarmah, A.-K., Hazarika, S.-M., Sinha, S.-K.: Formal concept analysis: current trends and directions. Artif. Intell. Rev. 44(1), 47–86 (2015)
Schraefel, M.-C., Smith, D-A., Owens, A., Russell, A., Harris, C., Wilson, M.: The evolving mSpace platform: leveraging the semantic web on the trail of the memex. In: Proceedings of the 16th Conference on Hypertext, pp. 174–183. ACM, Salzburg, Austria (2005)
Schraefel, M.-C., Wilson, M., Russell, A., Smith, D.-A.: mSpace: improving information access to multimedia domains with multimodal exploratory search. Commun. ACM 49(4), 47–49 (2006)
Sierra, J.-L., Fernández-Valmayor, A., Guinea, M., Hernanz, H.: From research resources to learning objects: process model and virtualization experiences. Education. Tech. Soc. 9(3), 56–68 (2006)
Sierra, J.-L., Fernández-Valmayor, A.: Tagging learning objects with evolving metadata schemas. In: Proceedings of the 8th IEEE International Conference on Advanced Learning Technologies, pp. 829–833. IEEE. Santander, Spain (2008)
Smith, D.-A., Owens, A., Schraefel, M.-C., Sinclair, P., Max, P.-A., Wilson, A., Russell, A., Martinez, K., Lewis, P.: Challenges in supporting faceted semantic browsing of multimedia collections. In: Proceedings of the 2nd International Conference on Semantics and Digital Media Technologies, pp. 280–283. Springer, Genoa, Italy (2007)
Tunkelang, D.: Faceted Search. Morgan & Claypool Publishers (2009)
Uddin, M.-N., Janecek, P.: The implementation of faceted classification in web site searching and browsing. Online Inf. Rev. 31(2), 218–233 (2007)
Way, T., Eklund, P.: Social Tagging for digital libraries using formal concept analysis. In: Proceedings of the 17th International Conference on Concept Lattices and their Applications, pp. 139–150. Sevilla, Spain (2010)
Wei, B., Liu, J., Zheng, Q.: A survey of faceted search. J. Web Eng. 12(1–2), 41–64 (2013)
Yee, K.-P., Swearingen, K., Li, K., Hearst, M.: Faceted metadata for image search and browsing. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 401–408. ACM, Fort Lauderdale, Florida, USA (2003)
Yitzhak, O.-B., Golbandi, N., Har’El, N., et al.: Beyond basic faceted search. In: Proceedings of the 2008 International Conference on Web Search and Data Mining, pp. 33–44. ACM, Stanford, CA, USA (2008)
Zhang, Z., Li, W., Gurrin, C., Smeaton, A.-F.: Faceted navigation for browsing large video collection. In: Proceedings of the 22nd International Conference on Multimedia Modelling, pp. 412–417. Springer, Miami, USA (2016)
Zobel, J., Moffat, A.: Inverted files for text search engines. ACM Comput. Surv. 38(2), article 6 (2006)
Acknowledgements
This work has been supported by the BBVA Foundation (research grant HUM14_251) and by the Spanish Ministry of Economy and Competitiveness (research grant TIN2014-52010-R). The Chasqui repository was created and is maintained by Prof. Mercedes Guinea (currently a researcher at “El Caño” Foundation). The thesaurus used as an example in this work is adapted from Chasqui’s cataloguing schema. Chasqui’s original software infrastructure was developed by Alfredo Fernández-Valmayor (currently also at “El Caño” Foundation).
Copyright information
© 2017 Springer International Publishing Switzerland
About this paper
Cite this paper
Gayoso-Cabada, J., Rodríguez-Cerezo, D., Sierra, JL. (2017). Browsing Digital Collections with Reconfigurable Faceted Thesauri. In: Goluchowski, J., Pankowska, M., Linger, H., Barry, C., Lang, M., Schneider, C. (eds) Complexity in Information Systems Development. Lecture Notes in Information Systems and Organisation, vol 22. Springer, Cham. https://doi.org/10.1007/978-3-319-52593-8_5
DOI: https://doi.org/10.1007/978-3-319-52593-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52592-1
Online ISBN: 978-3-319-52593-8
eBook Packages: Business and Management, Business and Management (R0)