The method for automatically forming a rubricator of a full-text document collection

Automatic Documentation and Mathematical Linguistics

Abstract

The paper proposes a new method for automatically forming the rubricator of a full-text document collection. The method is applicable to multi-topic collections of scientific and technical documents, places no limit on document size, and does not require specialized information for formalizing document content.

Additional information

Original Russian Text © O.V. Peskova, 2008, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2, 2008, No. 10, pp. 25–32.

About this article

Cite this article

Peskova, O.V. The method for automatically forming a rubricator of a full-text document collection. Autom. Doc. Math. Linguist. 42, 248–256 (2008). https://doi.org/10.3103/S0005105508050026

  • DOI: https://doi.org/10.3103/S0005105508050026
