ABSTRACT
This paper describes LINGUA, an architecture for text processing in Bulgarian. It first outlines the pre-processing modules for tokenisation, sentence splitting, paragraph segmentation, part-of-speech tagging, clause chunking and noun phrase extraction. It then describes the anaphora resolution module in more detail. Evaluation results are reported for each processing task.
Shallow language processing architecture for Bulgarian