Cross-Domain Effects on Parse Selection for Precision Grammars

Research on Language and Computation

Abstract

We examine the impact of domain on parse selection accuracy in the context of precision HPSG parsing with the English Resource Grammar, using two training corpora and four test corpora and evaluating with exact tree matches as well as dependency F-scores. In addition to determining the relative impact of in- vs. cross-domain parse selection training on parser performance, we propose strategies to avoid the cross-domain performance penalty when only limited in-domain data is available. Our work supports previous research showing that in-domain training data significantly improves parse selection accuracy and provides greater parser accuracy than an out-of-domain training corpus of the same size; we verify experimentally that this also holds for a handcrafted grammar, observing a 10–16% improvement in exact match and a 5–6% improvement in dependency F-score from using a domain-matched training corpus. We also find that parse selection accuracy can be improved considerably by constructing even small-scale in-domain treebanks and learning parse selection models over both in-domain and out-of-domain data. Naively adding an 11,000-token in-domain training corpus boosts dependency F-score by 2–3% over using solely out-of-domain data. We investigate more sophisticated strategies for combining data from these sources to train models: weighted linear interpolation between the single-domain models, and training a model from the combined data, optionally duplicating the smaller corpus to give it a higher weighting. The most successful strategy is training a monolithic model after duplicating the smaller corpus, which gives an improvement over a range of weightings, but we also show that the optimal value for these parameters can be estimated on a case-by-case basis using a cross-validation strategy. This domain-tuning strategy provides a further performance improvement of up to 2.3% for exact match and 0.9% for dependency F-score compared to the naive combination strategy using the same data.
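
The combination strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which quantities are interpolated, so the sketch assumes the per-parse scores of the two single-domain models are mixed, and every name in it (duplicate_and_combine, interpolate_scores, select_parse, choose_duplication_factor, and the train_and_score callback) is hypothetical.

```python
from typing import Callable, List, Sequence


def duplicate_and_combine(in_domain: Sequence, out_of_domain: Sequence, k: int) -> List:
    """Build one training set, repeating the (smaller) in-domain corpus k times."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return list(out_of_domain) + list(in_domain) * k


def interpolate_scores(in_scores: Sequence[float], out_scores: Sequence[float],
                       lam: float) -> List[float]:
    """Weighted linear interpolation of two models' scores for the same candidate parses.

    Assumption: scores are interpolated per candidate; lam weights the in-domain model.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(in_scores) != len(out_scores):
        raise ValueError("both models must score the same candidates")
    return [lam * a + (1.0 - lam) * b for a, b in zip(in_scores, out_scores)]


def select_parse(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring candidate parse."""
    return max(range(len(scores)), key=scores.__getitem__)


def choose_duplication_factor(in_domain: Sequence, out_of_domain: Sequence,
                              candidates: Sequence[int], n_folds: int,
                              train_and_score: Callable[[List, List], float]) -> int:
    """Pick k by cross-validation over the in-domain corpus.

    train_and_score(train, held_out) is a hypothetical callback that trains a
    model on `train` and returns its accuracy on `held_out`.
    """
    best_k, best_mean = candidates[0], float("-inf")
    for k in candidates:
        total = 0.0
        for fold in range(n_folds):
            # Hold out every n_folds-th in-domain item, train on the rest.
            held_out = [x for i, x in enumerate(in_domain) if i % n_folds == fold]
            train_in = [x for i, x in enumerate(in_domain) if i % n_folds != fold]
            total += train_and_score(duplicate_and_combine(train_in, out_of_domain, k), held_out)
        mean = total / n_folds
        if mean > best_mean:
            best_k, best_mean = k, mean
    return best_k
```

In this sketch the duplication factor k plays the role of the corpus weighting mentioned in the abstract, and choose_duplication_factor shows one way such a parameter might be estimated by cross-validation when only a small in-domain treebank is available.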

Author information

Corresponding author

Correspondence to Andrew MacKinlay.

About this article

Cite this article

MacKinlay, A., Dridan, R., Flickinger, D. et al. Cross-Domain Effects on Parse Selection for Precision Grammars. Res on Lang and Comput 8, 299–340 (2010). https://doi.org/10.1007/s11168-011-9080-7

  • DOI: https://doi.org/10.1007/s11168-011-9080-7
