On understanding character-level models for representing morphology
Abstract
Morphology is the study of how words are composed of smaller units of meaning
(morphemes). It allows humans to create, memorize, and understand words in their
language. To process and understand human languages, we expect our computational
models to also learn morphology. Recent advances in neural network models provide
us with models that compose word representations from smaller units like word segments,
character n-grams, or characters. These so-called subword unit models do not
explicitly model morphology, yet they achieve impressive performance across many
multilingual NLP tasks, especially on languages with complex morphological processes.
This thesis aims to shed light on the following questions: (1) What do subword
unit models learn about morphology? (2) Do we still need prior knowledge about
morphology? (3) How do subword unit models interact with morphological typology?
First, we systematically compare various subword unit models and study their performance
across language typologies. We show that models based on characters are
particularly effective because they learn orthographic regularities that are consistent
with morphology. To understand which aspects of morphology are not captured by
these models, we compare them with an oracle with access to explicit morphological
analysis. We show that in the case of dependency parsing, character-level models
are still poor at representing words with ambiguous analyses. We then demonstrate
how explicit modeling of morphology is helpful in such cases. Finally, we study how
character-level models perform in low-resource, cross-lingual NLP scenarios and whether
they can facilitate cross-linguistic transfer of morphology across related languages.
While we show that cross-lingual character-level models can improve low-resource
NLP performance, our analysis suggests that this is mostly due to the structural
similarities between languages, and we do not yet find any strong evidence of cross-linguistic
transfer of morphology. This thesis presents a careful, in-depth study and
analysis of character-level models and their relation to morphology, providing insights
and future research directions for building morphologically aware computational NLP
models.