Elsevier

Acta Psychologica

Volume 150, July 2014, Pages 80-84
Norms of age of acquisition and concreteness for 30,000 Dutch words

https://doi.org/10.1016/j.actpsy.2014.04.010

Highlights

  • We collected age-of-acquisition ratings for 30,000 Dutch words.

  • We validated these ratings against existing, smaller databases.

  • These ratings can be used in word processing experiments.

Abstract

Word processing studies increasingly make use of regression analyses based on large numbers of stimuli (the so-called megastudy approach) rather than small factorial experiments. This requires the availability of word features for many words. Following similar studies in English, we present and validate ratings of age of acquisition and concreteness for 30,000 Dutch words. These include nearly all lemmas language researchers are likely to be interested in. The ratings are freely available for research purposes.

Introduction

Research on word recognition is rapidly changing. Authors realise that the traditional small-scale factorial experiments are not the best approach because they lack power (Keuleers, Diependaele, & Brysbaert, 2010), do not give information about the full range of variables (Kuperman, Estes, Brysbaert, & Warriner, in press), and are open to experimenter bias in stimulus selection (Forster, 2000; Kuperman, 2014). A better approach is to treat word recognition studies not as experiments in which word features can be manipulated but as correlational studies in which covariations between word features and word processing performance can be assessed (Baayen et al., 2006; Balota et al., 2004; Lewis & Vladeanu, 2006). As a result, researchers have collected word processing times for thousands of words in so-called lexicon projects. Thus far, this has been done for American English (Balota et al., 2007), Dutch (Keuleers et al., 2010), Malay (Yap, Rickard Liow, Jalil, & Faizal, 2010), French (Ferrand et al., 2010), British English (Keuleers, Lacey, Rastle, & Brysbaert, 2012), and Chinese (Sze, Rickard Liow, & Yap, in press).

At the same time, optimal use of the lexicon projects requires information about the word features for (ideally) the entire database. This is easy for word variables that can be calculated on the basis of the words themselves or corpus analyses, such as word length, various measures of word frequency, and similarity to other words, but it requires a major investment for variables that are based on subjective ratings. These are variables like age of acquisition, concreteness, imageability, familiarity, valence, and arousal. They are investigated for their own sake or must be controlled for in order not to confound the effect of the variable of interest.

The situation is rapidly improving for the English language, where age-of-acquisition ratings have been collected for 30,000 words (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012), affective ratings for 14,000 words (Warriner, Kuperman, & Brysbaert, 2013), and concreteness ratings for 40,000 words (Brysbaert, Warriner, & Kuperman, in press). The main reason for this improvement is that in English, one can make use of Amazon Mechanical Turk, a service created by the company Amazon where Internet users can earn a small amount of money by doing so-called Human Intelligence Tasks. These are usually short rating or translation tasks. Because there are several tens of thousands of Mechanical Turk workers, large-scale rating studies can be done in a matter of weeks at an affordable price. In addition, if some basic controls are included, the ratings are as reliable and valid as those collected under traditional laboratory circumstances (for evidence, see the references above).

The situation is different for other languages because Amazon Mechanical Turk is based in the United States and has far fewer users/workers in languages other than English or Spanish (payment is also made in dollars via the American branch of the company). This makes Amazon Mechanical Turk a less attractive tool for collecting data in languages such as Dutch. However, Moors et al. (2013) recently proposed an alternative solution. They showed that asking a limited group of participants to rate a list of 4,300 words returns the same outcome as the traditional approach of asking a large number of participants to rate 300 words each. The costs for paying the participants are the same, but the logistics become much more feasible. Also, participants are more interested and motivated when they can earn more money (because of the larger time investment).

Arguably, the two most important word norms based on subjective ratings are age of acquisition (AoA) and concreteness. AoA refers to the age at which a word has been acquired and explains some 5% of variance in lexical decision times after the effects of word frequency, word length, and similarity to other words have been partialed out (Kuperman et al., 2012). This share is even larger when a suboptimal word frequency measure is used (Brysbaert & Cortese, 2011). The impact of AoA is due to the fact that the order of acquisition is an important variable in the organisation of the mental lexicon and the semantic system (Bai et al., 2013; Catling et al., 2013; Cortese & Schock, 2013; Cuetos et al., 2010; Palmer & Havelka, 2010) and to the fact that AoA is an important proxy for estimating the cumulative frequency with which people have come across words in their life (Lete & Bonin, 2013).
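The idea of variance explained "after the effects of other variables have been partialed out" can be illustrated with a hierarchical regression: fit the control variables first, add the predictor of interest, and take the difference in R². The following is a minimal sketch under stated assumptions, not the analysis used in the studies cited above. The data are synthetic and generated only for illustration, and the names `r_squared` and `incremental_r_squared` are hypothetical helpers.

```python
import numpy as np


def r_squared(X, y):
    """Return R^2 of an ordinary least-squares fit of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def incremental_r_squared(controls, predictor, y):
    """Extra R^2 gained by adding `predictor` to a model that already has `controls`."""
    base = r_squared(controls, y)
    full = r_squared(np.column_stack([controls, predictor]), y)
    return full - base


# Synthetic, standardised data (assumption: invented for illustration only).
rng = np.random.default_rng(0)
n = 2000
log_freq = rng.normal(size=n)                      # stand-in for log word frequency
length = rng.normal(size=n)                        # stand-in for word length
aoa = -0.6 * log_freq + 0.8 * rng.normal(size=n)   # AoA correlates with frequency
rt = -0.5 * log_freq + 0.2 * length + 0.3 * aoa + rng.normal(size=n)

controls = np.column_stack([log_freq, length])
delta = incremental_r_squared(controls, aoa, rt)
print(f"Extra variance explained by AoA beyond the controls: {delta:.3f}")
```

Because AoA and frequency are correlated, only the part of AoA that is not shared with the control variables contributes to this increment, which is why the choice of frequency measure affects the size of the AoA effect.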

Concreteness evaluates the degree to which a concept denoted by a word refers to a perceptible entity. It has been an important variable in memory research ever since Paivio formulated his dual-coding theory (Paivio, 1971, 2013). According to this theory, concrete words are easier to remember than abstract words because they activate perceptual memory codes in addition to verbal codes. The variable gained extra interest within the embodied view of cognition (Barsalou, 1999; Fischer & Zwaan, 2008; Wilson, 2002), particularly after it was established that words referring to easily perceptible entities co-activate the brain regions involved in the perception of those entities, and that action-related words co-activate the motor cortex involved in executing the actions. On the basis of these findings, Vigliocco, Vinson, Lewis, and Garrett (2004) (see also Andrews, Vigliocco, & Vinson, 2009) presented a semantic theory, according to which the meaning of concepts depends on experiential and language-based connotations to different degrees. Some words are mainly learned on the basis of direct experiences; others are mostly used in text and discourse.

Concreteness is also widely studied in psycholinguistics. A few recently examined questions illustrate this. Are there hemispheric differences in the processing of concrete and abstract words (Oliveira, Perea, Ladera, & Gamito, 2013)? Does concreteness affect bilingual and monolingual word processing (Barber et al., 2013; Connell & Lynott, 2012; Gianico-Relyea & Altarriba, 2012; Kaushanskaya & Rechtzigel, 2012)? Do concrete and abstract words differ in affective connotation (Ferré et al., 2012; Kousta et al., 2011)? Do neuropsychological patients differ in the comprehension of concrete and abstract words (Loiselle et al., 2012)?

Imageability and familiarity are less interesting variables because imageability is highly correlated with concreteness and seems to stress the visual modality too much (Connell & Lynott, 2012). The importance of familiarity is likely to be minimal, once one has a good word frequency measure and information about AoA (Brysbaert & Cortese, 2011). Valence and arousal have recently gained interest (e.g., Kuperman et al., in press) but could not be included in the present study (see, however, Moors et al., 2013, who collected values for 4,300 words).

AoA ratings were available for a few thousand words in Dutch. Ghyselinck, De Moor, and Brysbaert (2000) collected norms for some 3,000 short words. Ghyselinck, Custers, and Brysbaert (2003) collected ratings for a further 2,300 words from frequently used semantic categories (such as clothes, animals, utensils, and birds). Finally, Moors et al. (2013) collected ratings for 4,300 words. To our knowledge, there are no large collections of concreteness ratings, but imageability norms were collected by Van Loon-Vervoorn (1985) for 6,100 words. The correlations between concreteness and imageability reported in the literature range from 0.78 to 0.85 (Friendly et al., 1982; Gilhooly & Logie, 1980; Paivio et al., 1968).
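Comparing two sets of norms, as in the correlations between concreteness and imageability cited above, amounts to computing a Pearson correlation over the words that both lists share. The sketch below shows one way this could be done. The function name `pearson_on_overlap` is hypothetical, and the toy ratings are invented for illustration; they are not taken from any of the norm sets mentioned here.

```python
import math


def pearson_on_overlap(norms_a, norms_b):
    """Return (number of shared words, Pearson r) for two word -> rating dicts."""
    shared = sorted(set(norms_a) & set(norms_b))
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared words")
    xs = [norms_a[w] for w in shared]
    ys = [norms_b[w] for w in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("ratings are constant in one list; r is undefined")
    return n, sxy / math.sqrt(sxx * syy)


# Invented toy ratings (assumption: illustration only, not real norms).
concreteness = {"hond": 4.9, "tafel": 4.8, "idee": 1.5, "vrijheid": 1.3, "regen": 4.2}
imageability = {"hond": 6.7, "tafel": 6.5, "idee": 2.4, "vrijheid": 3.0, "zon": 6.8}

n_shared, r = pearson_on_overlap(concreteness, imageability)
print(f"{n_shared} shared words, r = {r:.2f}")
```

Words that appear in only one list (here "regen" and "zon") are ignored, so the size of the overlap should be reported alongside the correlation.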

Below, we describe the collection of concreteness and AoA ratings for 30,000 Dutch words.

Stimulus materials

On the basis of dictionaries and corpus analyses, we selected a list of 30,000 ‘interesting’ words. Interesting was defined in terms of the following:

  1. Words are lemmas (unless the inflected form is highly frequent; e.g., ‘eyes’ in addition to ‘eye’).

  2. No proper nouns are used.

  3. The words are likely to be known to the participants.

  4. No long, transparent compound words are included. Dutch is a language in which compounds do not have spaces, meaning that hundreds of thousands of words can be made by

Results and discussion

No students had to be excluded because of bad data. The intraclass correlation coefficient of reliability for the concreteness ratings was 0.92 (confidence interval 0.91–0.93; there were no noteworthy differences between the lists). To check the validity of the ratings, we correlated them with the imageability ratings collected by Van Loon-Vervoorn (1985). There was an overlap of 5,683 words between both lists. The correlation amounted to 0.76, very similar to the values reported for the

Availability

Two supplemental Excel files contain all the information discussed in the present study about AoA and concreteness ratings for 30,000 Dutch words. In addition, there is a third file summarising all information collected on AoA ratings in Dutch thus far. These are the AoA ratings from Ghyselinck et al. (2000), Ghyselinck et al. (2003), Moors et al. (2013), and the present study. While combining this information, we noticed that the ratings of Ghyselinck et al. (2000) and Ghyselinck et al. (2003) not

Acknowledgement

This research was made possible by the grant ‘Wetenschappelijke Onderzoeksgemeenschap Taalverwerking’ from the FWO Flanders.

References (46)

  • D.A. Balota et al. (2004). Visual word recognition for single syllable words. Journal of Experimental Psychology: General.
  • D.A. Balota et al. (2007). The English Lexicon Project. Behavior Research Methods.
  • L.W. Barsalou (1999). Perceptual symbol systems. Behavioral and Brain Sciences.
  • M. Brysbaert et al. (2011). Do the effects of subjective frequency and age of acquisition survive better word frequency norms? Quarterly Journal of Experimental Psychology.
  • M. Brysbaert et al. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods.
  • J. Catling et al. (2013). Age-of-acquisition effects in novel picture naming: A laboratory analogue. Quarterly Journal of Experimental Psychology.
  • M.J. Cortese et al. (2013). Imageability and age of acquisition effects in disyllabic word recognition. Quarterly Journal of Experimental Psychology.
  • L. Ferrand et al. (2010). The French Lexicon Project: Lexical decision data for 38,840 French words and 38,840 pseudowords. Behavior Research Methods.
  • P. Ferré et al. (2012). Affective norms for 380 Spanish words belonging to three different semantic categories. Behavior Research Methods.
  • M.H. Fischer et al. (2008). Embodied language: A review of the role of the motor system in language comprehension. The Quarterly Journal of Experimental Psychology.
  • K.I. Forster (2000). The potential for experimenter bias effects in word recognition experiments. Memory & Cognition.
  • M. Friendly et al. (1982). The Toronto word pool: Norms for imagery, concreteness, orthographic variables, and grammatical usage for 1,080 words. Behavior Research Methods & Instrumentation.
  • M. Ghyselinck et al. (2003). Age-of-acquisition ratings for 2332 Dutch words from 49 different semantic categories. Psychologica Belgica.