A Corpus-Based Investigation of Idiomatic Multiword Units

Posted on 2021-11-04, 19:50; authored by Grant, Lynn E

Idioms, a type of multiword unit (MWU), are defined as non-compositional: in general, they cannot be understood by adding together the meanings of the individual words that make up the MWU. Because of this, they present a particular challenge to students who speak English as a second or foreign language (ESL/EFL). It is that challenge, encountered as a teacher of second-language (L2) learners, which motivated this study.
The thesis had two main aims. In order to know how to teach idioms to ESL/EFL learners, we, as language teachers, need to know how to define and explain them. The first aim of the study was therefore either to find an English (L1) definition of an idiom which could clearly distinguish one type from another, and an idiom from a non-idiom, or to develop a new definition. As no such definition was found, a new one was put forward, dividing MWUs presently known as idioms into three new groups: core idioms, figuratives, and ONCEs (one non-compositional element). The L1 perspective was adopted for the definition because an L2 perspective would involve considerably more variables.
The second aim was to develop a comprehensive list of one of the three new groups, core idioms, and then to try to establish their frequency using a corpus search. A number of steps were taken to compile this list, involving an examination of several sources of written and spoken English. When the criteria established to define a core idiom (being both non-compositional and non-figurative) were strictly applied to the large collection of MWUs presently known as 'idioms', the figure was reduced to only 104 MWUs deemed to be either core idioms or 'borderline figuratives' and 'borderline ONCEs'. Next, the British National Corpus (BNC), a corpus of 100 million words, was searched for occurrences of these 104 core idioms and borderlines to establish their frequency.
The corpus search showed that none of the core idioms occurs frequently enough to be included among the 5,000 most frequent words of English. However, because the study was motivated by the desire to find a better way to teach idiomatic MWUs, a brief discussion followed, with suggestions for the teaching and learning of these units. Finally, some methodological implications were put forward, together with suggestions for further research that would advance the field of second-language acquisition (SLA) in relation to the learning of idiomatic MWUs.


Copyright Date


Date of Award



Te Herenga Waka—Victoria University of Wellington

Rights License

Author Retains Copyright

Degree Discipline

Applied Linguistics

Degree Grantor

Te Herenga Waka—Victoria University of Wellington

Degree Level


Degree Name

Doctor of Philosophy

Victoria University of Wellington Item Type

Awarded Doctoral Thesis



Victoria University of Wellington School

School of Linguistics and Applied Language Studies


Bauer, Laurie; Nation, Paul