In the Introduction to his book Lexical Grammar, Selivan explains what lexical chunks are, why they’re so important, and how they should be used in ELT. I will argue that the explanation is confused and simplistic; that the works by scholars advocating a usage-based theory of language learning are misrepresented and misinterpreted; that Selivan fails to give any coherent account of how chunks are stored, or what part they play in learning an L2; and that he fails to make the case for using chunks “to drive grammar acquisition”.
What is a chunk?
Selivan says “A chunk is a group of words customarily found together”. He gives these examples:
- fixed expressions, e.g., as a matter of fact,
- combinations of words that allow variation, such as see you later/soon/tomorrow,
- collocations, such as pursue a career; a scenic route; a chance encounter,
- stems that can be used to build various sentences in English, such as If I were you; It’s been a while since; It took me a long time to,
- full sentences, such as It’s none of your business; There’s no doubt about it; What are you gonna do?
Given the wide range of items that count as chunks, Selivan asks: “Is everything chunks, then?” He answers:
Yes, to a large extent. Evidence suggests that our mental lexicon does not consist of individual words but chunks. Chunks …. are stored in the brain as single units. Research shows that about 50–80% of native-speaker discourse consists of recurring multi-word combinations (Altenberg, 1987; Erman and Warren, 2000).
We may quickly note that not everything is chunks, and that the claim made for the research on recurring multi-word combinations is exaggerated. Selivan goes on to say that chunks “blur the boundary between vocabulary and grammar” because of “the tendency of certain words to occur with certain grammatical structures and vice versa”. True enough.
In the next section, Selivan asks “Is there more to knowing a language than just reproducing chunks we have encountered?”, and replies by giving a summary of Hoey’s “new theory of language acquisition”.
Hoey’s Lexical Priming
Here is Selivan’s summary:
Hoey (2005) argues that as we acquire new words we take a subconscious note of words that occur alongside (collocation) and of any associated grammatical patterns (colligation). Through multiple encounters with a new word, we become primed to associate it with these recurring elements. According to Hoey’s theory, our brain is like a giant corpus where each word is accompanied by mental usage notes. Language production is not a matter of simply combining words and rules but rather a retrieval of the language we are primed for, i.e. the patterns and combinations we have previously seen or heard. …. The theory explains why, when producing language, our first port of call is our mental store of pre-fabricated chunks. However, this does not completely negate the role of generative grammar. Knowledge of grammar rules is still important to fine-tune chunks so that they fit new contexts. Because we are only primed to repeat language we have encountered in particular contexts, if we find ourselves in a new communicative situation, we might not have any ready-made language to draw on. This is when grammar knowledge can help us produce completely new sentences.
This is actually a very poor account of Hoey’s theory, and there are several things wrong with it, but let’s focus on storing chunks. According to Selivan, Hoey says that our brain is like a giant corpus where each word is accompanied by mental usage notes. But how does that fit with the claim that our brain stores chunks as whole single units? As Tremblay, Derwing and Libben (2007) point out, ‘stored’ could mean one of two things. The words making up the chunk could be seen as individual items which are linked together through knowing that they go together. So the chunk in the middle of would be [in -> the -> middle -> of]. On the other hand, ‘stored’ could mean that the chunk had no internal structure and would look something like [inthemiddleof]. Selivan repeatedly says that chunks are stored as single holistic units, but if so, then these units are indivisible chunks with no internal parts – they’re not linked together through knowing that they go together – and therefore they can’t be “teased apart”, or used as templates, or used to drive the process of grammar acquisition. We’ll return to this in a minute.
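The two readings of ‘stored’ can be made concrete with a small Python sketch. Everything in it is an illustrative assumption: the class names LinkedChunk and HolisticChunk and the substitute method are hypothetical, and nothing here is taken from Tremblay et al. or from Selivan. It models only the logical point that a unit with no internal parts offers no slot to vary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LinkedChunk:
    """Reading 1: separate words linked by the knowledge that they go together."""
    words: Tuple[str, ...]

    def substitute(self, position: int, new_word: str) -> "LinkedChunk":
        # The internal structure is visible, so a single slot can be swapped.
        parts = list(self.words)
        parts[position] = new_word
        return LinkedChunk(tuple(parts))

    def render(self) -> str:
        return " ".join(self.words)


@dataclass(frozen=True)
class HolisticChunk:
    """Reading 2: one opaque unit with no internal parts (hypothetical model)."""
    unit: str

    def render(self) -> str:
        return self.unit


linked = LinkedChunk(("in", "the", "middle", "of"))
holistic = HolisticChunk("inthemiddleof")

# Only the linked representation can serve as a template.
variant = linked.substitute(2, "centre")
print(variant.render())                  # in the centre of
print(hasattr(holistic, "substitute"))   # False
```

On this toy model, the holistic unit can be retrieved and repeated but not varied, which is why the two claims sit so awkwardly together.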
The confusion mounts when Selivan goes on to talk about “the role of generative grammar” (sic) in allowing learners to “fine-tune chunks so that they fit new contexts”. Selivan suggests that we are only primed to repeat language which we have encountered in particular contexts, and that consequently “if we find ourselves in a new communicative situation, we might not have any ready-made language to draw on”. He goes on: “this is when grammar knowledge can help us produce completely new sentences”. Not only does this show a complete failure to understand Hoey’s theory, it also paints an unlikely picture of L2 learners’ dichotomous behaviour, suggesting that when they’re in “familiar contexts” they repeat language they’ve already encountered, whereas when they’re confronted with “new communicative situations”, they must resort to grammar knowledge in order to produce completely new sentences.
Chunks in Language Acquisition
According to Selivan, chunks allow learners to produce language such as I haven’t seen you for ages “when their own grammatical competence doesn’t yet allow them to generate new sentences in the present perfect”. But they do more – much more – than that: memorised chunks also “drive the process of grammar acquisition”. The argument is (and I’m re-arranging the original text a bit) that children acquiring their L1 start out by recording pieces of language encountered during their day-to-day interaction and then repeating words (e.g. dog) or multi-word phrases (e.g. Let me do it, Where’s the ball?). They then slightly modify the encountered language to suit various communicative needs:
- Where’s the ball?
- Where’s the dog?
- Where’s Daddy?
Only later, says Selivan, do abstract categories and schemas, such as the subject–verb–object word order or inversion in interrogatives, begin to form “from these specific instances of language use”.
But this is not how children learn their first language. O’Grady (2005) explains how children use a collection of abilities to learn language. They begin by distinguishing speech sounds from other types of sounds and from each other. They then use the ability to produce speech sounds in an intelligible manner, stringing them together to form words and sentences. For words, there is first of all the ability to pick the building blocks of language out of the speech stream by noticing recurring stress patterns (like the strong–weak pattern of English) and which combinations of consonants are most likely to occur at word boundaries. For meaning, they have the ability to “fast map” – to learn the meaning of a word on the basis of a single exposure to its use, using linguistic clues to infer (for instance) that a zav must be a thing, but that Zav has to be a person. For sentences, there’s the ability to note patterns of particular types (subject–verb–object constructions, passives, negatives, relative clauses), to see how they are built, and to figure out what they are used for.
Most relevant to Selivan’s central claim for chunks is O’Grady’s account of the beginning of an infant’s language learning. Right at the start, children pick what they can out of the stream of speech that flows past their ears. They often pick out single words, but sometimes they get larger bites of speech – like what’s that? (pronounced whadat) or give me (pronounced gimme). O’Grady says there’s a simple test to decide whether a particular utterance should be thought of as a multi-word sentence or an indivisible chunk with no internal parts: if there are multiple words and the child knows it, they should show up elsewhere in his speech – either on their own or in other combinations. That’s what happens in adult speech, where the three words in What’s that? can each be used in other sentences as well.
But in child language, what’s that is almost certainly an indivisible chunk. There’s no indication that the different parts of an utterance have an independent existence of their own, and there’s no evidence that they get “slightly modified to suit various communicative needs” in the way Selivan suggests.
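O’Grady’s test can be sketched as a simple check over a speaker’s recorded utterances. The sketch below is illustrative only: the function name, the toy data, and the decision to treat each utterance as already split into candidate words are all assumptions made here, not part of O’Grady’s account. It also simplifies the test by calling an utterance a chunk only when none of its candidate words turns up anywhere else.

```python
from typing import List, Sequence


def is_indivisible_chunk(candidate: Sequence[str],
                         other_utterances: List[Sequence[str]]) -> bool:
    """Return True if none of the candidate's words appears in any other utterance.

    Rationale (as described in the text): if the speaker knew the utterance
    contained several words, those words should show up elsewhere, on their
    own or in other combinations.
    """
    seen_elsewhere = {word for utt in other_utterances for word in utt}
    return not any(word in seen_elsewhere for word in candidate)


# Hypothetical toy samples, invented purely for illustration.
child_other = [("daddy",), ("ball",), ("more", "juice")]
adult_other = [("what's", "your", "name"), ("that", "is", "mine")]

whats_that = ("what's", "that")
print(is_indivisible_chunk(whats_that, child_other))  # True
print(is_indivisible_chunk(whats_that, adult_other))  # False
```

With the invented child sample, neither word recurs, so the utterance is classed as a chunk; with the adult sample, both words recur in other combinations.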
O’Grady suggests that children have two different styles of language learning.
1. Some children adopt an analytic style, which breaks speech down into its smallest component parts. Short, clearly articulated, one-word utterances characterise the early stages of their language learning. They like to name people (Daddy, Mummy) and objects (Kitty, car) and they use simple words like up, hot, hungry to describe how they feel and what they want.
2. Other children take a different approach. They memorize and produce relatively large chunks of speech (often poorly articulated) that correspond to entire sequences of words in the adult language, such as Whasdat?, dunno, donwanna, gimmedat, lookadat. This is called the gestalt style of language learning.
No child employs a completely analytic strategy or a purely gestalt style. Rather, children exhibit tendencies in one direction or the other. Whatever direction they tend towards, all children eventually become competent language users, and to suggest, as Selivan does, that this process can be described – and even explained – by saying that they unpack chunks that are stored as holistic units in the brain is not just absurdly simplistic, it’s also so confused as to be preposterous.
Selivan argues that the SLA process is very similar to L1 learning: L2 learners use memorized chunks to drive “the process of grammar acquisition” by “extrapolating grammar rules” from them. Selivan cites SLA studies which show that new grammatical structures are often learned initially as unanalysed wholes and later on broken down for analysis. For example, learners may learn the going to future form as a chunk, such as I am going to write about for writing essays (Bardovi-Harlig, 2002), before adapting the structure to include other verbs: I am going to take/try/make, etc. Holistically stored chunks gradually evolve into more productive patterns as learners tease them apart and use them as templates to create new sentences:
- I haven’t seen you for ages.
- I haven’t seen her for ages.
- I haven’t seen him since high school.
- I haven’t heard from her for ages.
Here we go again! Holistically stored chunks by definition can’t gradually evolve into more productive patterns. While there’s every reason to think that L2 learners unpack chunks, Selivan fails to cite the relevant literature, and fails to situate the process of unpacking, analysing and re-packing certain types of chunks in any coherent theory of SLA.
Things get worse. Selivan continues his discussion by raising the question:
Why is it that while children effortlessly acquire their mother tongue from examples using their pattern-finding ability, the process of L2 acquisition is often so laborious, with many learners never reaching native-like performance?
One of the main reasons, says Selivan, is lack of exposure – L1 proficiency is the result of thousands of hours of exposure to rich language input, while the exposure L2 learners receive is often not sufficient to enable them to identify patterns from specific examples. But even when there is plenty of input, Selivan admits that there are additional factors which may hinder the process of L2 acquisition. He focuses on salience, the lack of which, he says, may explain why certain grammatical forms are notoriously difficult for learners to acquire. Selivan points out that many grammatical cues in English (for example tense marking, the third person singular -s and articles) are not salient. And grammatical words tend to be unstressed in English, making them more difficult to perceive aurally. We stress know in I don’t know, not don’t, which results in something sounding like I dunno in spoken English. We stress taken in You should have taken an umbrella, which is reduced to You should’ve taken an umbrella, or even You shoulda taken an umbrella.
There are a number of problems with this account. First, it relies on a usage-based theory of language acquisition, which is not accepted by the majority of scholars working in the field of SLA. Selivan should at least respond to critics of his preferred theory, which is still in its infancy, does not accurately describe language learning, and does not explain how children acquire linguistic competence. I’ll just mention a few points from Gregg’s 2003 article on emergentism:
- Usage-based theories don’t tell us anything about children’s linguistic knowledge which comes about in the absence of exposure (i.e., a frequency of zero), including knowledge of what is not possible.
- While N. Ellis aptly points to infants’ ability to do statistical analyses of syllable frequency, he fails to acknowledge that those infants didn’t learn that ability. How do young children uniformly manage this task: why do they focus on syllable frequency (instead of some other information available in exposure), and how do they know what a syllable is in the first place, given crosslinguistic variation?
- How does usage-based theory account for studies showing early grammatical knowledge, in cases where input frequency could not possibly be appealed to?
- Regarding infant L1 learning, claims by Ellis and others that “learners need to have processed sufficient exemplars…” are either outright false, or else true only vacuously (if “sufficient” is taken to range from as low a figure as 1).
- “It is precisely because grammar rules have a deductive structure that one can have instantaneous learning, without the trial and error involved in connectionist learning. With the English past tense rule, one can instantly determine the past tense form of “zoop” without any prior experience of that verb, let alone of “zooped”…. If all we know is that John zoops wugs, then we know instantaneously that John zoops, that he might have zooped yesterday and may zoop tomorrow, that he is a wug-zooper who engages in wug-zooping, that whereas John zoops, two wug-zoopers zoop, that if he’s a Canadian wug-zooper he’s either a Canadian or a zooper of Canadian wugs (or both), etc. We know all this without learning it, without even knowing what “wug” and “zoop” mean.” (Gregg, 2003, p. 111).
Second, as already stated, Selivan doesn’t explain how “holistically stored chunks” can “evolve into more productive patterns as learners tease them apart and use them as templates to create new sentences”. In order to be used in this way, the chunks need to be better defined, and the way in which they’re stored and retrieved has to be properly explained.
Third, Selivan fails to grasp what usage-based theory has to say about associative learning or about differences between L1 and L2 learning. His discussion of salience is completely out of place in a simplistic model which sees language learning as a process where you start by memorising chunks, then, when the occasion demands, tease them apart and use them as templates to create new sentences, and thus learn grammatical rules. Such an account doesn’t accurately describe any usage-based theory of learning, and it doesn’t explain why salience is a problem for L2 learners of English but not for infant L1 learners of English.
In brief, Selivan misrepresents and misinterprets Hoey, Tomasello, and Ellis; he makes no attempt to address the criticisms made of usage-based theories; he fails to explain the enormous disparities between the results of L1 and L2 learning; and he fails to give any coherent account of what chunks are, or what part they play in learning an L2. There is little to recommend his account as an explanation of how people learn an L2.
Chunks in Language Teaching
In the final section, Selivan looks at chunks in language teaching. He argues that “the learning of new structures” should start off as gradual exposure to and accumulation of chunks containing the target structures. As the number of stored chunks grows, chunks exhibiting the same pattern will gradually feed into the grammar system. This is when grammatical competence with a particular structure begins to emerge. To speed up the process of chunk accumulation and pattern detection, chunks need to be taught explicitly. Here are some bits of the advice offered:
- Learners’ attention should be drawn to chunks containing certain grammatical structures. They can practise and learn the chunks lexically before moving on to any kind of grammar explanation, i.e. they should be encouraged to memorize before they analyse.
- Many classroom activities should focus on highlighting chunks in reading and listening input. Such receptive, awareness-raising activities can be gradually combined with more productive ones, where learners manipulate the chunks to fit different communicative situations and scenarios.
- Learners should be eased into new grammar areas through chunks. For example, Have you ever been to can be presented in the context of travel or holidays, without delving into a grammatical analysis of the present perfect. Similarly, Have you seen can be presented when discussing films in class. Start by getting learners to practise and memorize chunks containing a new grammatical structure, resisting the temptation to move too quickly into any grammar explanation.
- Getting learners to produce new language is an essential pedagogical activity. Using new grammatical structures, however partially or provisionally understood, promotes fluency and acquisition of these structures. It also allows learners to produce language which is structurally beyond their present level of competence. It is, therefore, the teacher’s role to encourage learners to incorporate new structures in their output and ‘push’ them beyond their comfort zone.
Let’s just pause here and look at that last one. Selivan suggests that teachers should get learners to produce new language which is structurally beyond their present level of competence by using new grammatical structures which they only partially understand. Does producing memorised chunks that have been stored in the brain as single units count as producing new language? If not, how are teachers to get learners to do it?
Rather than examine Selivan’s methodology in any detail, it’s enough to note that he thinks teachers of English as an L2 should
- continue to use coursebooks,
- continue to use a PPP methodology to present and then practise a sequence of formal elements of the language,
- get students to memorise chunks and then use those chunks as a way of easing into grammar teaching.
His version of ELT is thus subject to the same criticisms made of other types of synthetic syllabuses which are implemented using a PPP methodology. Still, there’s one particular problem that has to be faced, and that is, of course: Which chunks should be presented for memorisation and further work, and in what order? Given the impossibility of getting students to memorise the tens of thousands of chunks needed for fluent communication, how do you select and sequence the necessarily small fraction of chunks that will drive any particular course of ELT? Selivan doesn’t answer the question, which is hardly surprising, because there is no answer. If you select some chunks and then teach them in the way Selivan suggests, even if students actually learn them all, you won’t cover enough to get anywhere near the number known by competent English users. Now doesn’t that suggest that there’s some fatal flaw in the whole project, and that there are better ways of helping students to develop communicative competence?
References
Gregg, K. (2003) The state of emergentism in second language acquisition. Second Language Research, 19, 2.
Hoey, M. (2005) Lexical Priming: A New Theory of Words and Language. London: Routledge.
O’Grady, W. (2005) How Children Learn Language. Cambridge: CUP.
Selivan, L. (2018) Lexical Grammar. Cambridge: CUP.
Tremblay, Derwing and Libben (2007) Are lexical bundles stored and processed as single units? Proceedings of 23rd Linguistics Conference, Victoria, BC, Canada.