In a recent blog post I said:

“UB theories are increasingly fashionable, but I’m still not impressed with Construction Grammar, or with the claim that language learning can be explained by appeal to noticing regularities in the input”.

Scott Thornbury replied:

“Read Taylor’s The Mental Corpus (2012) and then say that!”.  

Well, I’ve had a second go at reading Taylor’s book, and here’s my reply, based partly on a review by Pieter Seuren.

Taylor’s thesis is that knowledge of a language can be conceptualized in terms of the metaphor of a “mental corpus”. Language knowledge is not knowledge of grammar rules, but rather “accumulated memories of previously encountered utterances and the generalizations which arise from them.” Everybody remembers everything in the language that they have ever heard or read. That’s to say, they remember everything they’ve ever encountered linguistically, including phonetics, the context of utterances, and precise semantic, morphological and syntactic form. Readers may well think that this has much in common with Hoey’s (2006) theory, and that’s not where the similarities end: like Hoey, Taylor offers no explanation of how people draw on this “literally fabulous” memory. Taylor says nothing about the formula of analysis in memory; nothing about the internal structure of that memory; nothing about how speakers actually draw on it; nothing about the kind of memory involved; “in short, nothing at all”.

Seuren argues that while there’s no doubt that speakers often fall back on chunks drawn holistically from memory, they also use rules. Thus, criticism of Chomsky’s UG is no argument against the viability of any research programme involving the notion of rules.

Taylor never considers the possibility of different models of algorithmically organized grammar. One obvious possibility is a grammar that converts semantic representations into well-formed surface structures, as was proposed during the 1970s in Generative Semantics. One specific instantiation of such a model is proposed in my book Semantic Syntax (Blackwell, Oxford 1996), … This model is totally non-Chomskyan, yet algorithmic and thus rule-based and completely formalized. But Taylor does not even consider the possibility of such a model.

Without endorsing Seuren’s model of grammar, or indeed his view of language, I think he makes a good point. He concludes:

Apart from the inherent incoherence of Taylor’s ‘language-as-corpus’ view, the book’s main fault is a conspicuous absence of actual arguments: it’s all rhetoric, easily shown up to be empty when one applies ordinary standards of sound reasoning. In this respect, it represents a retrograde development in linguistics, after the enormous methodological and empirical gains of the past half century.

In a tweet, Scott Thornbury points to Martin Hilbert’s (2014) more favourable review of Taylor’s book, but neither he nor I have a copy of it, so we’ll have to wait while Scott gets hold of it.

Meanwhile, let’s return to the usage-based (UB) theory claim that language learning can be explained by appeal to noticing regularities in the input, and that Construction Grammar is a good way of describing the regularities that are noticed in this way.

Dellar & Walkley, and Selivan, the chief proponents of “The Lexical Approach”, can hardly claim to be the brains of the UB outfit, since they all misrepresent UB theory to the point of travesty. But there are, of course, better attempts to describe and explain UB theory, most notably by Nick Ellis (see, for example, Ellis, 2019). Language can be described in terms of constructions (see Wulff & Ellis, 2018), and language acquisition can be explained by simple learning mechanisms, which boil down to detecting patterns in input: when exposed to language input, learners notice frequencies and discover language patterns. As Gregg (2003) points out, this amounts to the claim that language learning is associative learning.

When Ellis, for instance, speaks of ‘learners’ lifetime analysis of the distributional characteristics of the input’ or the ‘piecemeal learning of thousands of constructions and the frequency-biased abstraction  of regularities’, he’s talking of association in the standard empiricist sense.

Here we have to pause and look at empiricism, and its counterpart rationalism.

Empiricists claim that sense experience is the ultimate source of all our concepts and knowledge. There’s no such thing as “mind”; we’re born with a “tabula rasa”, our brain an empty vessel that gets filled with our experiences and so our knowledge is a posteriori, dependent wholly upon our history of sense experience. Skinner’s version of Behaviourism serves as a model. Language learning, like all learning, is a matter of associating one thing with another, with habit formation.      

Compare this to Chomsky’s view that what language learners get from their experiences of the world can’t explain their knowledge of their language: a better explanation is that learners have an innate knowledge of a universal grammar which captures the common deep structure of all natural languages. A set of innate capacities or dispositions enables and determines their language development. In my opinion, there’s no need to go back to the historical debate between rationalists like Descartes and empiricists like Locke: indeed, I think that these comparisons are often misleading, usually because those that use them to argue for UB theories give a very distorted description of Descartes, and fail to appreciate the full implications of adopting an empiricist view. What’s important is that the empiricism adopted by Nick Ellis, Tomasello and others today is a less strict version than the original: talk of mind and reason is not proscribed, although for them, the simpler the mechanisms employed to explain learning, the better.

Chomsky is the main target. Motivated by the desire to get rid of any “black box”, and of any appeal to inference to the best explanation when confronted by arguments about the poverty of the stimulus, the UB theorists appeal to frequency, Zipfian distribution, power laws, and other flimsy bits and pieces. Their aim is to replace the view that language competence is knowledge of a language system which enables speakers to produce and understand an infinite number of sentences in their language and to distinguish grammatical sentences from ungrammatical ones, and that language learning goes on in a mind equipped with a special language learning module which helps interpret the stream of input from the environment. That view led to theories of SLA which see the L2 learning process as crucially a psycholinguistic process involving the development of an interlanguage, where L2 learners gradually approximate to the way native speakers use the target language.

We return to Nick Ellis’s view. Language is a collection of utterances whose regularities are explained by Construction Grammar, and language learning is based on associative learning – the frequency-biased abstraction of regularities. I’ve already expressed the view that Construction Grammar seems to me little more than a difficult-to-grasp taxonomy, an a posteriori attempt to classify bits of attested language use collected from corpora; while the explanation of how we learn this grammar relies on associative learning processes which do nothing to adequately explain SLA or what children know about their L1. Here’s a bit more, based on the work of Kevin Gregg, whose view of theory construction in SLA is more eloquently stated and more carefully argued than that of any scholar I’ve ever read.

N. Ellis claims that language emerges from relatively simple developmental processes being exposed to a massive and complex environment. Gregg (2003) uses the example of the concept SUBJECT to challenge Ellis’ claim.

 The concept enters into various causal relations that determine various outcomes in various languages: the form of relative clauses in English, the assignment of reference in Japanese anaphoric sentences, agreement markers on verbs, the existence of expletives in some languages, the form of the verb in others, the possibility of certain null arguments in still others and so on.

Ellis claims that the concept SUBJECT emerges; it’s the effect of environmental influences that act by forming associations in the speaker’s mind such that the speaker comes to have the relevant concepts as specified by the linguist’s description. But how can the environment provide the necessary information, in all languages, for all learners to acquire the concept?  What sort of environmental information could be instructive in the right ways, and how does this information act associatively?

Gregg comments:

Frankly, I do not think the emergentist has the ghost of a chance of showing this, but what I think hardly matters. The point is that so far as I can tell, no emergentist has tried. Please note that connectionist simulations, even if they were successful in generalizing beyond their training sets, are beside the point here. It is not enough to show that a connectionist model could learn such and such: In order to underwrite an emergentist claim about language learning, it has to be shown that the model uses information analogous to information that is to be found in the actual environment of a human learner. Emergentists have been slow, to say the least, to meet this challenge.

Amen to that.

So, the choice is yours. If you choose to accept Dellar’s account of language and language learning, then you base your teaching on the worst “principles” of language and language learning in print. If you choose to follow Nick Ellis’ account, then you’ll probably have to pass on trying to figure out Construction Grammar, or explaining not just what children know about their L1 with zero input from the environment, but also how associative learning explains adult L2 learning trajectories as reported in hundreds of studies over the last 50 years. If you choose to accept one or another cognitive, psycholinguistic theory of SLA which sees L2 learning as a process of developing interlanguages, then you are left with the problem of providing what Gregg refers to as the property theory of SLA: In what does the capacity to use an L2 consist? What are the properties of the language which is learned in this way? Chomsky’s explanation of language and language learning might well be wrong, but it’s still the best description of language competence on offer (language, quite simply, is not exclusively a tool for social interaction), and it’s still the best explanation of what children know about language and how they come to know it.


Ellis, N. (2019) Essentials of a Theory of Language Cognition. The Modern Language Journal, 103 (Supplement 2019).

Seuren, P. (1996) Semantic Syntax. Oxford: Blackwell.

Wulff, S. and Ellis, N. (2018) Usage-based approaches to second language acquisition. Downloadable here: https://www.researchgate.net/publication/322779469_Usage-based_approaches_to_second_language_acquisition

A Review of “Teaching Lexically” (2016) by H. Dellar and A. Walkley

(Note: I’ve moved this post from its old place in my “stuff” to here because the old blog is getting increasingly difficult to access and to edit.)

Teaching Lexically is divided into three sections.

Part A. 

We begin with The 6 principles of how people learn languages:

“Essentially, to learn any given item of language, people need to carry out the following stages:

  • Understand the meaning of the item.
  • Hear/see an example of the item in context.
  • Approximate the sounds of the item.
  • Pay attention to the item and notice its features.
  • Do something with the item – use it in some way.
  • Repeat these steps over time, when encountering the item again in other contexts” 

These “principles” are repeated in slightly amplified form at the end of Part A, and they inform the “sets of principles” for each of the chapters in Part B.

Next, we are told about Principles of why people learn languages

These “principles” are taken en bloc from the Common European Framework of Reference for languages.  The authors argue that teachers should recognise that

“for what is probably the majority of learners, class time is basically all they may have spare for language study. [This] … emphasises how vital it is that what happens in class meets the main linguistic wants and needs of learners, chiefly:

  • To be able to do things with their language.
  • To be able to chat to others.
  • To learn to understand others cultures better”.   

We then move to language itself.

Two Views of Language

1. Grammar + words + skills

This is the “wrong” view, which, according to Dellar and Walkley, most people in ELT hold. It says that

language can be reduced to a list of grammar structures that you can drop single words into.

The implications of this view are:

  1. Grammar is the most important area of language. …The examples used to illustrate grammar are relatively unimportant. …It doesn’t matter if an example used to illustrate a rule could not easily (or ever) be used in daily life.
  2. If words are to fit in the slots provided by grammar, it follows that learning lists of single words is all that is required, and that any word can effectively be used if it fits a particular slot.
  3. Naturalness, or the probable usage of vocabulary, is regarded as an irrelevance; students just need to grasp core meanings.
  4. Synonyms are seen as being more or less interchangeable, with only subtle shades of meaning distinguishing them.
  5. Grammar is acquired in a particular order – the so-called “building blocks” approach where students are introduced to “basic structures”, before moving to “more advanced ones”.
  6. Where there is a deficit in fluency or writing or reading, this may be connected to a lack of appropriate skills. These skills are seen as existing independently of language.

2. From words with words to grammar

This is the “right” view, and is based on the principle that “communication almost always depends more on vocabulary than on grammar”. The authors illustrate this view by taking the sentence

I’ve been wanting to see that film for ages.

They argue that “Saying want see film is more likely to achieve the intended communicative message than only using what can be regarded as the grammar and function words I’ve been –ing to that for.”

The authors go on to say that in daily life the language we use is far more restricted than the infinite variety of word combinations allowed by rules of grammar. In fact, we habitually use the same chunks of language, rather than constructing novel phrases from an underlying knowledge of “grammar + single words”.  This leads the authors to argue the case for a lexical approach to teaching  and to state their agreement with Lewis’ (1993) view that

 teaching should be centred around collocation and chunks, alongside large amount of input from texts.  

They go on:

From this input a grasp of grammar ‘rules’ and correct usage would emerge. 

The authors cite Hoey’s Lexical Priming (2005) as giving theoretical support for this view of language. They explain Hoey’s view by describing the example Hoey gives of the two words “result” and “consequence”. While these two words are apparently synonymous, they function in quite different ways, as can be seen in statistics from corpora which show when and how they are used. Dellar and Walkley continue:

Hoey argues that these statistical differences must come about because, when we first encounter these words (he calls such encounters ‘primings’) our brains somehow subconsciously record some or all of this kind of information about the way the words are used. Our next encounter may reaffirm – or possibly contradict – this initial priming, as will the next encounter, and the one after that – and so on. 

The authors go on to explain how Hoey uses “evidence from psycholinguistic studies” to support the claim that we remember words not as individual units, but rather, in pairings and in groups, which allows for quicker and more accurate processing. Thus,

 spoken fluency, the speed at which we read and the ease and accuracy with which we listen may all develop as a result of language users being familiar with groupings of words.

A lexical view of teaching

Dellar & Walkley urge teachers to

think of whole phrases, sentences or even ‘texts’ that students might want to say when attempting a particular task or conversation. … At least some of those lexical items are learnable, and some of that learning could be done with the assistance of materials before students try to have particular kinds of communication.

It seems that the biggest problem of teaching lexically is that it’s difficult for teachers to come up, in real time, with the right kind of lexical input and the right kind of questions to help students notice lexical chunks, collocations, etc. The practicalities of teaching lexically are discussed under the heading “Pragmatism in a grammar-dominated world”, where teachers are advised to work with the coursebooks they’ve got and approach coursebook materials in a different way, focusing on the vocabulary and finding better ways of exploiting it.

The rest of Part A is devoted to a lexical view of vocabulary (units of meaning, collocation, co-text, genre and register, lexical sets, antonyms, word form, pragmatic meanings and synonyms are discussed), a lexical view of grammar (including “words define grammar” and “grammar is all around”), and a lexical view of skills.

Part A ends with “A practical pedagogy for teaching and learning”, which stresses the need to consider “Naturalness, priming and non-native speakers”, and ends with “The Process”, which repeats the 6 stages introduced at the start, noting that noticing and repetition are the two stages that the lexical teacher should place the most emphasis on.

Part B offers 100 worksheets for teachers to work through. Each page shares the same format: Principle; Practising the Principle; Applying the Principle. In many of the worksheets, it’s hard to find the “principle”, and in most worksheets “applying the principle” involves looking for chances to teach vocabulary, particularly lexical chunks. Here’s an example:

 Worksheet 2: Choosing words to teach.

Principle: prioritise the teaching of more frequent words.

Practising the Principle involves deciding which words in a box (government / apple, for example) are more frequent and looking at the online Macmillan Dictionary or the British National Corpus to check.

Applying the Principle involves choosing 10 words from “a word list of a unit or a vocabulary exercise that you are going to teach”, putting the words in order of frequency, checking your ideas, challenging an interested colleague with queries about frequency and “keeping a record of who wins!”
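The kind of frequency check the worksheet asks for can be mimicked in a few lines of Python. This is only a minimal sketch: the sample text, the candidate words and the function name are invented for illustration, and a real check would rely on counts from a large corpus or a dictionary’s frequency bands rather than a toy passage.

```python
from collections import Counter
import re

# Hypothetical sample text standing in for a real corpus (invented for illustration).
SAMPLE_TEXT = (
    "The government said the new tax would help. "
    "Critics of the government disagreed, and the government replied. "
    "She ate an apple while reading the news."
)

def rank_by_frequency(candidates, text):
    """Return the candidate words sorted from most to least frequent in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    # A Counter returns 0 for words that never occur, so unseen words sort last.
    return sorted(candidates, key=lambda word: counts[word], reverse=True)

ranking = rank_by_frequency(["apple", "government", "tax"], SAMPLE_TEXT)
print(ranking)
```

Even in this toy case the result depends entirely on the text sampled, which is one reason why frequency alone is a shaky guide to what a given class needs.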

The worksheets cover teaching vocabulary lexically, teaching grammar lexically, teaching the 4 skills lexically, and recycling and revising. Many of them involve looking at the coursebook which readers are presumed to be using in their teaching, and finding ways to adapt the content to a more lexical approach to teaching. In the words of the authors,

the book is less about recipes and activities for lessons, and more about training for preparing lexical lessons with whatever materials you are using.       


Part C (10 pages long) looks at materials, teaching courses other than general English, and teacher training.


Language Learning

Let’s start with Dellar and Walkley’s account of language learning. More than 50 years of research into second language learning is “neatly summarised” by listing the 6 steps putatively involved in learning “any given item of language”. You (1) understand the meaning, (2) hear/see an example in context, (3) approximate the sound, (4) pay attention to the item and notice its features, (5) do something with it – use it in some way, and (6) then repeat these steps over time. We’re not told what an “item” of language refers to, but we may be sure that there are tens, if not hundreds of thousands of such items, and we are asked to believe that they’re all learned, one by one, following the same 6-step process.

Bachman (1990) provides an alternative account, according to which  people learn languages by developing a complex set of competencies, as outlined in the figure below.

There remains the question of how these competencies are developed. We can compare Dellar and Walkley’s 6-step account with that offered by theories of interlanguage development (see Tarone, 2001, for a review). Language learning is, in this view, gradual, incremental and slow, sometimes taking years to accomplish. Development of the L2 involves all sorts of learning going on at the same time as learners use a variety of strategies to develop the different types of competencies shown in Bachman’s model, confronting problems of comprehension, pronunciation, grammar, lexis, idioms, fluency, appropriacy, and so on along the way. The concurrent development of the many competencies Bachman refers to exhibits plateaus, occasional movement away from, not toward, the L2, and U-shaped or zigzag trajectories rather than smooth, linear contours.  This applies not only to learning grammar, but also to lexis, and to that in-between area of malleable lexical chunks as described by Pawley and Syder.

As for lexis, explanations of SLA based on interlanguage development assert that learners have to master not just the canonical meaning of words, but also their idiosyncratic nature and their collocates. When learners encounter a word in a correct context, the word is not simply added to a static cognitive pile of vocabulary items. Instead, they experiment with the word, sometimes using it incorrectly, thus establishing where it works and where it doesn’t. By passing through a period of incorrectness, in which the lexicon is used in a variety of ways, they climb back up the U-shaped curve. Carlucci and Case (2011) give the example of the noun ‘shop.’ Learners may first encounter the word in a sentence such as “I bought this wine at the duty free shop”. Then, they experiment with deviant utterances such as “I am going to the supermarket shop,” correctly associating the word ‘shop’ with a place they can purchase goods, but getting it wrong. By making these incorrect utterances, the learner distinguishes between what is appropriate and what is not, because “at each stage of the learning process, the learner outputs a corresponding hypothesis based on the evidence available so far” (Carlucci and Case, 2011).

Dellar and Walkley’s “Six Step” account of language learning is neither well explained nor complete. These are not, I suggest, very robust principles on which to build. The principles of why people learn are similarly flimsy. To say that people learn languages “to be able to do things with their language; to be able to chat to others; and to learn to understand others cultures better” is to say very little indeed.

Two Views of Language

Dellar & Walkley give one of the most preposterous misrepresentations of how most teachers see English grammar that I’ve ever seen in print. Recall that they describe this popular view of language as “grammar + words”, such that language can be reduced to a list of grammar structures that you can drop single words into.

In fact, grammar models of the English language, such as that found in Quirk et al. (1985), or Swan (2001), and used in coursebooks such as Headway or English File, describe the structure of English in terms of grammar, the lexicon and phonology. These descriptions have almost nothing in common with the description given on page 9 of Teaching Lexically, which is subsequently referred to dozens of times throughout the book as if it were an accurate summary, rather than a biased straw man used to promote their own view of language. The one-sentence description, and the 6 simplistic assumptions that are said to flow from it, completely fail to fairly represent grammar models of the English language.

The second view of language, the right one according to the authors, is “language = from words + words to grammar”. Given that this is the most important, the most distinguishing, feature of the whole approach to teaching lexically, you’d expect a detailed description and a careful critical evaluation of their preferred view of language. But no; what is offered is a poorly articulated, inadequate summary, mixed up with one-sided arguments for teaching lexically. It’s based on Hoey’s (2005) view that the best model of language structure is the word, along with its collocational and colligational properties, so that collocation and “nesting” (words join with other primed words to form sequences) are linked to contexts and co-texts, and grammar is replaced by a network of chunks of words. There are no rules of grammar; there’s no English outside a description of the patterns we observe among those who use it. There is no right or wrong in language. It makes little sense to talk of something being ungrammatical.

This is surely a step too far; surely we need to describe language not just in terms of the performed but also in terms of the possible. Hoey argues that we should look only at attested behaviour and abandon descriptions of syntax, but, while nobody these days denies the importance of lexical chunks, very few would want to ignore the rules which guide the construction of novel, well formed sentences. After all, pace Hoey, people speaking English (including learners of English as an L2) invent millions of novel utterances every day.  They do so by making use of, among other things, grammatical knowledge.

The fact that the book devotes some attention to teaching grammar indicates that the authors recognise the existence and importance of grammar, which in turn indicates that there are limits to their adherence to Hoey’s model. But nothing is said in the book to clarify these limits. Given that Dellar and Walkley repeatedly stress that their different view of language is what drives their approach to teaching, their failure to offer any coherent account of their own view of language is telling. We’re left with the impression that the authors are enthusiastic purveyors of a view which they don’t fully understand and are unable to adequately describe or explain.

Teaching Lexically

1. Teaching Lexically concentrates very largely on “doing things to learners” (Breen, 1987): it’s probably the most teacher-centred book on ELT I’ve ever read. There’s no mention in the book of including students in decisions affecting what and how things are to be learned: teachers make all the decisions. They work with a pre-confected product or synthetic syllabus, usually defined by a coursebook, and they plan and execute lessons on the basis of adapting the syllabus or coursebook to a lexical approach. Students are expected to learn what is taught in the order that it’s taught, the teacher deciding the “items”, the sequence of presentation of these “items”, the recycling, the revision, and the assessment.

2.  There’s a narrowly focused, almost obsessive concentration on teaching as many lexical chunks as possible. The need to teach as much vocabulary as possible pervades the book. The chapters in Part B on teaching speaking, reading, listening and writing are driven by the same over-arching aim: look for new ways to teach more lexis, or to re-introduce lexis that has already been presented.

3. Education is seen as primarily concerned with the transmission of information. This view runs counter to the principles of learner-centred teaching, as argued by educators such as John Dewey, Sébastien Faure, Paulo Freire, Ivan Illich, and Paul Goodman, and supported in the ELT field by all progressive educators who reject the view of education as the transmission of information, and, instead, see the student as a learner whose needs and opinions have to be continuously taken into account. For just one opinion, see Weimer (2002), who argues for the need to bring about changes in the balance of power; changes in the function of course content; changes in the role of the teacher; changes in who is responsible for learning; and changes in the purpose and process of evaluation.

4. The book takes an extreme interventionist position on ELT. Teaching Lexically involves dividing the language into items, presenting them to learners via various types of carefully-selected texts, and practising them intensively, using pattern drills, exercises and all the other means outlined in the book, including comprehension checks, error corrections and so on, before moving on to the next set of items. As such, it mostly replicates the grammar-based PPP method it so stridently criticises. Furthermore, it sees translation into the L1 as the best way of dealing with meaning, because it wants to get quickly on to the most important part of the process, namely memorising bits of lexis with their collocates and even co-text. Compare this to an approach that sees the negotiation of meaning as a key aspect of language teaching, where the lesson is conducted almost entirely in English and the L1 is used sparingly, where students have chosen for themselves some of the topics that they deal with, where they contribute some of their own texts, and where most classroom time is given over to activities where the language is used communicatively and spontaneously, and where the teacher reacts to linguistic problems as they arise, thus respecting the learners’ ‘internal syllabus’.

Teaching Lexically sees explicit learning and explicit teaching as paramount, and it assumes that explicit knowledge, otherwise called declarative knowledge, can be converted into implicit (or procedural) knowledge through practice. These assumptions, like the assumptions that students will learn what they’re taught in the order they’re taught it, clash with SLA research findings. As Long says: “implicit and explicit learning, memory and knowledge are separate processes and systems, their end products stored in different areas of the brain” (Long, 2015, p. 44).  To assume, as Dellar and Walkley do, that the best way to teach English as an L2 is to devote the majority of classroom time to the explicit teaching and practice of pre-selected bits of the language is to fly in the face of SLA research.

Children learn languages in an implicit way – they are not consciously aware of most of what they learn about language. As for adults, all the research in SLA indicates that implicit learning is still the default learning mechanism. This suggests that teachers should devote most of the time in class to giving students comprehensible input and opportunities to communicate among themselves and with the teacher.

Nevertheless, adult L2 learners are what Long calls partially “disabled” language learners, for whom some classes of linguistic features are “fragile”. The implication is that, unless helped by some explicit instruction, they are unlikely to notice these fragile (non-salient) features, and thus will not progress beyond a certain, limited, stage of proficiency. The question is: What kind of explicit teaching helps learners progress in their trajectory towards communicative competence? And here we arrive at lexical chunks.

Teaching Lexical Chunks

One of the most difficult parts of English for non-native speakers to learn is collocation. As Long (2015, pages 307 to 316) points out in his section on lexical chunks, while children learn collocations implicitly, “collocation errors persist, even among near-native L2 speakers resident in the target language environment for decades.” Long cites Boers’ work, which suggests a number of reasons why L2 collocations constitute such a major learning problem, including L1 interference, the semantic vagueness of many collocations, the fact that collocates for some words vary, and the fact that some collocations look deceptively similar.

The size and scope of the collocations problem can be appreciated by considering findings on the lesser task of word learning. Long cites work by Nation (2006) and Nation and Chung (2009), who have calculated that learners require knowledge of between 6000 and 7000 word families for adequate comprehension of speech and 9000 for reading. Intentional vocabulary learning has been shown to be more effective than incidental learning in the short term, but, the authors conclude, “there is nowhere near enough time to handle so many items in class that way”. The conclusion is that massive amounts of extensive reading outside class, but scaffolded by teachers, is the best solution.

As for lexical chunks, there are very large numbers of such items, probably hundreds of thousands of them. As Swan (2006) points out, “memorising 10 lexical chunks a day, a learner would take nearly 30 years to achieve a good command of 10,000 of them”. So how does one select which chunks to explicitly teach, and how does one teach them? The most sensible course of action would seem to be to base selection on frequency, but there are problems with such a simple criterion, not the least being the needs of the set of students in the classroom. Although Dellar and Walkley acknowledge the criterion of frequency, Teaching Lexically gives very little discussion of it, and there is very little clear or helpful advice offered about what lexical chunks to select for explicit teaching (see the worksheet cited at the start of this review). The general line seems to be: work with the material you have, and look for the lexical chunks that occur in the texts, or that are related to the words in the texts. This is clearly not a satisfactory criterion for selection.
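For readers who want to check the scale of the problem for themselves, here is a rough back-of-the-envelope sketch. It assumes a steady 10 chunks a day, every day of the year, with nothing forgotten; these assumptions and the inventory sizes tried are mine, for illustration only, and the function name is hypothetical.

```python
DAYS_PER_YEAR = 365


def years_to_learn(num_chunks: int, chunks_per_day: int = 10) -> float:
    """Years needed to memorise num_chunks at a steady daily rate,
    assuming study every day and no forgetting (illustrative only)."""
    return num_chunks / chunks_per_day / DAYS_PER_YEAR


# Illustrative inventory sizes, not figures taken from any study.
for total in (10_000, 100_000, 300_000):
    print(f"{total:>7} chunks: {years_to_learn(total):.1f} years")
```

On these assumptions, a figure of “nearly 30 years” corresponds to an inventory of roughly 100,000 chunks, while 10,000 would take under three years. Either way, with hundreds of thousands of chunks in play, learning them one by one at this rate is out of reach of any course.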

The other important question that Teaching Lexically does not give any well-considered answer to is: how best to facilitate the learning of lexical chunks? Dellar and Walkley could start by addressing the problem of how their endorsement of Hoey’s theory of language learning, and Hoey’s “100% endorsement” of Krashen’s Natural Approach, fit with their own view that explicit instruction in lexical chunks should be the most important part of classroom-based instruction. The claim that they are just speeding up the natural, unconscious process doesn’t bear examination because two completely different systems of learning are being conflated. Dellar and Walkley take what’s called a “strong interface” position, whereas Krashen and Hoey take the opposite view. Dellar and Walkley make conscious noticing the main plank in their teaching approach, which contradicts Hoey’s claim that lexical priming is a subconscious process.

Next, Dellar and Walkley make no mention of the fact that learning lexical chunks is one of the most challenging aspects of learning English as an L2 for adult learners. Neither do they discuss the questions related to the teachability of lexical chunks that have been raised by scholars like Boers (who confesses that he doesn’t know the answer to the problems he has identified about how to teach lexical chunks). The authors of Teaching Lexically blithely assume that drawing attention to features of language (by underlining them, mentioning them and so on), and making students aware of collocations, co-text, colligations, antonyms, etc. (by giving students repeated exposure to carefully-chosen written and spoken texts, using drills, concept questions, input flood, bottom-up comprehension questions, and so on) will allow the explicit knowledge taught to become fully proceduralised. Quite apart from the question of how many chunks a teacher is expected to treat so exhaustively, there are good reasons to question the assumption that such instruction will have the desired result.

In a section of his book on TBLT, Long (2015) discusses his 5th methodological principle: “Encourage inductive (“chunk”) learning”. Note that Long discusses 10 methodological principles, and sees teaching lexical chunks as an important but minor part of the teacher’s job. The most important conclusion that Long comes to is that there is, as yet, no satisfactory answer to “the $64,000 dollar question: how best to facilitate chunk learning”. Long’s discussion of explicit approaches to teaching collocations includes the following points:

  • Trying to teach thousands of chunks is out of the question.
  • Drawing learners’ attention to formulaic strings does not necessarily lead to memory traces usable in subsequent receptive L2 use, and in any case there are far too many to deal with in that way.
  • Getting learners to look at corpora and identify chunks has failed to produce measurable advantages.
  • Activities to get learners to concentrate on collocations on their own have had poor results.
  • Grouping collocations thematically increases the learning load (decreasing transfer to long-term memory) and so does presentation of groups which share synonymous collocates, such as make and do.
  • Exposure to input floods where collocations are frequently repeated has poor results.
  • Commercially published ELT material designed to teach collocations has varying results. For example, when lists of verbs in one column are to be matched with nouns in another, this inevitably produces some erroneous groupings that, even when corrective feedback is available, can be expected to leave unhelpful memory traces.
  • It is clear that encouraging inductive chunk learning is well motivated, but it is equally unclear how best to realise it in practice, i.e., which pedagogical procedures to call upon.


Teaching Lexically is based on a poorly articulated view of the English language and on a flimsy account of second language learning. It claims that language is best seen as lexically driven, that a grasp of grammar ‘rules’ and correct usage will emerge from studying lexical chunks, that spoken fluency, the speed at which we read, and the ease and accuracy with which we listen will all develop as a result of language users being familiar with groupings of words, and that therefore, the teaching of lexical chunks should be the most important part of a classroom teacher’s job. These claims often rely on mere assertions, and include straw man fallacies, cherry picking the evidence of research findings and ignoring counter evidence. The case made for this view of teaching is, in my opinion, entirely unconvincing. The concentration on just one small part of what’s involved in language teaching, and the lack of any well-considered discussion of the problems associated with teaching lexical chunks, are serious flaws in the book’s treatment of an interesting topic.


Bachman, L. (1990). Fundamental considerations in language testing. Oxford University Press.

Breen, M. (1987) Contemporary Paradigms in Syllabus Design, Parts 1 and 2. Language Teaching 20 (02) and 20 (03).

Carlucci, L. and Case, J.  (2013)  On the Necessity of U-Shaped Learning. Topics.

Hoey, M. (2005) Lexical Priming. Routledge.

Long, M. (2015) Second Language Acquisition and Task-Based Language Teaching. Wiley.

Swan, M. (2006) Chunks in the classroom: let’s not go overboard. The Teacher Trainer, 20/3.

Tarone, E. (2001), Interlanguage. In R. Mesthrie (Ed.). Concise Encyclopedia of Sociolinguistics. (pp. 475–481) Oxford: Elsevier Science.

Weimer, M. (2002) Learner-Centered Teaching. Retrieved from http://academic.pg.cc.md.us/~wpeirce/MCCCTR/weimer.htm  3/09/2016

Anybody seen a pineapple?

Usage-based (UB) theories see language as a structured inventory of conventionalized form-meaning mappings, called constructions. Thus, the first thing one needs to get a handle on is Construction Grammar, which is summarised in Wulff & Ellis (2018). I’ve just been reading Smiskova-Gustafsson’s (2013) doctoral thesis, and her brief summary of Nick Ellis’ UB theory reminded me of why I find it so weird. According to N. Ellis, detecting patterns from frequency of forms in input is the way people learn language: when exposed to language input, learners notice frequencies and discover language patterns. Those advocating Construction Grammar insist that the regularities that learners observe in the input emerge from complex interactions of a multitude of variables over time, and that, therefore, the regularities in language we call grammar are not rule-based; rather, they emerge as patterns from the repeated use of symbolic form-meaning mappings by speakers of the language. “Therefore, grammar is not a set of creative rules but a set of regularities that emerge from usage” (Hopper, 1998, cited in Smiskova-Gustafsson, 2013). Emergent structures are nested; consequently, any utterance consists of a number of overlapping constructions (Ellis & Cadierno, 2009, cited in Smiskova-Gustafsson, 2013). Linguistic categories are also emergent: they emerge bottom-up, and thus not all linguistic structures fall easily into prescribed categories. In other words, some linguistic structures are prototypical, while others fit their category less well.

Examining some of the abstract constructions developed by UB scholars, Smiskova-Gustafsson (2013) notes that frequency of forms interacts with psycholinguistic factors, most importantly, prototypicality of meaning. She gives the example of the verb-argument construction “V Obj Oblpath/loc”, or VOL, an abstract construction that enables syntactic creativity by abstracting common patterns from lexically specific exemplars such as put it on the table:

The exemplar itself is a highly frequent instantiation of the VOL construction, and the verb put that it contains is prototypical in meaning. This means that put is the verb most characteristic of the VOL construction and so the one most frequently used. Other verbs in VOL are used less; the type/token distribution of all verbs used in the VOL construction is Zipfian (i.e., the verb put is the one most frequently used, about twice as frequently as the next verb). Such prototypes are crucial in establishing the initial form/meaning mapping – in this case, the phrase put it on the table, meaning caused motion (X causes Y to move to a location). Repeated exposure to other instantiations of the VOL construction will gradually lead to generalization and the establishment of the abstract productive construction (Smiskova-Gustafsson, 2013, p. 18).
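To make the “Zipfian” claim a little more concrete: in an idealised Zipf distribution, the frequency of the item at rank r is proportional to 1/r, so the top-ranked verb occurs about twice as often as the second, three times as often as the third, and so on. The sketch below is only an illustration of that idealisation. Apart from put, the verbs and their ranking are hypothetical, as is the token count; none of it is data from the thesis.

```python
def zipf_counts(items, total_tokens, exponent=1.0):
    """Share total_tokens among ranked items so that the item at
    rank r gets a share proportional to 1 / r**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, len(items) + 1)]
    norm = sum(weights)
    return {item: total_tokens * w / norm for item, w in zip(items, weights)}


# Hypothetical ranking of verbs in the VOL construction; only "put"
# as the top-ranked verb comes from the passage quoted above.
verbs = ["put", "take", "bring", "move", "send"]
counts = zipf_counts(verbs, total_tokens=1000)

for verb, n in counts.items():
    print(f"{verb:<6} {n:7.1f}")

# Ratio between the most frequent verb and the runner-up.
print(counts["put"] / counts["take"])
```

Real corpus counts only approximate this curve, of course, but the shape is the point: a single prototypical verb accounts for a large share of the tokens, followed by a long tail of rarer ones.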

Got it? If you find that taster a rather abstract and obtuse way to try to explain how input can of itself contain all the information learners need to learn English as an L2 (for example), then try reading Wulff & Ellis (2018), or the Approaches book in the graphic above. But what about chunks? From a UB perspective, chunks are “conventionalized form-meaning mappings, the result of repeated use of certain linguistic units, which then give rise to emergent patterns in language at all levels of organization and specificity” (Smiskova-Gustafsson, 2013, p. 21). Chunks go from word sequences that are semantically opaque (spill the beans) or structurally irregular (by and large) to everyday usage such as in my opinion, or nice to see you. And here’s the rub: as Smiskova-Gustafsson (2013, p. 21) points out, “if we take a usage-based perspective, where all units of language are conventionalized, identifying chunks would become pointless, since we could say that all language is in fact a chunk”. The natural, seamless flow of native-like language use can thus be seen as “formulaicity all the way down” (Wray, 2012, p. 245, cited in Smiskova-Gustafsson, 2013, p. 22): language consists of almost endless overlapping and nesting of chunks, as in this example:

In winter Hammerfest is a thirty-hour ride by bus from Oslo, though why anyone would want to go there in winter is a question worth considering.

thirty hour ride by bus from

[thirty hour [[ride][ by bus]] from]]

chunks: thirty hour ride, ride by bus, by bus, by bus from, etc.

though why anyone would want to go there

[though [why] anyone would] [want to] go] there]

chunks: though why, why anyone would, why anyone would want to, want to go, etc. (Smiskova-Gustafsson, 2013, p. 11).

Since learners of English as an L2 tend to use English in terms of grammar and individual words, and often combine words in awkward ways, their lack of the ability to produce “natural, seamless flows of native-like language use” must be because they don’t have the necessary procedural knowledge of the Construction Grammar which underpins the  “conventionalized English ways” of expressing any particular concept.

The question is, of course: is this a good way to see language and language learning? If it is, then how do teachers of English as an L2 help their students develop proficiency? How do they teach students English, if it amounts to no more – and no less! – than procedural knowledge of Construction Grammar, the prerequisite for the proficient use of tens of thousands of overlapping and nested chunks? To be thorough, if teachers accepted the UB approach, then instead of following the confused and contradictory advice offered by Dellar & Walkley or by Selivan, they would first have to understand Construction Grammar, then understand UB theories of language learning, and then articulate methodological principles and pedagogical practices for implementing a principled lexical approach. Were teachers to attempt this, I suggest that they’d find Construction Grammar more difficult and less useful than the grammar described in Swan’s Practical English Usage; Ellis’ UB theory more difficult and less useful than the theories described in Mitchell & Myles (2019) Second Language Learning Theories; and accounts of methodological principles and pedagogical practices found in Teaching Lexically or Lexical Grammar less convincing than the account of them found in Chapter 3 of Long’s (2015) SLA & TBLT.

UB theories are increasingly fashionable, but I’m still not impressed with Construction Grammar, or with the claim that language learning can be explained by appeal to noticing regularities in the input. As to the latter, I recommend Gregg’s (2003) article; seventeen years later, I’ve still not read a convincing reply to it. Anyway, I think it’s fair to say that there’s no consensus among SLA scholars on the question of whether language learning is done on the basis of input exposure and experience or with the help of learners’ innate knowledge, and it’s still not clear whether grammatical learning is usage-based or based on universal grammar.

Meanwhile, it seems sensible for teachers to continue to regard English as a language with grammar rules that can help students make well-formed (often novel) utterances, and to help their students by giving them maximum opportunities to use English for their own relevant, communicative purposes, while encouraging inductive learning of chunks. Likewise, it seems foolish to accept the counsel of teacher trainers who misrepresent the complexities of a UB approach and who recommend that teachers focus on the impossible task of explicitly teaching lexical chunks.


Dellar, H. and Walkley, A. (2016) Teaching Lexically. Delta.

Gregg, K. R. (2003) The State of Emergentism in SLA. Second Language Research, 19(2), 95–128.

Selivan, L. (2018) Lexical Grammar. CUP.

Smiskova-Gustafsson, H. (2013). Chunks in L2 development: a usage-based perspective. Groningen: s.n.

Wulff, S. and Ellis, N. (2018) Usage-based approaches to second language acquisition. Downloadable here: https://www.researchgate.net/publication/322779469_Usage-based_approaches_to_second_language_acquisition


Alternative Proposal for IATEFL Global Get-Together 2020

IATEFL’s proposal for a global get-together is a disappointing, lacklustre programme that perfectly reflects its status as the stuck-in-the-mud, unimaginative voice of current, commercially-driven ELT practice. The perfect example of this lamentable state of affairs is that Catherine Walter, one of the most reactionary voices in ELT over the last four decades and President of IATEFL in 1993, is asked to address the most crucial issue currently facing us, namely, how to adapt classroom teaching to distance learning. The blurb for her presentation Losing Our Bells and Whistles: Will asynchronous teacher education return? suggests that she’ll do nothing more than warn teachers of the perils of cutting-edge innovation. “Keep it simple!”, she’ll say. “Don’t try any clever synchronous stuff – it always goes wrong!”. That’s it: that’s IATEFL for you.

As for the rest of the programme, what can we find that might possibly drag us away from Netflix? The President’s address? Tell me a President’s address that you remember anything about! Will poor David Crystal, dragged out yet again, this time to promote the new edition of his Big Book, do more than entertain? I doubt it. How about somebody selling a commercial Business English test? Definitely not. And advice on how to be mindful, or eulogies to young learners as global citizens? Useless pap is my guess. The only things that might be interesting are the local reports, but they’re not properly situated or focused.

Here’s my suggestion.

Re-examining Principles of ELT

All sessions last 2 hours. They’re round table discussions with a Moderator. Each speaker has 10 minutes. Questions are sent in to the organisers.

Session 1: How do people learn an L2? : S. Gass, N. Ellis, M. Pienemann, S. Carroll, K. Gregg

The main debate these days is between emergentists (we learn from input from the environment) and nativists (we learn with help from innate hard wiring). Where are we now? What principles can we agree on which will underpin our work as teachers?

Session 2: Teaching implications of SLA Findings: L. Ortega, A. Benati, M. Long, H. Marsden, H. ZhaoHong, H. Nassaji

Recent research findings have challenged previously accepted meta-analyses. Where are we now? Most importantly: can we agree on the relative importance of explicit and implicit teaching?

Session 3: Syllabus design: R. Ellis, M. Swan, M. Long, S. Thornbury, C. Doughty

The big debate today is between synthetic syllabuses, as implemented in General English Coursebooks, and analytic syllabuses, like Long’s TBLT and Thornbury & Meddings’ Dogme. This is probably the second most important question of them all. We need to clarify all the “false” alternatives and agree on principles for syllabus design and materials production.

Session 4: Distance Learning: G. Motteram, G. Dudeney, C. Chapelle.

Tech experts present their platforms and respond to questions sent in by participants prior to the conference.

Session 5: ELT as a profession: TEFL Workers Union, N. McMillan, S. Millin, S. Brown, R. Bard.

The most important question of them all. How do we improve our lot? How do we organise?  Ideally, we should produce a Manifesto.

Session 6: Hope For the Future: T. Hampson, M. Griffin, J. Mackay, K. Linkova, L. Havaran

This is my own, very personal choice of teachers, new and old, whose voices need to be heard.

A 2-day programme, properly organised, would allow the invited speakers to briefly state their cases and for discussion to ensue. I think the success of the event would depend on careful monitoring and follow-up. The organisers would have to edit the material and then get back to contributors to help compile really solid take-away stuff. Ideally, we’d have Summary Statements on each of the 6 issues and the beginnings of a network.

I’m confident that I could organise such an event if

  1. Neil McMillan did it with me (I haven’t even mentioned this to him yet!)
  2. We had a small group of helpers, and
  3. we had some cash.

I invite comments.