Review of JPB Gerald (2022) “Antisocial Language Teaching”


The main problem with Gerald’s book is its lack of clarity: part autobiography, part political pamphlet, part college essay, part post-graduate academic assignment, its discussion of whiteness and language teaching lurches from one genre to another, and from one topic to another, rarely retaining its style or focus long enough to offer any clear descriptions or analyses of the motley matters it so unevenly and erratically tries to cover. There’s no clear history of colonialism or the slave trade offered here, no clear description or discussion of racism, or capitalism, or alienation, or ethics, or sociology, or linguistics, or education, or anything else that might clarify what, precisely, whiteness is, or what specifically is wrong with current language teaching. Despite his claim to see things from a fresh, new “angle”, just about everything in the text that deals with whiteness has already been dealt with more eruditely, insightfully, and, above all, more clearly, by previous writers. As for language teaching, Gerald’s discussion is flimsy, badly-informed and confused.

In what follows, I quote from my e-book version of the book, so I apologise for not being able to give page numbers for the quotes.

The Writing

Part of the book’s lack of clarity is due to the poor writing, and a particular feature of this is that many sentences and paragraphs of the text dissolve into incoherence, as if they’ve suddenly fallen off a cliff.

The Introduction begins with a discussion of what Tucker Carlson, a Fox News presenter, says about “antisocial thugs with no stake in society”. Following a summary of Carlson’s views, Gerald comments:

He [Carlson] and his writers are not espousing some fringe viewpoint but instead emphasizing a core tenet of his popular ideology, namely the fact that decentralized resistance and opposition to the hegemony of whiteness is anathema to what he refers to as ‘society’, and the common elision of Blackness and criminality as expressed via his use of the word ‘thug’ (Smiley & Fakunle, 2016). As odious as his ideas are to many who might be reading this work, Carlson is not speaking out of turn when compared to the epistemology and the ideology of the whiteness that retains a firm grip on the globe.

I’m guessing when I suggest that the first part can be paraphrased: Carlson believes that all resistance to white supremacy is an attempt to destroy American society, but I have no idea what the final sentence means. At best, the passage is badly-expressed and involves a confusing non-sequitur; at worst, it’s gibberish.

The Introduction eventually arrives at this clumsy attempt to explain what it’s about:   

Simply put, this book exists to make the case for why it is a moral imperative that ELT severs its ties to whiteness once and for all, and for the bright future that could follow if we ever manage to demolish this structure inside of which we are all trapped.

Simply put, the book argues that ELT must sever all ties with whiteness. The rest of the sentence doesn’t follow and renders the whole thing incoherent.  

One final example. Gerald says:

In short, the concept of ‘society’, against which antisocial and other ‘disordered’ behavior is measured, is merely a mask for whiteness, and considering that the epistemology responsible for these diagnostic criteria is itself an exemplar of whiteness, it is difficult to trust whiteness as an objective judge of what is and is not antisocial.

Again, I’m not sure what the assertion that the white supremacist concept of society is a mask for whiteness against which antisocial and disordered behaviour are measured amounts to, but I haven’t the slightest idea what epistemology he’s talking about. The word ‘epistemology’ occurs sixteen times in the text, and seems to be used to mean something like “system of beliefs” or “ideology”. In the two quotes above, Gerald talks about “the epistemology and the ideology” of whiteness, and “the epistemology” that is “an exemplar of whiteness”. Elsewhere he talks about the “epistemological analysis” of axes of oppression; a “fuzzy epistemology” reliant on “race scholarship”; and his own trips down different “epistemological corridors”, to take just three more examples. Epistemology is, of course, the branch of philosophy that deals with theories of knowledge, and nowadays the big debate is between realists, who assume that there’s a world out there independent of our experiences of it, which can be more or less accurately observed and described, and relativists who deny the realists’ claim. I suggest that Gerald’s use of the word ‘epistemology’ has little to do with this normal use of the word, and, furthermore, that it’s just one example of his failure to clearly define a profusion of key terms and constructs, or to use them consistently. Gerald is a bit like Carroll’s Humpty Dumpty: when he uses a word, it means just what he chooses it to mean.

The Content

Let’s turn to the content. After the prologue (which gives good warning that the book’s about JPB Gerald, really), the Introduction gives its first sketch of “society” as seen by white supremacists, and ends with a section on “Key Concepts”. These include racism:


… the combination of racial discrimination and societal oppression. Anyone can experience the former, but only certain people can experience the combination of the two. For example, as a Black person, I could tell you I don’t want to have any white friends, and that would absolutely be discriminatory, but because I do not have the full power of society behind me, and because that would not materially impact the people I denied my friendship, it does not qualify.

and whiteness:

there is no functional difference between whiteness and white supremacy. Indeed, whiteness, as a concept, was created to justify colonialism and chattel slavery (Bonfiglio, 2002; Painter, 2011); there had to be a group exempt from these horrors, and as such, whiteness was codified. Whiteness was created to be supreme, as a protection from the oppression that others deserve because of the groups into which they have been placed.

Gerald proceeds to give an overview of the book. Part One deals with Disorder.  

In short, whiteness requires people to be categorized as either ordered or disordered so that it can function effectively and to support its aims of colonialist dominance and capitalism. Accordingly, whiteness uses language ideologies and language teaching to classify Blackness, dis/ability and unstandardized English as representations of pathology and disorder, and is thus able to justify its exploitation and oppression of members of these groups.

In Part 2, Gerald “demonstrates” how “the field of ELT and its adherence to whiteness” has led to “pervasive oppression”. To do so, he maps its “harmful habits” onto the official criteria for antisocial personality disorder, “not to stigmatize the disorder but to counterpathologize whiteness and the destruction it causes”.

Finally, Part 3 discusses how language teachers can “play a central role in the demolition of whiteness in our field and in our society”.

Part One

Part One begins with an attempt to define whiteness. In essence, Gerald sees it as “The Great Pyramid Scheme”.

When I thought of the best way to describe whiteness and the way it had been sold to me, despite rarely being named as such, I consulted the numerous metaphors that have been used in the literature, many of which remain accurate and resonant, many of which I will cite below. But, in my opinion, when searching for the best way to evoke the sheer confidence game at play, one that empowers a few while convincing the masses that their own power is waiting just around the corner so long as they convince everyone they know to also buy in, I could think only of the sad stories I’ve encountered of friends and acquaintances who were convinced to buy thousands of dollars of terrible products that they could never offload to others.

If I understand him, Gerald sees whiteness as the ultimate Ponzi scheme. He goes on:

Simply put, whiteness is perhaps the world’s greatest example of multilevel marketing, a massive pyramid scheme, but unlike the companies stealing from put-upon individuals and families, there is no single chief executive officer (CEO) laughing all the way to the bank. At this point, whiteness feeds upon all of us, including the people who bow before it, and it creates no victors, only a desperate battle to avoid losing.

For the next few pages, Gerald relies on Painter’s (2011) work, starting with his description of pre-industrial societies. Gerald comments:

There are the beginnings of a constructed hierarchy visible in this description, of course, but oppression based on group membership did not originate with the construction of whiteness – it simply had a different manifestation. Much later on, even after slavery was common in Europe, ‘Geography, not race, ruled, and potential white slaves, like vulnerable aliens everywhere, were nearby for the taking’ (Painter, 2011: 38). People with power have always exploited those without it, and it would be inaccurate to blame whiteness for what is clearly an upsettingly human tendency.

Is this clear? Not to me it isn’t, but before it gets clarified, Gerald is off on a history of the slave trade, colonialism, eugenics and accounts of assorted atrocities. For example:

In what became the United States, Europeans brought disease alongside their ships, but smallpox didn’t succeed in eliminating everyone, so they were forced to remove them directly in order to control their land (Wolfe, 2006).

Rather than be given any clear, concise definition, we’re left to slowly glean for ourselves what whiteness means, until right at the end of Part 1, when Gerald presents a summary under several headings. Whiteness is “a Pyramid Scheme”; it “Justifies Settler Colonialism and Racial Capitalism”; it “Created Blackness out of its Own Darkest Impulses”; it “Dis/abled Blackness to Ensure its Subjugation”; it “Uses Perceived Deficits in Ability, Intelligence and Language to Retain Power”; and it “Devalues Unstandardized English because it Devalues the Racialized”.

Part 2

Gerald explains on his publisher’s webpage dedicated to the book that in Part 2,  

As a rhetorical device, I use the diagnostic criteria of the Diagnostic and Statistical Manual of Mental Disorders to make the point that the way our field was built and is currently maintained could be classified as deeply disordered and only isn’t because of who currently benefits from the system as is; more specifically, I map the seven criteria of antisocial personality disorder onto the connection between whiteness, colonialism, capitalism, and ableism and how these and other -isms harm the vast majority of the students – and educators – in the field of language teaching.

There are thus seven chapters in Part 2. They deal with

  1. Failure to conform to social norms concerning lawful behaviour
  2. Deceitfulness, repeated lying, use of aliases, or conning others for pleasure or personal profit
  3. Impulsivity or failure to plan
  4. Irritability and aggressiveness, often with physical fights or assaults
  5. Reckless disregard for the safety of self or others
  6. Consistent irresponsibility, failure to sustain consistent work behavior, or honor monetary obligations
  7. Lack of remorse, being indifferent to or rationalizing having hurt, mistreated or stolen from another person

As indicated above, Gerald tries to map these seven criteria onto whiteness, etc., so as to highlight the harm done to nearly everybody in the field of language teaching. Personally, I don’t think this is a very successful rhetorical device; the only reason I can see for using these seven criteria to organize a discussion of how whiteness affects ELT is that, since Gerald himself has been diagnosed with a mental disorder, it gives him a more authoritative “voice”. Unfortunately, it doesn’t give the text clarity. Among the pages, there are interesting ideas trying to get out. Gerald has justified concerns about native speakerism, persistent language deficit views in education, ongoing linguistic imperialism, hopelessly unfit-for-purpose assessment procedures, and on and on. He also indicates where he stands on debates about multilingualism, additive bilingualism, translanguaging, and other issues. But his concerns and his viewpoint are expressed in a text which too often lacks both coherence and cohesion. Ideas trip over themselves, there’s too much verbose hyperbole and too little attention to developing an argument through the use of clearly-defined constructs and well-chosen cohesive devices.

I’m aware that my criticisms of Gerald’s writing might be interpreted as those of a white man defending “standardized linguistic practices”, which, far from being an objective set of linguistic forms appropriate for an academic setting, are actually demonstrations of raciolinguistic ideologies which expect language-minoritized students to model their linguistic practices after the white speaking subject. I should make it clear that I don’t defend the petty rules of English for academic purposes which are foisted on students; my only concern about the writing in this text is its readability, and that depends on its coherence and cohesion. Many – probably most – of those who read Gerald’s book will disagree with my criticisms. In my defense, I’d challenge them to read any of the pages I’ve referred to in this review and then give a brief precis of what they’ve just read.

Part 3

Part 3 has two chapters: the first discusses a teacher education course, and the second makes seven nebulous recommendations on how to improve ELT. Dealing with the second chapter first, it suggests that we call ourselves teachers of standardized English (TSE) instead of English language teachers, that teacher education includes a “deep engagement” with all the issues raised, that we use better materials (improved by a similar deep engagement with the book’s message), and a few more anodyne bits of fluff.

Chapter 1, the Ezel Project, describes a teacher training course, and it is, in my opinion, by far the best chapter in the book. (I should say at once that I don’t like the style, but that doesn’t matter, because the text is quite readable.) Gerald gives a detailed account of the design and implementation of his course, which aims to raise teacher awareness of the damage caused by continued racism and white supremacy, and follows it with interesting accounts of how some of the participants reacted.  


It seems clear to me that all English language teachers should accept the following tenets:

  • English acts as a lingua franca and as a powerful tool to protect and promote the interests of a capitalist class.
  • In the global ELT industry, teaching is informed by the monolingual fallacy, the native speaker fallacy and the subtractive fallacy (Phillipson, 2018).   
  • The ways in which English is privileged in education systems needs critical scrutiny, and policies that strengthen linguistic diversity are needed to counteract linguistic imperialism.
  • Racism permeates ELT. It results in expecting language-minoritized students to model their linguistic practices on inappropriate white speaker norms.
  • ELT practice must acknowledge bilinguals’ fluent languaging practices and legitimise hybrid language uses.
  • ELT must encourage practices which explore the full range of users’ repertoires in creative and transformative ways.
  • Subtractive approaches to language education and deficit language policies must be resisted.

From what I’ve read, I’d say that there are many articles – Ian Cushing’s work, for example, always has a good references section – that deal with white supremacy and raciolinguistic issues better than Gerald’s book does.

As for Gerald’s assorted assertions about the global ELT industry, they demonstrate a poor grasp of the literature on language learning, syllabus design, pedagogic procedures, assessment, and language policy. The one reference Gerald makes to SLA research is telling. He says:

the related field of second language acquisition expends considerable effort on boiling its namesake process down to formulas that hardly take the individuals and their identities involved into account, rendering any supposed struggles a more personal failing than they truly are.

Of course (the name’s a clue), psycholinguistic studies of the SLA process include studies of the individual psychology of learners – their motivation, their perceptions of their L2 selves, and their anxieties. But to suggest that SLA research makes a special effort to boil the process of L2 learning down to formulas is to do no more than glibly misrepresent the views of certain sociolinguistic relativists who, while happy to disparage “positivist paradigms”, are unlikely to go along with Gerald’s remark.

Radical action is needed to combat racism and white supremacy in ELT, but that action needs to be part of an attack on ELT which includes a clear, comprehensive, practical alternative, based on robust findings of research into how people learn additional languages. Gerald aligns himself with teacher educators like Vaz Bauler and academics like García who adopt a relativist epistemology (sic) and embrace an “anything goes” approach to teaching, where error correction and assessment are seen as “harmful”, and where the construct of “language” itself is illegitimate. For these sociolinguistic vanguardistas, bilingual students’ language practices must not be separated into home language and school language (there is no such thing as “distinct language systems”), the construct of transfer must be abandoned, and in its place we must put “a conceptualization of integration of language practices in the person of the learner” (García & Wei, 2014, p. 80). I question the value of these arguments when they’re propounded by their authors, and I certainly don’t trust Gerald’s garbled version of them.


García, O. & Wei, L. (2014). Translanguaging: Language, Bilingualism, and Education. Palgrave Macmillan.

Gerald, J.P.B. (2022). Antisocial Language Teaching: English and the Pervasive Pathology of Whiteness. Multilingual Matters.

Phillipson, R. (2018). Linguistic Imperialism. Routledge.

Empiricist Emergentism


Emergentism is an umbrella term referring to a fast-growing range of usage-based theories of SLA which adopt “connectionist” and associative learning views, based on the premise that language emerges from communicative use. Many proponents of emergentism, not least the imaginative Larsen-Freeman, like to begin by pointing to the omnipresence of complex systems which emerge from the interaction of simple entities, forces and events. Examples are:

The chemical combination of two substances produces, as is well known, a third substance with properties different from those of either of the two substances separately, or both of them taken together. Not a trace of the properties of hydrogen or oxygen is observable in those of their compound, water. (Mill 1842, cited in O’Grady, 2021).

Bee hives, with their carefully arranged rows of perfect hexagons, far from providing evidence of geometrical ability in bees actually provides evidence for emergence – The hexagonal shape maximizes the packing of the hive space and the volume of each cell and offers the most economical use of the wax resource… The bee doesn’t need to “know” anything about hexagons. (Elman, Bates, Johnson, Karmiloff-Smith, Parisi & Plunkett, 1996, cited in O’Grady, 2021).

Larsen-Freeman’s own favorite is a murmuration of starlings. In her plenary at the IATEFL 2016 conference, the eminent scholar seemed almost to float away herself, up into the rafters of the great hall, as she explained:

Instead of thinking about reifying and classifying and reducing, let’s turn to the concept of emergence – a central theme in complexity theory. Emergence is the idea that in a complex system different components interact and give rise to another pattern at another level of complexity.

A flock of birds part when approached by a predator and then they re-group. A new level of complexity arises, emerges, out of the interaction of the parts.

All birds take off and land together. They stay together as a kind of superorganism. They take off, they separate, they land, as if one.

You see how that pattern emerges from the interaction of the parts?

Personally, I fail to grasp the force of this putative supporting evidence for emergentism, which strikes me as unconvincing, not to say ridiculous. The associated claim, that complex systems exhibit ‘higher-level’ properties which are neither explainable nor predictable from ‘lower-level’ physical properties but which nevertheless have causal and hence explanatory efficacy, is slightly less ridiculous, but still unconvincing, and surely hard to square with empiricist principles. So, moving quickly on, let’s look at emergentist theories of language learning. Note that the discussion is mostly of Nick Ellis’ theory of emergentism, which he applies to SLA.

What Any Theory of SLA Must Explain

Kevin Gregg (1993, 1996, 2000, 2003) insists that any theory of SLA should do two things: (1) describe what knowledge is acquired (a property theory describing what language consists of and how it’s organised), and (2) explain how that knowledge is acquired (a causal transition theory). Chomsky’s principles and parameters theory offers a very technical description of “Universal Grammar”, consisting of clear descriptions of grammar principles which make up the basic grammar of all natural languages, and the parameters which apply to particular languages. It describes what Chomsky calls “linguistic competence” and it has served as a fruitful property theory guiding research for more than 50 years. How is this knowledge acquired? Chomsky’s answer is contained in a transition theory that appeals to an innate representational system located in a module of the mind devoted to language, and by innate mechanisms which use that system to parse input from the environment, set parameters, and learn how the particular language works.

But UG has come under increasing criticism. Critics suggest that UG principles are too abstract, that Chomsky has more than once moved the goal posts, that the “Language Acquisition Device” is a biologically implausible “black box”, that the domain is too narrow, and that we now have better ways to explain the phenomena that UG theory tackles. Increasingly, emergentist theories are regarded as providing better explanations.

Emergentist theories

There is quite a collection of emergentist theories, but we can distinguish between emergentists who rely on associative learning, and those who believe that “achieving the explanatory goals of linguistics will require reference to more than just transitional probabilities” (O’Grady, 2008, p. 456). In this first post, I’ll concentrate on the first group, and refer mostly to the work of its leading figure, Nick Ellis. The reliance on associative learning leads to this group often being referred to as “empiricist emergentists”.

Empiricist emergentists insist that language learning can be satisfactorily explained by appeal to the rich input in the environment and simple learning processes based on frequency, without having to resort to abstract representations and an unobservable “Language Acquisition Device” in the mind.

Regarding the question of what knowledge is acquired, the emergentist case is summarised by Ellis & Wulff (2020, pp. 64-65).

The basic units of language representation are constructions. Constructions are pairings of form and meaning or function. Words like squirrel are constructions: a form — that is, a particular sequence of letters or sounds — is conventionally associated with a meaning (in the case of squirrel, something like “agile, bushy-tailed, tree-dwelling rodent that feeds on nuts and seeds”).

In Construction Grammar, constructions are wide-ranging. Morphemes, idiomatic expressions, and even abstract syntactic frames are constructions:

sentences like Nick gave the squirrel a nut, Steffi gave Nick a hug, or Bill baked Jessica a cake all have a particular form (Subject-Verb-Object-Object) that, regardless of the specific words that realize its form, share at least one stable aspect of meaning: something is being transferred (nuts, hugs, and cakes).

Furthermore, some constructions have no meaning; they serve more functional purposes:

passive constructions, for example, serve to shift what is in attentional focus by defocusing the agent
of the action (compare an active sentence such as Bill baked Jessica a cake with its passive counterpart A cake was baked for Jessica).


constructions can be simultaneously represented and stored in multiple forms and at various levels of abstraction (table + s = tables; [Noun] + (morpheme -s) = “plural things”). Ultimately, constructions blur the traditional distinction between lexicon and grammar. A sentence is not viewed as the application of grammatical rules to put a number of words obtained from the lexicon in the right order; a sentence is instead seen as a combination of constructions, some of which are simple and concrete while others are quite complex and abstract. For example, What did Nick give the squirrel? comprises the following constructions:

• Nick, squirrel, give, what, do constructions
• VP, NP constructions
• Subject-Verb-Object-Object construction
• Subject-Auxiliary inversion construction

We can therefore see the language knowledge of an adult as a huge warehouse of constructions.
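The “warehouse of constructions” can be pictured as an inventory of form-meaning pairings stored at different levels of abstraction. Here is a toy sketch of that idea; the data structure, entries and function are my own illustration, not a proposal from the construction-grammar literature:

```python
# Toy inventory: each construction pairs a form (concrete or schematic)
# with a meaning or function, at different levels of abstraction.
constructions = [
    {"form": "squirrel",                     # a fully concrete word
     "meaning": "agile, bushy-tailed, tree-dwelling rodent"},
    {"form": "[Noun]-s",                     # a partially schematic morphological pattern
     "meaning": "plural things"},
    {"form": "[Subj] [Verb] [Obj1] [Obj2]",  # a fully schematic syntactic frame
     "meaning": "someone transfers something to someone"},
]

def lookup(form):
    """All stored meanings conventionally paired with an exactly matching form."""
    return [c["meaning"] for c in constructions if c["form"] == form]
```

On this picture, producing or understanding a sentence is a matter of combining entries from the inventory rather than applying grammar rules to a separate lexicon.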

As to language learning, it is not about learning abstract generalizations, but rather about inducing general associations from a huge collection of memories: specific, remembered linguistic experiences.

The learner’s brain engages simple learning mechanisms in distributional analyses of the exemplars of a given form-meaning pair that take various characteristics of the exemplar into consideration, including how frequent it is, what kind of words and phrases and larger contexts it occurs with, and so on (Ellis & Wulff, 2020, p. 66).

The “simple learning mechanisms” amount to associative learning. The constructions are learned through “the associative learning of cue-outcome contingencies” determined by factors relating to the form, the interpretation, the contingency of form and function, and learner attention. Language learning involves “the gradual strengthening of associations between co-occurring elements of the language”, and fluent language performance involves “the exploitation of this probabilistic knowledge” (Ellis, 2002, p. 173). Based on sufficiently frequent cues pairing two elements in the environment, the learner abstracts to a general association between the two elements.
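The notion of “cue-outcome contingency” is commonly quantified in the associative learning literature as ΔP: the probability of the outcome when the cue is present, minus its probability when the cue is absent. A minimal sketch, where the event counts and function name are my own illustration rather than data from Ellis:

```python
def delta_p(events):
    """ΔP = P(outcome | cue present) - P(outcome | cue absent),
    computed from a list of (cue_present, outcome_occurred) pairs."""
    with_cue = [o for c, o in events if c]
    without_cue = [o for c, o in events if not c]
    return sum(with_cue) / len(with_cue) - sum(without_cue) / len(without_cue)

# Illustrative tallies: the cue co-occurs with the outcome 8 times in 10,
# while the outcome appears only 1 time in 10 when the cue is absent.
events = ([(True, 1)] * 8 + [(True, 0)] * 2 +
          [(False, 1)] * 1 + [(False, 0)] * 9)
# delta_p(events) is 0.8 - 0.1 = 0.7, a strong cue-outcome contingency
```

A ΔP near 1 means the cue reliably signals the outcome; near 0, the cue is uninformative and no association should be strengthened.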

Here’s how it works:

When a learner notices a word in the input for the first time, a memory is formed that binds its features into a unitary representation, such as the phonological sequence /wʌn/or the orthographic sequence one. Alongside this representation, a so-called detector unit is added to the learner’s perceptual system. The job of the detector unit is to signal the word’s presence whenever its features are present in the input. Every detector unit has a set resting level of activation and some threshold level which, when exceeded, will cause the detector to fire. When the component features are present in the environment, they send activation to the detector that adds to its resting level, increasing it; if this increase is sufficient to bring the level above threshold, the detector fires. With each firing of the detector, the new resting level is slightly higher than the previous one—the detector is primed. This means it will need less activation from the environment in order to reach threshold and fire the next time. Priming events sum to lifespan-practice effects: features that occur frequently acquire chronically high resting levels. Their resting level of activation is heightened by the memory of repeated prior activations. Thus, our pattern-recognition units for higher-frequency words require less evidence from the sensory data before they reach the threshold necessary for firing. The same is true for the strength of the mappings from form to interpretation. Each time /wʌn/ is properly interpreted as one, the strength of this connection is incremented. Each time /wʌn/ signals won, this is tallied too, as are the less frequent occasions when it forewarns of wonderland. Thus, the strengths of form-meaning associations are summed over experience. 
The resultant network of associations, a semantic network comprising the structured inventory of a speaker’s knowledge of language, is tuned such that the spread of activation upon hearing the formal cue /wʌn/ reflects prior probabilities of its different interpretations (Ellis & Wulff, 2020, p. 67).

The authors add that other additional factors need to be taken into account, and this one is particularly important:

… the relationship between frequency of usage and activation threshold is not linear but follows a curvilinear “power law of practice” whereby the effects of practice are greatest at early stages of learning, but eventually reach asymptote.
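The power law of practice is usually written RT = a · N^(−b): performance (e.g. reaction time) improves as a power function of the number of practice trials N, so gains are large early on and vanishingly small later. A quick numerical illustration, with arbitrary constants of my own choosing:

```python
def reaction_time(trials, a=1000.0, b=0.5):
    """Power law of practice: RT = a * N**(-b) for N practice trials."""
    return a * trials ** -b

# The curve is steep early and nearly flat later (approaching asymptote):
early_gain = reaction_time(1) - reaction_time(2)      # large improvement
late_gain = reaction_time(100) - reaction_time(101)   # barely any improvement
```

With these constants, the first repetition shaves off roughly 293 ms, while the hundredth shaves off about half a millisecond: the same curvilinear shape the quote describes.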

Evidence supporting this type of emergentist theory is said to be provided by IT models of associative learning processes in the form of connectionist networks. One example is Lewis & Elman’s (2001) demonstration that a Simple Recurrent Network (SRN) can, among other things, simulate the acquisition of agreement in English from data similar to the input available to children; another is the connectionist model reported in Ellis and Schmidt’s 1997 and 1998 papers.
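For readers unfamiliar with the architecture: an Elman-style SRN feeds a copy of the previous hidden state (the “context” layer) back in alongside the current input, which is what lets it pick up sequential dependencies such as agreement. The following is a minimal sketch of the forward pass only, untrained and with arbitrary dimensions; it is not Lewis & Elman’s actual model:

```python
import numpy as np

class SimpleRecurrentNetwork:
    """Minimal Elman-style SRN: a copy of the previous hidden state (the
    'context' layer) is combined with the current input at every step."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        self.context = np.zeros(n_hidden)

    def step(self, x):
        """Consume one input vector; return a distribution over next tokens."""
        self.context = np.tanh(self.W_in @ x + self.W_ctx @ self.context)
        scores = self.W_out @ self.context
        exp = np.exp(scores - scores.max())   # softmax over output units
        return exp / exp.sum()

# Feed a toy word sequence (one-hot coded over a 5-word vocabulary):
srn = SimpleRecurrentNetwork(n_in=5, n_hidden=8, n_out=5)
one_hot = np.eye(5)
probs = [srn.step(one_hot[w]) for w in [0, 2, 1]]
# each prediction depends on the whole sequence so far, via the context layer
```

In the real models, the output distribution is compared to the actually occurring next word and the weights are adjusted by backpropagation; this sketch only shows where the sequential memory comes from.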


There have been various criticisms of the empiricist version of emergentism as championed by Ellis, and IMHO, the articles by Eubank & Gregg (2002), and Gregg (2003) remain the most acute. I’ll use them as the basis for what follows.

a) Linguistic knowledge

Regarding their description of the linguistic knowledge acquired, Gregg (2003) points out that emergentists have yet to agree on any detailed description of linguistic knowledge, or even on whether such knowledge exists. The doubt about whether or not there’s any such thing as linguistic knowledge is raised by extreme empiricists, such as the logical positivists and behaviourists discussed in my last post, and also the eliminativists involved in connectionist networks, who all insist that the only knowledge we have comes through the senses; representational knowledge of the sort required to explain linguistic competence is outlawed. Ellis and his colleagues don’t share the views of these extremists; they accept that linguistic representations – of some sort or other – are the basis of our language capacity, but they reject any innate representations, and therefore they need not just to describe the organisation of the representations, but also to explain how the representations are learned from input from the environment.

O’Grady (2011) agrees with Gregg about the lack of consensus among emergentists as to what form linguistic knowledge takes; some talk of local associations and memorized chunks (Ellis 2002), others of a construction grammar (Goldberg 1999, Tomasello 2003), and others of computational routines (O’Grady 2001, 2005). Added to a lack of consensus is a lack of clarity and completeness. O’Grady’s discussion of Lewis & Elman’s (2001) Simple Recurrent Network (SRN), mentioned above, explains how it was able to mimic some aspects of language acquisition in children, including the identification of category-like classes of words, the formation of patterns not observed in the input, retreat from overgeneralizations, and the mastery of subject-verb agreement. However, O’Grady goes on to say that it raises the question of why the particular statistical regularities exploited by the SRN are in the input in the first place.

In other words, why does language have the particular properties that it does? Why, for example, are there languages (such as English) in which verbs agree only with subjects, but no language in which verbs agree only with direct objects?

Networks provide no answer to this sort of question. In fact, if presented with data in which verbs agree with direct objects rather than subjects, an SRN would no doubt “learn” just this sort of pattern, even though it is not found in any known human language.

There is clearly something missing here. Humans don’t just learn language; they shape it. Moreover, these two facts are surely related in some fundamental way, which is why hypotheses about how linguistic systems are acquired need to be embedded within a more comprehensive theory of why those systems (and therefore the input) have the particular properties that they do. There is, simply put, a need for an emergentist theory of grammar. (O’Grady, 2011, p. 4).  

In conclusion, then, some leading emergentists themselves agree that emergentism has not, so far, offered any satisfactory description of the knowledge of the linguistic system that is required of a property theory. An unfinished construction grammar that is brought to bear on “a huge collection of memories, specific, remembered linguistic experiences”, seems to be as far as they’ve got.  

b) Associative learning

Whatever the limitations of the emergentists’ sketchy account of linguistic knowledge might be, their explanation of the process of language learning (which is, after all, their main focus) seems to have more to recommend it, not least its simplicity. In the case of empiricist emergentists, the explanation relies on associative learning: learners make use of simple cognitive mechanisms to implicitly recognise frequently-occurring associations among elements of language found in the input. To repeat what was said above, the theory states that constructions are learned through the associative learning of cue-outcome contingencies. Associations between co-occurring elements of language found in the input are gradually strengthened by successive encounters, and, on the basis of sufficiently frequent pairings of these elements, the learner abstracts to a general association between them. To this simplest of explanations, a few other elements are attached, not least the “power law of practice”. In his 2002 paper on frequency effects in language processing, Ellis cites Kirsner’s (1994) claim that the strong effects of word frequency on the speed and accuracy of lexical recognition are explained by the power law of learning,

which is generally used to describe the relationships between practice and performance in the acquisition of a wide range of cognitive skills. That is, the effects of practice are greatest at early stages of learning, but they eventually reach asymptote. We may not be counting the words as we listen or speak, but each time we process one there is a reduction in processing time that marks this practice increment, and thus the perceptual and motor systems become tuned by the experience of a particular language (Ellis, 2002, p. 152).
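The power law referred to here is easy to state and compute. In a common formulation, performance (say, reaction time in milliseconds) improves with the number of practice trials N as RT(N) = a + b·N^(-c). The constants in the sketch below are invented for illustration, not estimates from Kirsner’s or Ellis’s data; the point is just the shape of the curve.

```python
# Power law of practice: reaction time falls as a power function of
# the number of practice trials N:  RT(N) = a + b * N**(-c).
# The constants are illustrative, not estimates from any study.
a, b, c = 300.0, 700.0, 0.5   # asymptote (ms), initial gain, learning rate

def rt(n: int) -> float:
    return a + b * n ** (-c)

# Early trials yield large savings; later ones barely move the curve,
# which is what "reaching asymptote" means in the quotation above.
gain_early = rt(1) - rt(10)       # saving over the first 9 trials
gain_late = rt(1000) - rt(1009)   # saving over 9 trials much later
print(round(gain_early, 1), round(gain_late, 3))
```

The early gain here is several thousand times the late one: practice effects are greatest at the start of learning and become vanishingly small with experience.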

Eubank & Gregg (2002, p. 239) suggest that there are many areas of language learning that the emergentist account can’t explain. For example:

Ellis aptly points to infants’ ability to do statistical analyses of syllable frequency (Saffran et al., 1996); but of course those infants haven’t learned that ability. What needs to be shown is how infants uniformly manage this task: why they focus on syllable frequency (instead of some other information available in exposure), and how they know what a syllable is in the first place, given crosslinguistic variation. Much the same is true for other areas of linguistic import, e.g. the demonstration by Marcus et al. (1999) that infants can infer rules. And of course work by Crain, Gordon, and others (Crain, 1991; Gordon, 1985) shows early grammatical knowledge, in cases where input frequency could not possibly be appealed to. All of which is to say, for starters, that such claims as that “learners need to have processed sufficient exemplars” (p. 40) are either outright false, or else true only vacuously (if “sufficient” is taken to range from as low a figure as 1).

Eubank & Gregg (2002, p. 240) also question emergentist use of key constructs. For example:

The Competition Model, for instance, relies heavily on the frequency (and reliability) of so-called “cues”. The problem is that it is nowhere explained just what a cue is, or what could be a cue; which is to say that the concept is totally vacuous (Gibson, 1992). In the absence of any principled characterization of the class of possible cues, an explanation of acquisition that appeals to cue-frequency is doomed to arbitrariness and circularity. (The same goes, of course, for such claims as Ellis’s [p. 54] that “the real stuff of language acquisition is the slow acquisition of form-function mappings,” in the absence of any criterion for what counts as a possible function and what counts as a possible form.)

In his (2003) article, Gregg has more to say about cues:   

The question then arises, What is a cue, that the environment could provide it? Ellis, for example, says, ‘in the input sentence “The boy loves the parrots,” the cues are: preverbal positioning (boy before loves), verb agreement morphology (loves agrees in number with boy rather than parrots), sentence initial positioning and the use of the article (the)’ (1998: 653). In what sense are these ‘cues’ cues, and in what sense does the environment provide them? What the environment can provide, after all, is only perceptual information, for example, the sounds of the utterance and the order in which they are made. (Emphasis added.) So in order for ‘boy before loves’ to be a cue that subject comes before verb, the learner must already have the concepts subject and verb. But if subject is one of the learner’s concepts, on the emergentist view, he or she must have learned that; the concept subject must ‘emerge from learners’ lifetime analysis of the distributional characteristics of the language input,’ as Ellis (2002a: 144) puts it (Gregg, 2003, p. 120).

c) Connectionist models

Gregg (2003) goes to some length to critique the connectionist model reported in Ellis and Schmidt’s 1997 and 1998 papers. The model was built to investigate “adult acquisition of second language morphology using an artificial second language in which frequency and regularity were factorially combined” (1997, p. 149). The experiment was designed to test “whether human morphological abilities can be understood in terms of associative processes” (1997, p. 145) and to show that “a basic principle of learning, the power law of practice, also generates frequency by regularity interactions” (1998, p. 309). The authors claimed that the network learned both the singular and plural forms for 20 nonce nouns, and also learned the ‘regular’ or ‘default’ plural prefix. In subsequent publications, Ellis claimed that the model gives strong support to the notion that acquisition of morphology is a result of simple associative learning principles and that the power law applies to the acquisition of morphosyntax. Gregg’s (2003) paper does a thorough job of refuting these claims.

Gregg begins by pointing out that connectionism itself is not a theory, but rather a method, “which in principle is neutral as to the kind of theory to which it is applied”. He goes on to point out the severe limitations of the Ellis and Schmidt experiment. In fact, the network didn’t learn the 20 nouns, or the 11 prefixes; it merely learned to associate the nouns with the prefixes (and with the pictures) – it started with the 11 prefixes, and was trained such that only one prefix was reinforced for any given word. Furthermore, the model was slyly given innate knowledge!   

Although Ellis accepts that linguistic representations – of some sort or other – are the basis of our language capacity, he rejects the nativist view that the representations are innate, and therefore he needs to explain how the representations are acquired. In the Ellis & Schmidt model, the human subjects were given pictures and sounds to associate, and the network was given analogous input units to associate with output units. But, while the human participants in the experiment were shown two pictures and were left to infer plurality (rather than, say, duality or repetition or some other inappropriate concept), the network was given the concept of plurality free as one of the input nodes (and was given no other concept). (Emphasis added.) Gregg comments that while nativists who adopt a UG view of linguistic knowledge can easily claim that the concept of plurality is innate, Ellis cannot do so, and thus he must explain how the concept of plurality has been acquired, not just make it part of the model’s structure. So, says Gregg, the model is “fudging; innate knowledge has sneaked in the back door, as it were”. Gregg continues:

Not only that, but it seems safe to predict that the human subjects, having learned to associate the picture of an umbrella with the word ‘broil’, would also be able to go on to identify an actual umbrella as a ‘broil’, or a sculpture or a hologram of an umbrella as representations of a ‘broil’. In fact, no subject would infer that ‘broil’ means ‘picture of an umbrella’. And nor would any subject infer that ‘broil’ meant the one specific umbrella represented by the picture. But there is no reason whatever to think that the network can make similar inferences (Gregg, 2003, p. 114).
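Gregg’s point about the plurality node can be made concrete with a toy delta-rule associative learner. This is a sketch in the spirit of the Ellis & Schmidt task, not their actual network: the nonce nouns, the prefix, and all the dimensions are invented. The key line is the one that hard-codes a “plural” input node.

```python
import numpy as np

# Three invented nonce nouns and two invented "prefixes" (the plural
# is marked by "garth-"); a pale sketch of the Ellis & Schmidt (1997)
# task, not their actual network or stimuli.
nouns = ["broil", "vome", "plock"]
prefixes = ["-", "garth-"]        # singular unmarked, plural prefixed
N = len(nouns)

# Input layer: one node per noun PLUS one node for plurality.
# This is exactly Gregg's complaint: the concept "plural" is not
# learned from experience, it is wired in as an input feature.
n_in, n_out = N + 1, len(prefixes)
W = np.zeros((n_out, n_in))

def encode(noun_i, plural):
    x = np.zeros(n_in)
    x[noun_i] = 1.0
    x[N] = 1.0 if plural else 0.0   # the hand-coded plurality node
    return x

# Delta-rule (Rescorla-Wagner style) associative learning of
# cue-outcome contingencies.
lr = 0.2
for _ in range(200):
    for i in range(N):
        for plural in (False, True):
            x = encode(i, plural)
            target = np.zeros(n_out)
            target[1 if plural else 0] = 1.0
            W += lr * np.outer(target - W @ x, x)

pred_sg = prefixes[int(np.argmax(W @ encode(0, False)))]
pred_pl = prefixes[int(np.argmax(W @ encode(0, True)))]
print(pred_sg, pred_pl)
```

The network succeeds only because plurality is handed to it as an input feature: remove that node and the singular and plural presentations of a noun become identical input vectors, which no amount of associative learning can tell apart. That is the innate knowledge “sneaked in the back door”.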

Emergentism and Instructed SLA

Ellis and others who are developing emergentist theories of SLA stress that, at least for monolingual adults, the process of SLA is significantly affected by the experience of learning one’s native language. Children learn their first language implicitly, through associative learning mechanisms acting on the input from the environment, and any subsequent learning of further languages is similar in this respect. However, monolingual adult L2 learners “suffer” from the successful early learning of their L1: that success leaves the implicit input-processing mechanisms set for the L1, and the knock-on effect is that entrenched L1 processing habits work against them, being applied to an L2 where they do not fit. Ellis argues that the filtering of L2 input through L1-established attractors leads adult learners to fail to acquire certain parts of the L2, which are referred to as its “fragile” features (a term coined by Goldin-Meadow, 1982, 2003). Fragile features are non-salient – they pass unnoticed – and they are identified as being one or more of: infrequent, irregular, non-syllabic, string-internal, semantically empty, and communicatively redundant.

Ellis (2017), supported by Long (2015), suggests that teachers should use explicit teaching to facilitate implicit learning, and that the principal aim of explicit teaching should be to help learners modify entrenched automatic L1 processing routines, so as to alter the way subsequent L2 input is processed implicitly. The teacher’s aim should be to help learners consciously pay attention to a new form, or form–meaning connection, and to hold it in short-term memory long enough for it to be processed and rehearsed, and for an initial representation to be stored in long-term memory. Nick Ellis (2017) calls this “re-setting the dial”: the new, better exemplar alters the way in which subsequent exemplars of the item in the input are handled by the default implicit learning process.

It’s interesting to see what Long (2015, p. 50) says in his major work on SLA and TBLT:

A plausible usage-based account of (L1 and L2) language acquisition (see, e.g., N.C. Ellis 2007a,b, 2008c, 2012; Goldberg & Casenhiser 2008; Robinson & Ellis 2008; Tomasello 2003), with implicit learning playing a major role, begins with initially chunk-learned constructions being acquired during receptive or productive communication, the greater processability of the more frequent ones suggesting a strong role for associative learning from usage. Based on their frequency in the constructions, exemplar-based regularities and prototypical morphological, syntactic, and other patterns – [Noun stem-PL], [Base verb form-Past], [Adj Noun], [Aux Adv Verb], and so on – are then induced and abstracted away from the original chunk-learned cases, forming the basis for attraction, i.e., recognition of the same rule-like patterns in new cases (feed-fed, lead-led, sink-sank-sunk, drink-drank-drunk, etc.), and for creative language use.

In sum, … while incidental and implicit learning remain the dominant, default processes, their reduced power in adults indicates an advantage, and possibly a necessity (still an open question), for facilitating intentional initial perception of new forms and form–meaning connections, with instruction (focus on form) important, among other reasons, for bringing new items to learners’ focal attention. Research may eventually show such “priming” of subsequent implicit processing of those forms in the input to be unnecessary. Even if that turns out to be the case, however, opportunities for intentional and explicit learning are likely to speed up acquisition and so become a legitimate component of a theory of ISLA, where efficiency, not necessity and sufficiency, is the criterion for inclusion.

It should be obvious from the earlier discussion above that I’m persuaded by the criticisms of Eubank, Gregg, O’Grady (and many others!) to reject empiricist emergentism as a theory of SLA, and I confess to having felt surprised when I first read the quotation above. Never mind. What I think is interesting is that a different explanation of SLA – one which allows for innate knowledge, a “bootstrapping” view of the process of acquisition, and interlanguage development – has some important things in common with emergentism, which can be incorporated into a theory of ISLA (Instructed Second Language Acquisition). Such a theory needs to look more carefully at the effects of different syllabuses, materials and teacher interventions on students’ learning in different environments, in order to assess their efficacy, but I’m sure it will begin with the commonly accepted view among SLA scholars that, regardless of context, implicit learning drives SLA, and that explicit instruction can best be seen as a way of speeding up this implicit learning.


At the root of the problem of any empiricist account is the poverty of the stimulus argument. Gregg (2003, p. 101) summarises Laurence and Margolis’ (2001: 221) “lucid formulation” of it:

1. An indefinite number of alternative sets of principles are consistent with the regularities found in the primary linguistic data.

2. The correct set of principles need not be (and typically is not) in any pre-theoretic sense simpler or more natural than the alternatives.

3. The data that would be needed for choosing among those sets of principles are in many cases not the sort of data that are available to an empiricist learner.

4. So if children were empiricist learners they could not reliably arrive at the correct grammar for their language.

5. Children do reliably arrive at the correct grammar for their language.

6. Therefore children are not empiricist learners. 
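The logical skeleton of Laurence and Margolis’ argument is a straightforward modus tollens. Writing $E$ for “children are empiricist learners” and $R$ for “children reliably arrive at the correct grammar for their language”, steps 1–3 support the conditional premise, step 5 supplies the second premise, and 6 follows:

```latex
\begin{align*}
& E \rightarrow \neg R & & \text{(from 1--3: empiricist learning cannot reliably yield the correct grammar)}\\
& R                    & & \text{(5: children do reliably arrive at the correct grammar)}\\
& \therefore\ \neg E   & & \text{(6: children are not empiricist learners)}
\end{align*}
```

Put this way, an emergentist who accepts premise 5 must attack premises 1–3, i.e. show that the primary linguistic data are in fact rich enough to select the correct grammar.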

By adopting an associative learning model and an empiricist epistemology (where some kind of innate architecture is allowed, but not innate knowledge, and certainly not innate linguistic representations), emergentists have a very difficult job explaining how children come to have the linguistic knowledge they do. How can general conceptual representations acting on stimuli from the environment explain the representational system of language that children demonstrate? I don’t think they can.

In the next post, I’ll discuss William O’Grady’s version of emergentism.  


Bates, E., Elman, J., Johnson, M., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1998). Innateness and emergentism. In W. Bechtel & G. Graham (Eds.), A companion to cognitive science (pp. 590-601). Basil Blackwell.

Ellis, N. (2002). Frequency effects in language processing: A review with implications for theories of implicit and explicit language acquisition. Studies in Second Language Acquisition, 24(2), 143-188.

Ellis, N. (2015). Implicit AND explicit language learning: Their dynamic interface and complexity. In P. Rebuschat (Ed.), Implicit and explicit learning of languages (pp. 3-23). John Benjamins.

Ellis, N., & Schmidt, R. (1997). Morphology and longer distance dependencies: Laboratory research illuminating the A in SLA. Studies in Second Language Acquisition, 19(2), 145-171.

Ellis, N., & Wulff, S. (2020). Usage-based approaches to L2 acquisition. In B. VanPatten, G. Keating, & S. Wulff (Eds.), Theories in second language acquisition: An introduction. Routledge.

Eubank, L., & Gregg, K. R. (2002). News flash – Hume still dead. Studies in Second Language Acquisition, 24(2), 237-248.

Gregg, K. R. (1993). Taking explanation seriously; or, let a couple of flowers bloom. Applied Linguistics, 14(3), 276-294.

Gregg, K. R. (1996). The logical and developmental problems of second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition. Academic Press.

Gregg, K. R. (2000). A theory for every occasion: Postmodernism and SLA. Second Language Research, 16(4), 34-59.

Gregg, K. R. (2001). Learnability and SLA theory. In P. Robinson (Ed.), Cognition and second language instruction. Cambridge University Press.

Gregg, K. R. (2003). The state of emergentism in second language acquisition. Second Language Research, 19(2), 95-128.

O’Grady, W. (2011). Emergentism. In P. Hogan (Ed.), The Cambridge encyclopedia of the language sciences. Cambridge University Press.

O’Grady, W., Lee, M., & Kwak, H. (2011). Emergentism and second language acquisition. In W. Ritchie & T. Bhatia (Eds.), Handbook of second language acquisition. Emerald Press.

Seidenberg, M., & MacDonald, M. (1997). A probabilistic constraints approach to language acquisition and processing. Cognitive Science, 23(4), 569-588.

What Is Empiricism?


Emergentist theories of language learning are now so prevalent that their effects are being seen in the ELT world, where leading teacher educators refer to various emergentist constructs (e.g., priming, constructions, associative learning) and increasingly adopt what they take to be an emergentist view of L2 learning. Within emergentism, there is an interesting difference of opinion between those (the majority, probably) who follow the “input-based” or “empiricist” emergentist approach as proposed by Nick Ellis, and those who support the “processor” approach of William O’Grady. In preparation for a revised post on emergentism, I here discuss empiricism.

Rationalism vs Empiricism

In Discourse on Method, Descartes (1969 [1637]) describes how he examined a piece of wax. It had a certain shape, colour, and dimension. It had no smell, it made a dull thud when struck against the wall, and it felt cold. When Descartes heated the wax, it started to melt, and everything his senses had told him about the wax turned to its opposite – the shape, colour and dimensions changed, it had a pungent odour, it made little sound and it felt hot. How then, asked Descartes, do I know that it is still a piece of wax? He adopted a totally sceptical approach, supposing that a demon was doing everything possible to delude him. Perhaps it wasn’t snowing outside, perhaps it wasn’t cold, perhaps his name wasn’t René, perhaps it wasn’t Thursday. Was there anything that could escape the Demon hypothesis? Was there anything that Descartes could be sure he knew? His famous conclusion was that the demon could not deny that he thought, that he asked the question “What can I know?” Essentially, then, it was his capacity to think, to reason, that was the only reliable source of knowledge, and hence Descartes’ famous “Cogito ergo sum”, I think, therefore I am. Descartes based his philosophical system on the innate ability of the thinking mind to reflect on and understand our world. We are, in Descartes’ opinion, unique in having the ability to reason, and it is this capacity to reason that allows us to understand the world.

But equally important to the scientific revolution of the early 17th century was the empirical method championed by Francis Bacon.  In The Advancement of Learning (Bacon, 1974 [1605]), Bacon claimed that the crucial issue in philosophy was epistemological, i.e., reliable knowledge, and proposed that empirical observation and experiments should be recognised as the way to obtain such knowledge. (Note that empirical observation means observations of things in the world that we experience through our senses, not to be confused with the epistemological view adopted by empiricists – see below.) Bacon’s proposal is obviously at odds with Descartes’s argument: it claims that induction, not deduction, should guide our thinking. Bacon recommends a bottom-up approach to scientific investigation: carefully conducted empirical observations should be the firm base on which science is built. Scientists should dispassionately observe, measure, and take note, in such a way that, step by careful step, checking continuously along the way that the measurements are accurate and that no unwarranted assumptions have crept in, they accumulate such an uncontroversial mass of evidence that they cannot fail to draw the right conclusions from it. Thus they finally arrive at an explanatory theory of the phenomena being investigated whose truth is guaranteed by the careful steps that led to it.

In fact, if one actually stuck to such a strictly empirical programme, it would be impossible to arrive at any general theory, since there is no logical way to derive generalisations from facts (see Hume, below). Equally, it is impossible to develop a rationalist epistemology from Descartes’ “Cogito ergo sum”, since the existence of an external world does not follow. In both cases, compromises were needed, and, in fact, more “practical” inductive and deductive processes were both used in the development of scientific theories, although we can note the differences between the more conservative discoverers and the more radical inventors and “big theory” builders, throughout the development of modern science in general, and in the much more recent and restricted development of SLA theory in particular. Larsen-Freeman and Long (1991), for example, talk about two research traditions in SLA: “research then theory”, and “theory then research”, and these obviously correspond to the inductive and deductive approaches respectively.    

In linguistics, the division between “empiricist” and “rationalist” camps is noteworthy for its incompatibility. The empiricists, who held sway, at least in the USA, until the 1950s, and whose most influential member was Bloomfield, saw their job as field work: equipped with tape recorders and notebooks, the researcher recorded thousands of hours of actual speech in a variety of situations and collected samples of written text. The data was then analysed in order to identify the linguistic patterns of a particular speech community. The emphasis was very much on description and classification, and on highlighting the differences between languages. We might call this the botanical approach, and its essentially descriptive, static, “naming of parts” methodology depended for its theoretical underpinnings on the language learning explanation provided by the behaviourists.


Behaviourism was first developed in the early twentieth century by the American psychologist John B. Watson, who attempted to make psychological research “scientific” by using only objective procedures, such as laboratory experiments which were designed to establish statistically significant results. Watson (see Toates and Slack, 1990: 252-253) formulated a stimulus-response theory of psychology according to which all complex forms of behaviour are explained in terms of simple muscular and glandular elements that can be observed and measured. No mental “reasoning”, no speculation about the workings of any “mind”, were allowed. Thousands of researchers adopted this methodology, and from 1920 until the 1950s an enormous amount of research on learning in animals and in humans was conducted under this strict empiricist regime. In 1950 behaviourism could justly claim to have achieved paradigm status, and at that moment B.F. Skinner became its new champion. Skinner’s contribution to behaviourism was to challenge the stimulus-response idea at the heart of Watson’s work and replace it with a type of psychological conditioning known as reinforcement (see Skinner, 1957, and Toates and Slack, 1990: 268-278). Note the same insistence on a strict empiricist epistemology (no “reasoning”, no “mind”, no appeal to mental processes), and the claim that language is learned in just the same way as any other complex skill is learned – by social interaction.

In sharp contrast to the behaviourists and their rejection of “mentalistic” formulations is the approach to linguistics championed by Chomsky. Chomsky (in 1959 and subsequently) argued that the most important thing about languages was the similarities they shared, what they have in common, not their differences. In order to study these similarities, Chomsky assumed the existence of unobservable mental structures and proposed a “nativist” theory to explain how humans acquire a certain type of knowledge. A top-down, rationalist, deductive approach is evident here.

The Empiricists

But let’s return to empiricism. In the late seventeenth and eighteenth centuries, a new movement in philosophy, known as empiricism, appeared, the most influential proponents being Locke, Mill, and Hume. In a much more radical, more epistemologically-formulated, statement of Bacon’s views, the British empiricists argued that everything the mind knows comes through the senses. As Hume put it: “The mind has never anything present to it but the perceptions.” (Hume, 1988 [1748]: 145). Starting from the premise that only “experience” (all that we perceive through our senses) can help us to judge the truth or falsity of factual sentences, Hume argued that reliable knowledge of things was obtained by observing the relevant quantitative, measurable data in a dispassionate way. This is familiar territory – Bacon again, we might say – but the argument continues in a way that has dire consequences for rationalism.

If, as Hume claims, knowledge rests entirely on observation, then there is no basis for our belief in natural laws: we believe in laws and regularities only because of repetition. For example, we believe the sun will rise tomorrow because it has repeatedly done so every 24 hours, but the belief is an unwarranted inductive inference. As Hume so brilliantly insisted, we can’t logically go from the particular to the general: it is an elementary, universally accepted tenet of formal logic that no amount of cumulative instances can justify a generalisation. No matter how many times the sun rises in the East, or thunder follows lightning, or swans appear white, we will never know that the sun rises in the East, or that thunder follows lightning or that all swans are white. This is the famous “logical problem of induction”. To be clear, the empiricists don’t claim that we have empirical knowledge – they limit themselves to the claim that knowledge can only be gained, if at all, by experience. And if the rationalists are right to claim that experience cannot give us knowledge, the conclusion must be that we do not know at all. Hume’s position with regard to causal explanation is the same: such explanations can’t count as reliable knowledge, they are only presupposed to be true in virtue of a particular habit of our minds.

The positivists tried to answer Hume’s devastating critique.


Positivism refers to a particularly radical form of empiricism. Comte invented the term, arguing that each branch of knowledge passes through “three different theoretical states: the theological or fictitious state; the metaphysical or abstract state; and, lastly, the scientific or positive state.” (Comte, 1830, cited in Ryan, 1970:36)  At the theological stage, the will of God explains phenomena, at the metaphysical stage phenomena are explained by appealing to abstract philosophical categories, and at the scientific stage, any attempt at absolute explanations of causes is abandoned.  Science limits itself to how observational phenomena are related. Mach, the Austrian philosopher and physicist, headed the second wave, which rooted out the “contradictory” religious elements in Comte’s work, and took advantage of the further progress made in the hard sciences to insist on purging all metaphysics from the scientific method (see Passmore, 1968: 320-321).

The third wave of positivists, whose members were known as the Vienna Circle, included Schlick, Carnap, Gödel, and others, and had Russell, Whitehead and Wittgenstein as interested parties (see Hacking, 1983: 42-44). They developed a programme based on the argument that true science could only be achieved by:

  1. Completely abandoning metaphysical speculation and any form of theology. According to the positivists such speculation only proposed and attempted to solve “pseudo-problems” which lacked any meaning since they were not supported by observable, measurable, experimental data.
  2. Concentrating exclusively on the simple ordering of experimental data according to rules. Scientists should not speak of causes: there is no physical necessity forcing events to happen and all we have in the world are regularities between types of events. There is no room in science for unobservable or theoretical entities.

The programme was a complete fiasco, none of the objectives were realised, and the movement disbanded in the 1930s. “Positivism” in general, and as expounded in the writings of the Vienna Circle in particular, is, in my opinion, a good example of philosophers stubbornly marching up a blind alley. It’s a fundamentally mistaken project, as Popper (1959) demonstrated, and as Wittgenstein (1933) himself recognised. We may note that critics of psycholinguistic theories of SLA who label their opponents “positivists” are either ignorant of the history of positivism or making a straw-man case against what they consider to be a mistakenly “scientific” approach to research. We may also note that empiricism as an epistemological system, if taken to its extreme, leads to a dead end of radical scepticism and solipsism. Therefore, when looking at current discussions among scholars of SLA, it’s of the utmost importance to distinguish between a radical empiricist epistemology on the one hand, and an appeal to empirical evidence on the other.

The start of the psycholinguistic study of SLA

To conclude, we’ll look briefly at how behaviourism was superseded by Chomsky’s UG, thus ending – for a while anyway! – the hold that empiricism had enjoyed over theories of language learning.  

Chomsky’s Syntactic Structures (1957), followed by his review in 1959 of Skinner’s Verbal Behavior (1957), marked the beginning of probably the fastest, biggest, most complete revolution in science that had been seen since the 1930s. Before Chomsky, as indicated above, the field of linguistics was dominated by a Baconian, empiricist methodology, where researchers saw their job almost exclusively as the collection of data. All languages were seen as composed of a set of meaningful sentences, each composed of a set of words, each in turn composed of phonemes and morphemes. Each language also had a grammar which determined the ways in which words could be correctly combined to form sentences, and how the sentences were to be understood and pronounced. The best way to understand the over 2,500 languages said to exist on earth was to collect and sort data about them so that eventually the patterns characterising the grammar of each language would emerge, and that then, interesting differences among different languages, and even groups of languages, would also emerge.

Chomsky’s revolutionary argument (Chomsky, 1957, 1965, 1976) was that all human beings are born with innate knowledge of grammar – a fixed set of mental rules that enables young children to relatively quickly understand the language(s) they’re exposed to and to create and utter sentences they’ve never heard before. Language consists of a set of abstract principles that characterise the core grammars of all natural languages, and learning language is simplified by reliance on an innate mechanism that constrains possible grammar formation. Children don’t have to learn key, universal features of the particular language(s) to which they are exposed because they know them already. The job of the linguist was now to describe this generative, or universal, grammar, as rigorously as possible.

The arguments for Universal Grammar (UG) start with the poverty of the stimulus argument: young children’s knowledge of their first language can’t be explained by appealing to the actual, attested language they are exposed to. On the basis of the input young children get, they produce language which is far more complex and rule-based than could be expected, and which is very similar to that of other adult native speakers of the same language variety, at an age when they have difficulty grasping abstract concepts. That their production is rule-based and not mere imitation, as the behaviourist view held, is shown by the fact that they frequently invent unique, well-formed utterances of their own. That they have an innate capacity to discern well-formed utterances is supported by evidence from tens of thousands of studies (see, for example, Cook & Newson, 1996).

I won’t continue a “defence of UG”. Suffice it to say that Chomsky’s work inspired the development of a psycholinguistic approach which saw L2 learning as a process going on in the mind. Beginning with error analysis and the morpheme studies, this cognitive approach made uneven progress, but Selinker’s (1972) paper, arguing that L2 learners develop their own autonomous mental grammar (interlanguage grammar) with its own internal organising principles, is an important landmark. I’ve done a series of posts on all this, Part 8 of which discusses emergentist theories. As indicated, I’m not happy with Part 8, and in the next post I’ll offer a revised version, where Nick Ellis’ “empiricist” emergentism and William O’Grady’s “mentalist” emergentism will be discussed.


Bacon, F. (1974 [1605]). The Advancement of Learning: New Atlantis. Ed. A. Johnston. Clarendon Press.

Chomsky, N. (1957). Syntactic Structures. Mouton.

Chomsky, N. (1965). Aspects of the Theory of Syntax. MIT Press.

Chomsky, N. (1976). Reflections on Language. Temple Smith.

Cook, V. J. & Newson, M. (1996). Chomsky’s Universal Grammar: An Introduction. Blackwell.

Descartes, R. (1969 [1637]). Discourse on Method. In Philosophical Works of Descartes, Volume 1. Trans. E. Haldane and G. Ross. Cambridge University Press.

Ellis, N. C. (2011). The emergence of language as a complex adaptive system. In J. Simpson (Ed.), Handbook of Applied Linguistics (pp. 666–79). Routledge.

Hacking, I. (1983). Representing and Intervening. Cambridge University Press.

Hume, D. (1988 [1748]). An Enquiry Concerning Human Understanding. Prometheus.

Larsen-Freeman, D. & Long, M. H. (1991). An Introduction to Second Language Acquisition Research. Longman.

Popper, K. R. (1959). The Logic of Scientific Discovery. Hutchinson.

Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics, 10, 209–231.

Skinner, B. F. (1957). Verbal Behavior. Appleton-Century-Crofts.

Toates, F. and Slack, I. (1990). Behaviourism and its consequences. In Roth, I. (Ed.), Introduction to Psychology. Psychology Press.

Wittgenstein, L. (1953). Philosophical Investigations. Trans. G. E. M. Anscombe. Basil Blackwell.

How difficult is reactive feedback?


In my previous post about Dellar’s comments on Dogme, I suggested that there are two big differences between using the Dogme approach and using a coursebook. First, in Dogme lessons there’s no planned explicit teaching of grammar or pronunciation, and second, there’s a lot more classroom time spent on engaging students in spontaneous conversation and oral exchanges. These differences are the result of fundamental differences between two distinct approaches to ELT, which I’ve discussed in lots of posts in this blog. They turn on the use of synthetic and analytic syllabuses.

Coursebooks implement a synthetic syllabus, where the role of speaking activities is to consolidate the explicit teaching of pre-selected language “items” which are the focus of each unit of the coursebook. For example, if the second conditional and tourism are among the items covered in Unit 3, then one of the speaking activities in that unit might be to discuss how you’d react to missing your flight, or losing your passport, or having your credit card declined in a restaurant. The speaking activity is designed to practise the pre-taught “second conditional” and vocabulary about tourism. The speaking activity doesn’t last long, and the language that students are expected to use is as predictable as it is unlikely to occur – “If my credit card was/were declined, I’d pay cash”, for example, is an unlikely student utterance.

In the post, I argued that Dellar’s description of “what teachers need to be doing” while students engage in the tasks that typify a Dogme lesson is based on a misunderstanding of the extended communicative tasks that lie at the heart of Dogme, where the assumption (made by those who use an analytic syllabus) is that most of the learning goes on implicitly while students do the task, and thus, the attention to explicit teaching that he demands is largely unnecessary. However, I added that reactive feedback (what Long calls “focus on form”) done during the task, and a feedback session after the task, were important parts of the Dogme approach.

A Twitter Exchange

The post sparked comments on Twitter about giving reactive feedback which took me rather by surprise. Matt Bury tweeted:   

I think one criticism is valid: It puts more emphasis on dynamic scaffolding (highly demanding) & less on planned scaffolding (minimally demanding) so considering the high teaching workload that most ELTs work under…

He continued:

I mean that dynamic scaffolding (i.e. During the lesson, Dellar’s “big ask”) is more demanding than planned scaffolding (i.e. Coming into the classroom having thought through the scaffolding your students will need to successfully use the language).  Planned scaffolding also enables ELTs to re-use it (systematically recorded & adaptable resources) & plan more strategically across several lessons at a time, thereby increasing the quality of instruction.

And he concluded:

In the long run, planned scaffolding tends to be more efficient, more effective & less work than dynamic scaffolding.

Chris Jones later made a similar point. He tweeted:

Nothing against Dogme but you assume teachers at any level of experience and in any situation can easily tune in to what students say, pick up on errors, decide where to focus and then provide recasts. I don’t think that is easy or actually realistic in many cases.

Asked to clarify “tune in” Jones replied:

Yes, tune in to the language students are using in order to help them with it. Are they not coming to class for that? Even if this way were practical, you know that you need experience to have some idea of how to scaffold and what to recast and what to leave.

Scott Thornbury chipped in:

Fair point Chris. But you don’t get good at it if you don’t try it.

to which Sue Leather responded:

Yes, that’s true. But… though I’m very much a supporter of dogme, as a trainer I have found it a hard sell in certain (cultural) contexts. Perhaps when teachers don’t have full confidence in their own English and/or pedagogic skills….

My initial reaction to these comments was one of surprise. I had assumed that teachers could learn how to give reactive feedback to students while they’re engaged in tasks, and how to conduct a subsequent feedback session quite quickly, without much difficulty. Well, it seems I might have seriously underestimated the difficulties. Let me explain my view.

My view of reactive feedback

In order to use the Dogme approach, teachers need to understand the purpose of reactive feedback (or “focus on form” (Long, 2015, pp. 27-28), as it is widely referred to) and to appreciate that it’s a radical alternative to “focus on forms”. Focus on forms makes the explicit teaching of a pre-selected series of linguistic items the main content of the syllabus. In contrast, Dogme adopts a “focus on form” approach, which involves the brief, reactive use of a variety of pedagogic procedures, ranging from recasts to the provision of simple grammar “rules” and explicit error correction, designed to draw learners’ attention, in context, to target items that are proving problematic. The objective of focus on form is to draw students’ attention to items which they might otherwise neither detect nor notice for a long time, thereby speeding up the learning process. Furthermore, following the research of Nick Ellis (2005, 2006) and colleagues who adopt an emergentist theory of SLA, focus on form can create and store a first impression, or “trace” as Ellis calls it, of the item in long-term memory, thereby increasing the likelihood that it will be detected when examples are encountered during subsequent implicit input processing.

Focus on form is one of the principal ways that an analytic syllabus deals with formal aspects of the L2. An analytic syllabus expects learners to work out how the target language works for themselves, through exposure to input and through using the language to perform communicative tasks. There is no overt or covert linguistic syllabus; more attention is paid to message and pedagogy than to language. The assumption, supported by research findings in SLA, is that, much in the way children learn their L1, adults can best learn an L2 implicitly, through using it. Analytic syllabuses are implemented using spoken and written activities and texts, modified for L2 learners, chosen for their content, interest value, and comprehensibility. Communicative classroom language use predominates, while grammar presentations and drills are seldom employed.

So, the first step for teachers wanting to adopt a Dogme approach is to share the view of most SLA scholars that language learning is predominantly a matter of learning by doing, of implicit learning, and that, therefore, reactive feedback should play an important, but relatively minor, role in L2 learning. Now, I appreciate that most second language teacher education (SLTE) today, particularly pre-service “training” courses like CELTA, fails to give proper attention to how people learn an L2, and that, as a result, most teachers are unaware of the prime importance of implicit learning and of how the efficacy of explicit teaching is determined by the learners’ readiness to learn, i.e., by the current state of their dynamic interlanguage trajectory. Most pre-service teachers are taught how to use coursebooks, and, as a result, they’re encouraged to wrongly assume that explaining bits of the L2 is a necessary prior step to practising them. The solution to this dire problem is obvious – a module devoted to how people learn an L2 should be a necessary part of SLTE.

Once teachers understand the psychological processes involved in L2 learning, and the role of reactive feedback, they need to get the hang of using it. As I said in the previous post, Dellar’s list of what teachers “need to be doing” while students carry out a Dogme task is based on ignorance of the SLA literature and on what he thinks teachers should do during the speaking activities found in coursebooks. In fact, teachers with an understanding of how people learn an L2, who consequently opt to use an analytic syllabus, including Dogme and many TBLT syllabuses, rarely find it difficult to get students speaking, to move from group to group listening in on what they’re saying, or to notice gaps in their language. And they don’t, pace Dellar, have to think about what they’re going to gap on the board, or what questions they’re going to ask about it, or how they’re going “to get the students to use some of that language”.

The Problem

Nevertheless, during the task, teachers have to use a variety of pedagogic procedures in reaction to breakdowns in communication and certain errors, and they have to lead feedback sessions afterwards.

So how difficult do teachers find these pedagogic procedures?

The problem here is that, as Chris Jones said in his tweets, there’s very little empirical evidence from studies of Dogme classes to help us answer that question. He’s quite right about that, but, in disagreement with Jones, I think that teachers implementing many TBLT syllabuses (including the types described by Long, Skehan, N. Ellis, R. Ellis, Robinson, and Dave and Jane Willis – see Ellis & Shintani (2016) for a review) engage in the same kind of reactive feedback and feedback sessions as Dogme teachers, and thus we do have considerable evidence from studies on the effects of these pedagogic interventions in TBLT. Nevertheless, we don’t have much evidence about how difficult teachers feel it is to do this kind of teaching.

I’ve taught English as an L2 for over thirty years, rarely used a coursebook, and I don’t remember ever thinking that giving reactive feedback or leading feedback sessions was any more difficult than other elements of the job. That’s mostly because I taught in contexts where I got a lot of support from bosses (who, in the 1980s and early 1990s, organised and/or paid for courses on using analytic syllabuses) and from colleagues who participated with me in ongoing CPD, where we honed the skills needed to do learner-centred teaching in which explicit (grammar) teaching took a back seat. Those years are long gone. As I said above, ELT is currently dominated by coursebooks, with the result that teachers don’t get the training, practice and support they need to switch to a different type of teaching.

The Answer   

I think the best way for teachers to learn how to incorporate reactive feedback and follow-up feedback sessions into their teaching is to read up on it (I recommend the section on corrective feedback in Ellis & Shintani (2016) Exploring Language Pedagogy through Second Language Acquisition Research as a good starting place), watch experienced teachers, and then get experience doing it, ideally with the support of colleagues. Of course it takes time and practice to get good at making brief interventions during a task, at taking notes of interesting bits of emerging language, and at using these notes to lead a follow-up feedback session. But from what Scott Thornbury tells me about the teachers who’ve done Dogme courses, and from what I know about the teachers who’ve done TBLT courses, they learn fast, they feel it’s worth the effort, and they feel that it helps them to become more effective teachers.   


Scott Thornbury makes a point (particularly when discussing ELT with me, I can assure you!) of emphasizing just how much context affects how we teach. He’s right to do so, but I think he’d agree with me that arguments suggesting that certain contexts are not “suitable” for Dogme or TBLT are mostly bogus. It is simply not true that Dogme or TBLT are not “appropriate” for certain cultural contexts, or for certain government regimes, or for big classes, or for certain types of learners – beginners, young learners, the elderly, etc. Many studies (see, e.g., the meta-analyses of Cobb (2010) and Bryfonski & McKay (2019)) show that TBLT gets excellent results in a very wide variety of contexts, and it seems to me reasonable to argue that Dogme and TBLT can be adapted to any context.

“Non-native” Teachers

Sue Leather’s suggestion that Dogme meets resistance among “teachers who don’t have full confidence in their own English and/or pedagogic skills” is, I think, extremely important. I should preface this discussion by making it clear that I condemn any discrimination against teachers of English as an L2 based on the fact that English is not their L1. There are countless great teachers of English as an L2 whose L1 is not English, and I support ongoing attempts to outlaw the practice, in schools and universities, of demanding that teachers of English as an L2 have English as their L1. I’ll now discuss the problem of the many non-native speaker teachers whose lack of proficiency affects their teaching, summarising part of Chapter 10 of Jordan & Long (2022).

More than 90% of those currently teaching English as a foreign language are non-native English speakers (British Council, 2015). Most non-native English speaker teachers work in their own countries, where the government’s Ministry of Education produces a curriculum and stipulates the entry level qualifications required to work as an English teacher.

In China, for example, the Ministry of Education launched a nationwide BA program in TEFL in 2003, which became the recognized pre-service English teacher education program for those wishing to teach English as a foreign language in primary, secondary and tertiary education in China. Studies by Zhan (2008), Hu (2003, 2005) and Yan (2012) revealed that many of the student teachers had considerable difficulties in expressing themselves clearly and fluently in English, and that their lack of confidence in speaking English contributed significantly to the subsequent ‘mismatch’ between the objectives of the course and the ways student teachers subsequently did their jobs in their local contexts. The course’s promotion of communicative language teaching failed to change the type of teaching the student teachers subsequently carried out: in their classrooms “the tyranny of the prescribed textbook” was still in evidence (Zhan, 2008, p. 62). Studies by Hu (2003) and Yan (2012) support the general view that, despite being told of the value of CLT, and despite stating in their answers to researchers’ questions that they firmly believed in the value of communicative activities, when the teachers’ classes were observed, it became obvious that their lessons were teacher-fronted, and that the vast majority of the time was spent using a coursebook to instill knowledge about English grammar and vocabulary.

Similar results were found in studies carried out in other countries. Regarding the language problem, a 1994 study by Reves & Medgyes asked 216 native speaker and non-native speaker English teachers from 10 countries (Brazil, the former Czechoslovakia, Hungary, Israel, Mexico, Nigeria, Russia, Sweden, the former Yugoslavia, and Zimbabwe) about their experiences as teachers. The overwhelming majority of participants were non-native speakers of English, and in their responses, 84% of the non-native speaker subjects said that they had serious difficulties using English and that their teaching was adversely affected by these difficulties. Difficulties with vocabulary and fluency were most frequently mentioned, followed by pronunciation and listening comprehension. It’s ironic that the problem goes back to failures in their own teachers’ ability to implement a CLT approach.

Cultural factors play their part in how teachers go about their job, of course, but I reject the suggestion that Chinese teachers, for example, are so imbued with a Zen cultural heritage that they find it impossible to abandon centuries-old teaching practices. Appealing to cultural stereotypes in this way is surely offensive, and it ignores the real experiences of Chinese teachers, many of whom sincerely desire change. Likewise, in other parts of the world where a mismatch between pre-service English teaching courses and outcomes has been found, explanations which stress differences in cultures and in teachers’ subjective ‘knowledge bases’ fail to give enough attention to the constraints imposed by objective factors, including the fact that many teachers lack confidence in their English.

Just to complete the picture, we should appreciate that the state-run English teaching courses offered in China and elsewhere are based on interpretations of ideas about CLT which stem not so much from the ideas which emerged in the 1970s, but rather from more recent ideas, promoted by those working for commercial ELT companies who all work to maximize profits, and who are all therefore keen to package ELT into a number of marketable products. The CELTA course is a good example: it is an easily marketable, highly profitable product in itself, and it involves the use of other, related, well-packaged, marketable products, including coursebooks and exams. It is only to be expected that SLTE courses designed by Cambridge English should encourage coursebook-driven ELT, and it is equally predictable that the British Council, with its own chain of English language schools and close ties with Cambridge English, should do the same. Likewise, when overseas ministries of education turn to Cambridge English and other such providers for help in introducing a communicative approach to ELT, it is to be expected that these providers recommend using their own products, coursebooks and tests among them. We could hardly expect them to encourage the implementation of Dogme or TBLT! Furthermore, we cannot expect the Chinese Ministry of Education (or the Turkish, or Vietnamese, or Brazilian, etc., etc., ministries) to appreciate the differences between “real” CLT and what the British Council, Cambridge University Press, and others say it is.


It’s been salutary for me to read the comments made on Twitter by Matt Bury, Chris Jones and others, which highlight the difficulties of pedagogic procedures that I had assumed weren’t particularly challenging. There’s no doubt that we need more research investigating how teachers actually carry out reactive feedback and follow-up feedback sessions, and that more attention should be paid to instructed second language acquisition. We also need more discussion among teachers, and I think these exchanges show that, despite its limitations, Twitter can be very useful in raising important concerns.

I end with Matt’s final comment:

 To be clear, I share Geoff Jordan’s criticisms of how ELT coursebooks are typically designed; I think the gap between Instructed SLA theory vs coursebook instruction couldn’t feasibly be bigger.


British Council (2015). The English Effect Report. Retrieved March 15, 2021 from

Bryfonski, L. & McKay, T. H. (2019). TBLT implementation and evaluation: A meta-analysis. Language Teaching Research, 23(5), 603–632.

Cobb, M. (2010). Meta-analysis of the effectiveness of task-based interaction in form-focused instruction of adult learners in foreign and second language teaching. Doctoral Dissertations, 389.

Ellis, R. & Shintani, N. (2016). Exploring Language Pedagogy through Second Language Acquisition Research. Routledge.

Hu, G. (2003). English language teaching in China: Regional differences and contributing factors. Journal of Multilingual and Multicultural Development, 24(4), 290–318.

Hu, G. (2005). English language education in China: Policies, progress, and problems. Language Policy, 4, 5–24.

Jordan, G. & Long, M. (2022). ELT: Now and How It Could Be. Cambridge Scholars.

Long, M. (2015). SLA and TBLT. Wiley.

Reves, T. & Medgyes, P. (1994). The non-native English speaking EFL/ESL teacher’s self-image: An international survey. System, 22(3), 353–367.

Yan, C. (2012). ‘We can only change in a small way’: A study of secondary English teachers’ implementation of curriculum reform in China. Journal of Educational Change, 13, 431–447.

Zhan, S. (2008). Changes to a Chinese pre-service language teacher education program: Analysis, results and implications. Asia-Pacific Journal of Teacher Education, 36(1), 53–70.

Is Dogme “really bloody difficult”?

I’ve just come across a video by Dmitriy Fedorov called Teaching Unplugged: Scott Thornbury versus Hugh Dellar, in which we see both Thornbury and Dellar talking about Dogme. Fedorov ends by siding with Dellar’s view, which is that teaching unplugged is “a huge ask” and “really bloody difficult”. The Dellar clip used in Fedorov’s video is from a 2017 presentation, Teaching House Presents – Hugh Dellar on Speaking, during which he offers a short rant against Dogme as evidence to support his view that speaking activities need careful preparation, including anticipating what students are likely to say, so as to avoid being caught “on the spot”, unable to offer the required pedagogic support. I’ll argue that Dellar’s “evidence” puts the limitations of his own view of ELT on show, and unfairly dismisses Dogme.

Here’s the clip. To start, click on the arrow, and to stop, click anywhere inside the video frame.

Here’s a transcript:

In Dogme teaching you’re kind of working from what the students say.

Seems a lovely idea, but it’s really bloody difficult to do because what you need to be doing is

  1. getting the students speaking
  2. listening to them all as they’re speaking and wandering around cajoling those who aren’t speaking
  3. noticing gaps in their language
  4. thinking about how to say those things better in a more sophisticated way
  5. getting that language on the board while they’re still talking
  6. thinking about what you’re going to gap on the board, what questions you’re going to ask about it, how you’re going to get the students to use some of that language

And you’re going to have to do all of that on the spot. It’s a huge ask and it’s one of the reasons why Dogme doesn’t exist outside of Scott Thornbury’s head.   

Let’s start at the end. Dellar’s jokey remark that Dogme doesn’t exist outside Scott Thornbury’s head was made five years ago, by which time Dogme was already famous. Today, a Google search on “Dogme and language teaching” gives approximately 327,000 results in half a second. Thousands of articles, blog posts, podcasts, and discussion groups attest to the growing popularity of Dogme among language teachers around the world. Among the first fifty results of the Google search I did, I found these:

  • Nguyen & Bui Phu’s (2020) article “The Dogme Approach: A Radical Perspective in Second Language Teaching in the Post-Methods Era”, which gives an interesting discussion of Dogme,
  • Coşkun’s (2017) article “Dogme: What do teachers and students think?”, which presents the findings of a study carried out at a Turkish school exploring the reactions of EFL teachers and their students to three experimental Dogme ELT lessons prepared for the study. Coşkun’s study includes detailed accounts of the lessons, and it serves to highlight the poverty of Dellar’s description of Dogme as “kind of working from what the students say”.

It’s important to note that, in a 2020 interview, Thornbury said he thought it had been a mistake to make conversation part of the three pillars of Dogme. “What really should be said, is that Dogme is driven not by conversations, but by texts… texts meaning both written and spoken”. Meddings and Thornbury have made clear in a number of publications and interviews that the Dogme approach does not, pace Dellar, involve teachers strolling unprepared into class and asking students what they fancy talking about. It involves planning and extensive use of a wide variety of oral/written/multimodal texts, some created by the students and some provided by the school. It also includes a lot of attention to different kinds of feedback, including attention to vocab., lexical chunks, pronunciation and grammar.

Dellar’s dismissal of Dogme as too bloody difficult stems from viewing it through the distorting lens of his own approach to teaching. He thinks that if teachers don’t have a coursebook to lean on, a coursebook that organises speaking activities around pre-selected bits of the language, provides lead-ins and warm-ups and post-speaking follow-up and consolidation work, then they’ll have to do all this stuff “on the spot” – which is, he thinks, “a huge ask”. In their book “Teaching Lexically”, Dellar & Walkley recommend working with a coursebook – one of the Outcomes series, for example – which provides a syllabus made up of activities designed to teach pre-selected bits of language (“items” as they call them) in a pre-determined sequence. The teacher uses the coursebook to lead students through multiple activities in each lesson, few of them lasting more than 20 minutes and even fewer giving students opportunities to talk to each other at any length. The English language is presented to learners, bit by bit, via various types of grammar and vocab. summary boxes, plus carefully-designed oral and written texts. Activities include studying these language summaries, comprehension checks, fill-the-gap, multiple choice and matching exercises, pattern drills, and carefully-monitored speaking activities. The special thing about Dellar & Walkley’s coursebooks is that they pay particular attention to lexis, collocations and lexical chunks, and the special thing about Dellar is that he’s particularly enthusiastic about explicitly teaching as many lexical chunks as possible. The upshot of this approach is that the majority of classroom time is devoted to explicit teaching, i.e., to the teacher telling the students about the target language.

Dellar treats education as the transmission of information, a traditional view which is challenged by the principles of learner-centred teaching, as argued by educators such as Paulo Freire, and supported in the ELT field by Thornbury, Meddings and other progressive educators. Compare this transmission-of-information view of education (the “banking” view, as Freire called it) with the Dogme approach, where education involves learning by doing. Dynamic interaction between the teacher and the students and the negotiation of meaning are key aspects of language teaching. Students often choose for themselves the topics that they deal with, and they contribute and create their own texts; most classroom time is given over to tasks which involve using the language communicatively and spontaneously; the teacher reacts to linguistic problems as they arise rather than introducing, explaining and practising pre-selected bits of the language.

Dogme teachers reject the view that each lesson should specify in advance what items of the language will be taught, and they reject the view that some explanation of the new items is a necessary first step. Instead, they use a task -> feedback sequence, where working through multi-phased communicative tasks involves pair and group work which takes up at least half of classroom time. The unplanned language that emerges during the interaction among students and teachers as they work through tasks includes errors and communication breakdowns. Teachers use recasts and other types of brief, timely intervention to help students express themselves, and in the subsequent feedback session they provide lengthier, explicit information about the lexis, pronunciation and grammar issues which arose during task performance.

An example of a Dogme lesson is given in Meddings & Thornbury (2009, p. 41)

Slices of life

  1. Teacher draws a pie chart on the board and splits it in three: like, don’t like, don’t mind.
  2. Students ask teacher about their likes / dislikes. Teacher replies and students put things into the three categories depending on the response.
  3. Students then work in pairs repeating the same activity with each other, while the teacher moves around from one pair to the next, helping students with their language.
  4. The whole class comes together and different students’ likes and dislikes are compared.
  5. Teacher gives language feedback.

Also, see Coşkun’s (2017) description of the three lessons involved in her study (the article is free to download).

If we look again at the list of all the things that Dellar thinks a teacher “needs to be doing” when “working from what the students say”, it clearly reflects his belief in the importance of explicit instruction; it indicates that he’s thinking about the sort of speaking activities you find in coursebooks like Outcomes; and it suggests that he has little grasp of what a Dogme approach entails. Why should it be so difficult to deal with students speaking? When the class is together – in Part 2 of the Slices of life task above, for example – students speak one at a time, and the teacher can deal quickly with language problems as they arise, through prompts and recasts, putting new vocabulary and short grammar notes on the board. During pair work – Part 3 of the example – the teacher moves from pair to pair, listening in, giving help with vocabulary and pronunciation, and making some quick comments on errors – through recasts, for example. The teacher takes notes of useful vocabulary and of pronunciation and grammar issues, and these can be written on the board while the students finish up their discussion by going back over the main points. When the whole class comes back together to compare their different likes and dislikes – Part 4 – the teacher reacts to what they say as in Part 2. In the final part of the lesson – Part 5 – the teacher goes through the points that have been highlighted during the session and makes a few final remarks.

There is a fast-growing body of literature contradicting Dellar’s insinuation that a Dogme approach makes unreasonable demands on teachers. The evidence shows that increasing numbers of teachers find the Dogme approach not just more stimulating and enjoyable than using a coursebook, but also less complicated and less stressful. When their students are engaged in a communicative task, Dogme teachers don’t report getting stressed out trying to think of “better”, “more sophisticated” ways of expressing what the students are saying, because they don’t share Dellar’s dedication to explicit teaching. While students work together in groups talking about a problem or topic they’ve been asked to discuss, Dogme teachers don’t wander around the classroom trying to think of the most appropriate language to fill the gaps they’ve noticed, or what gapped sentences they should write on the board, or what questions they should ask, or how they’re going to get the students to use the language they come up with. In other words, while the students are doing a task, Dogme teachers are not doing all the things that Dellar thinks they need to be doing.

The communicative tasks which make up a Dogme lesson don’t have the same aim as the speaking activities found in coursebooks. While the speaking activities in coursebooks are attempts to automate previously taught declarative knowledge, the communicative tasks that provide the backbone of Dogme teaching aim to give rise to unpredictable, spontaneous, emergent language which pushes the students’ developing interlanguage. Using current language resources to carry out these tasks is how most of the learning happens; it’s the key to interlanguage development. It’s learning by doing – learning how to use the language by using texts and participating in authentic communicative exchanges, not by being told about it. This implicit learning leads directly to the procedural knowledge needed for listening comprehension, spontaneous speech, and fluency. So while Dellar’s question “How am I going to get the students to use this language?” is an important one for teachers using coursebooks, it’s a redundant question for Dogme teachers.

Still, we know that certain types of teacher intervention can speed up the rate of interlanguage development, and that it’s not enough to just get students talking about things in class. To do Dogme well, teachers need their bosses’ support: it’s not the teacher’s job to design and provide the curriculum. So they need access to a materials bank which includes a variety of texts to provide rich input, and a variety of tasks suited to the varying needs and current proficiency levels of the students. They also need experience in scaffolding tasks and giving feedback, and that calls for some expert training, ongoing professional development, including collaboration among colleagues, and lots of practice. But no teacher should be dissuaded from putting down the coursebook and trying Dogme just because Dellar thinks it’s all “a huge ask” and too bloody difficult.


Coşkun, A. (2017). Dogme: What do teachers and students think? International Journal of Research Studies in Language Learning, 6(2), 33-44.

Dellar, H., & Walkley, A. (2016). Teaching Lexically. Delta.

Meddings, L., & Thornbury, S. (2009). Teaching Unplugged: Dogme in English Language Teaching. Delta.

Nguyen, N.Q., & Bui Phu, H. (2020). The Dogme Approach: A Radical Perspective in Second Language Teaching in the Post-Methods Era. Journal of Language and Education, 6(3), 173-184.

Thornbury, S. (2020). Interview. Go to the Wikipedia page on Dogme and click on Footnote 8.

English Language Teaching: Now and How it Could Be

I’m very grateful to Paul Walsh, who recently interviewed me about the book. Our publisher has asked me to plug it, so here’s a quick summary.

The most important “rationale” for the book is our belief that to teach languages well, you need to know how people learn them. Not only does current ELT practice largely ignore robust findings from 60 years of SLA research, but it also relies on the systematic and deliberate misrepresentation of these findings, in order to defend the inefficacious teaching practices required by the use of General English coursebooks.

So Section One of the book consists of six chapters which offer an up-to-date, accessible discussion of recent developments in knowledge about second and foreign language learning.

  • We describe Interlanguage development and the pathways that learners follow, and offer an explanation for such trajectories.
  • We then discuss questions relating to the rate of L2 development, the vexed issue of ultimate attainment, and the latest research findings on the long-term effects of various types of instruction.
  • In Chapter 5, we offer an overview of the cognitive processes and products involved in SLA, paying particular attention to the roles of implicit and explicit learning and knowledge.
  • Finally, in Chapter 6, we look at the implications of research findings in SLA for instruction. Most importantly, we stress the need to prioritize incidental and implicit learning through the use of an analytic, rather than a synthetic, syllabus, thus challenging the twin foundations of explicit teaching and synthetic syllabuses on which coursebook-driven ELT rests.

Section 2 takes a detailed look at how adults are taught EFL and ESL and at how we got to the lamentable situation we find ourselves in today. Beginning in the early 1960s with Situational Language Teaching, we trace the development of Communicative Language Teaching (CLT) and describe how the bright sparks of CLT were effectively snuffed out by the emergence of the modern General English coursebook. A critique of the domination of coursebook-based ELT is then offered, which leads to Chapter 8: “How English could be taught much better: TBLT”. Chapter 9 examines immersion approaches to ELT, particularly Content and Language Integrated Learning (CLIL) and English as the Medium of Instruction. Three “pre-CLIL” empirical studies and three “post-CLIL” empirical studies are given detailed attention, leading to a discussion of three important new research questions.

Chapter 10 deals with how teachers are trained and evaluated today, and how it could be done better. This is a particularly important area, in our opinion, bringing together many of the criticisms I’ve made in this blog of the way in which Second Language Teacher Education (SLTE) is organised and carried out.

Section 3 is dedicated to the way that English language learning is evaluated. A historical overview of how language testing has changed leads into a discussion of “The dark side of language testing”, where the Cambridge Assessment Group, the IELTS tests, the Educational Testing Service and the washback effects of high-stakes English proficiency tests are examined and critiqued. The section ends with suggestions on how assessment could be done much better.

Finally, Section 4 deals with political and socioeconomic issues. Here’s a bit from Part 2 of my interview with Paul Walsh.

The ELT industry has an annual turnover of close to US $200 billion. Apart from the huge profits made by publishers, examination boards, teacher training outfits and public and private educational institutions, there is the enormous “soft power” exerted by nation states through language policies.

To paraphrase Chapter 12 of the book, most language teaching involves the language of powerful nations being taught to speakers of less powerful ones. English has been the principal language of the two most economically dominant nation states of the past 300 years – first the UK, and then, for the past 150 years, the USA – and, not coincidentally, of the most powerful armies required to procure and maintain that colonial and economic dominance.

As a result of this history of savage imperial conquest, there are now roughly 400 million native speakers of English in the world, and over four times that number, 1.75 billion, for whom English is a second or auxiliary language. Already huge, the second group is growing fast, with more than two billion speakers projected by 2025. The ability to determine which shall be a country’s national language, or in the case of many multilingual societies, its lingua franca, is a vital source of power for nation states and for elites within them.

This is the single biggest reason why ELT is so important. When one country invades or annexes another, it is common for command of a particular, standardized form of the invader’s language to be required, officially or unofficially, of any members of the subjugated population seeking access to political power, employment, and key social services, especially education, or even for immigrant visas or citizenship. The newly imposed standardized form of the language sometimes not only displaces indigenous languages, but drives them, and often their speakers, close to extinction, as happened, for example, with Hawaiian and many Native American and Australian Aboriginal languages.

The book concludes with Chapter 13: “Signs of struggle: Towards an alternative organization of ELT”.

Despite Mike Long’s clout, it was difficult to find a publisher for this book. We’re grateful to Cambridge Scholars for taking it on, and I hope it will contribute to the fight for a better future for ELT. The fight is a practical one. It involves organisation which begins at the local level and slowly builds an international network uniting progressive workers in the ELT industry who share a common political viewpoint.

Here in Catalonia, I look forward to the next meeting of the SLB Cooperative on July 1st, where I’ll give a 10-minute presentation of the book and join in the always convivial, sparky conversations that ensue. See the SLB Twitter feed for more information and the chance to win a copy of the book.

A Summary

This blog is mostly about the failure of teacher educators (TEs) in ELT to do their job well.

I here summarise three findings from psycholinguistic SLA research which have implications for ELT, and then review how some of today’s leading teacher educators have failed to deal with these findings.

Part 1: Implications for ELT of SLA research

1. Interlanguages

By the mid-1980s, research had made it clear that learning an L2 is a process whereby learners slowly develop their own autonomous mental grammar with its own internal organising principles. Today, after hundreds more studies, it is well established that acquisition of grammatical structures, and also of pronunciation features and many lexical features such as pre-fabricated lexical chunks, collocation and colligation, is typically gradual, incremental and slow. Development of the L2 exhibits plateaus, occasional movement away from, not toward, the L2, and U-shaped or zigzag trajectories rather than smooth, linear contours. No matter what the order or manner in which target-language structures are presented to them by teachers, learners analyze the input and come up with their own interim grammars, the product broadly conforming to developmental sequences observed in naturalistic settings. The acquisition sequences displayed in IL development have been shown to be impervious to explicit instruction, and the conclusion is that students don’t learn when and how a teacher decrees that they should, but only when they are developmentally ready to do so.

2. The roles of explicit and implicit knowledge and learning

Two types of knowledge are said to be involved in SLA, and the main difference between them is conscious awareness. Explicit L2 knowledge is knowledge which learners are aware of and which they can retrieve consciously from memory. It’s knowledge about language. In contrast, implicit L2 knowledge is knowledge of how to use language and it’s unconscious – learners don’t know that they know it, and they usually can’t verbalize it. (Note: the terms Declarative and Procedural knowledge are often used. While there are subtle differences, here I take them to mean the same as explicit and implicit knowledge of the L2.)

In terms of cognitive processing, learners need to use attentional resources to retrieve explicit knowledge from memory, which makes using explicit knowledge effortful and slow: the time taken to access explicit knowledge is such that it doesn’t allow for quick and uninterrupted language production. In contrast, learners can access implicit knowledge quickly and unconsciously, allowing it to be used for unplanned language production.

Three Interface Hypotheses

While it’s now generally accepted that declarative and procedural knowledge are learned in different ways, stored separately and retrieved differently, disagreement among SLA scholars continues about this question: Can the explicit knowledge students get from classroom instruction be converted, through practice, into implicit knowledge? Those who hold the “No Interface” position answer “No”. Others take the “Weak Interface” position which argues that there is a relationship between the two types of knowledge and that they work together during L2 production. Still others take the “Strong Interface” position, based on the assumption that explicit knowledge can and does become implicit, and that explicit explanation of the L2 should generally precede practice. In this view, procedural knowledge can be the result of declarative knowledge becoming automatic through practice.

The main theoretical support for the No Interface position is Krashen’s Monitor theory, which has few adherents these days, despite the reappraisal discussed in my previous post. The Strong Interface case gets its theoretical expression from Skill Acquisition Theory, which describes the process of declarative knowledge becoming proceduralised and is most notably championed by DeKeyser. This general learning theory clashes with evidence from L1 acquisition and with the interlanguage findings discussed above. The Weak Interface position is adopted by most SLA scholars, including those who support the emergentist theory of SLA championed by Nick Ellis. Ellis argues that adult learners of English as an L2 are affected by their L1 in such a way that they don’t implicitly learn certain features of the L2 which clash with their L1 (see the section on maturational constraints below). Consequently, in this view, explicit instruction of a certain sort can draw attention to these features and thereby “re-set the dial”, allowing the usual implicit learning of further instances of these features to reinforce procedural knowledge.

Whatever their differences, there is today a consensus among scholars that implicit learning is the “default” mechanism of SLA. Whong, Gil & Marsden (2014) conclude that implicit knowledge is in fact ‘better’ than explicit knowledge: it is automatic and fast – the basic components of fluency – and more lasting, because it’s the result of the deeper entrenchment which comes from repeated activation. Doughty (2003) concludes: ‘In sum, the findings of a pervasive implicit mode of learning, and the limited role of explicit learning …, point to a default mode for SLA that is fundamentally implicit, and to the need to avoid declarative knowledge when designing L2 pedagogical procedures.’

Neither Whong et al. nor Doughty challenges the important role that explicit knowledge plays in SLA. What they firmly reject, however, as do most of their colleagues, is the view that declarative knowledge is a necessary first step in the SLA process.

3. Maturational constraints on adult SLA

The limited ability of adults to learn a second language implicitly, as children do, brings us to “Critical Period” research. Long (2007), in an extensive review of the literature, concludes that there are indeed critical periods for SLA, or “sensitive periods”, as they’re now called. For most L2 learners, the sensitive period for native-like phonology closes between the ages of 4 and 6; for the lexicon (particularly lexical chunks, collocation and colligation) between 6 and 10; and for morphology and syntax by the mid-teens. While this remains a controversial area, there’s general consensus that adults are partially “disabled” language learners who can’t learn in the same way children do. And that’s where explicit learning comes in. As suggested above, the right kind of explicit teaching can help adult students learn bits of the language that they are unlikely to learn implicitly. Long calls these “fragile” features of the L2 – features of low perceptual saliency, because they’re infrequent, irregular, semantically empty or communicatively redundant, or involve complex form-meaning mappings – and he says these are likely to be learned late, or never, without explicit instruction.


From all this research, a picture of ELT emerges where teachers help students to develop their interlanguages by giving them scaffolded opportunities to use the L2 in communicative activities where the focus is on meaning. The Dogme approach does this, as do some types of immersion and CLIL courses, and strong versions of Task-based Language Teaching. During the performance of tasks, modified, enhanced, multi-modal written and spoken texts provide the rich input required, and teachers help students with aspects of the language that they’re having problems with through brief switches to what Long calls “focus on form” – reactive attention to formal aspects of the L2 that the students indicate, through their production, are impeding their progress. In most forms of TBLT, tasks are divided into three stages: pre-task -> task -> post-task, and as a general rule, we can say that the more priority explicit instruction is given, the weaker the version of TBLT.

From the research discussed, it follows that a relatively inefficacious way of organising ELT courses is to use a General English coursebook. Here, the English language is broken down into constituent parts or “items” which are then contextualized, explained, and practiced sequentially, following the scale laid down in the Common European Framework of Reference for Languages (CEFR), which is based not on empirical research into interlanguage development, but rather on teachers’ intuitive ideas of an easy-to-difficult progression in L2 learning. The teacher’s main concern is with explaining and practicing bits of grammar, pronunciation and lexis: having students read and listen to short texts, study boxes which summarise grammar points, and do follow-up exercises; talking about bits of the language; giving summaries; engaging in IRF (Initiation-Response-Feedback) exchanges with students; and then monitoring students’ activities which are supposed to practice what has been taught. Typically, in such courses, teachers talk for 70%+ of the time, and students’ speaking turns last for less than a minute.

For example, if we look at Unit 2 from Outcomes Intermediate, we see this:

  1. Vocab. (feelings) →
  2. Grammar (be, feel, look, seem, sound + adj.) →
  3. Listening (How do they feel?) →
  4. Developing Conversations (Response expressions) →
  5. Speaking (Talking about problems) →
  6. Pronunciation (Rising & falling stress) →
  7. Conversation Practice (Good / bad news) →
  8. Speaking (Physical greetings) →
  9. Reading (The man who hugged) →
  10. Vocabulary (Adj. Collocations) →
  11. Grammar (ing and ed adjs.) →
  12. Speaking (based on reading text) →
  13. Grammar (Present tenses) →
  14. Listening (Shopping) →
  15. Grammar (Present cont.) →
  16. Developing conversations (Excuses) →
  17. Speaking (Ideas of heaven and hell).

(Note that “Developing Conversations” are not oral activities.) Given that teachers must cover this unit in approx. 10 hours, and given the amount of work students are expected to do studying the language, how much opportunity will students get to use the language for themselves in spontaneous communicative exchanges which push their interlanguage development? Like most General English coursebooks, Outcomes Intermediate focuses on explicit teaching, based on the false assumption that students will learn what they’re taught in this way. The most usual defense of coursebooks (apart from their convenience) is that they are “adapted” in a myriad of ingenious ways by teachers. Mishan (2021) cites Bolster’s (2014, 2015) study of teachers using an English for academic purposes coursebook, which showed a spread of ‘25% to 100% of changes made to the published material, with an average percentage of adaptation of 64.5’ (p. 20). If only a third or so of a coursebook’s content is used as published, one wonders just how convenient coursebooks really are! Teachers are to be congratulated for the way they ameliorate the deficiencies of coursebooks, but it remains the case that they are forced to follow the synthetic syllabus laid down in the coursebook they use, which means they are making impossible demands of students and spending far too much time on explicit teaching.

In brief, research suggests that L2 learning is mostly a process of the unconscious development of interlanguages, best helped by giving students opportunities to use the language in such a way that they work out for themselves how the L2 works. Teachers can best support this development by following a syllabus which supplies rich input and interesting, relevant tasks, and which counts on timely feedback and support from the teacher. The implication is that, when it comes to ELT, using an analytic syllabus will be more efficacious than using the synthetic syllabuses implemented in General English coursebooks. (For more on synthetic versus analytic syllabus types, see the two posts ‘Synthetic and Analytic Syllabuses 1’ and ‘Synthetic and Analytic Syllabuses Part 2’. See also the post ‘Why Teach Grammar’.)

Compare these two views. Carroll (1966: 96) articulated the “old” view:

Once the student has a proper degree of cognitive control over the structure of a language, facility will develop automatically with the use of the language in meaningful situations.

Hatch (1978: 404) was one of the first scholars to articulate the current view:

Language learning evolves out of learning how to carry on conversations. One learns how to do conversation, one learns how to interact verbally, and out of this interaction syntactic structures are developed.

Hatch’s work in SLA research was influential in promoting the communicative language teaching approach, an exciting new flame which burned brightly for a few years in the 1980s, only to be snuffed out by the arrival of modern coursebooks in the early 1990s. 

Part 2: The Contribution of Teacher Educators  

Teacher educators teach the teachers: they’re the purveyors of today’s lamentable Second Language Teacher Education (SLTE) programmes. They give “pre-service courses” for those wanting to start a teaching career, and “in-service courses” for those already teaching. Given the British Council’s (2015) conservative estimate of 12 million teachers working in ELT, training them is obviously a multi-billion dollar industry. The most popular pre-service courses in many parts of the world are CELTA and Trinity College’s Cert TESOL. Both concentrate on the practical job of preparing teachers, as best they can in the limited time available, to implement the synthetic syllabuses used in the vast majority of schools and institutions offering courses of English as an L2. While the teacher educators who run these courses are not obliged to recommend using a coursebook, in practice most of them do, and they use coursebooks for the teaching practice modules. Neither course gives any serious attention to how people learn an L2 – the SLA research findings outlined above are largely ignored – and both are based on the unquestioned, but demonstrably false, assumption that explicit teaching of the formal elements of the L2 is the key to efficacious teaching.

In the USA, China and other countries, teachers need a university degree, followed by a postgraduate pre-service course. Some do a Masters in TEFL or TESOL, while others do a postgraduate Certificate or Diploma. In these programmes, more attention is paid to second language learning, but there is enormous variety among them, making it difficult to generalise. Certainly in the USA, the pre-service courses seem more likely to have a positive effect on teaching practice than CELTA or the Cert TESOL. In China and other countries training non-native speaker (NNS) teachers, it seems that once they start their jobs, teachers often ignore what they were told about the importance of implicit learning and the value of a communicative language teaching approach. Two explanations suggest themselves. First, there is a strong tendency among teachers to teach the way that they themselves were taught. Second, most NNS teachers admit to having difficulties expressing themselves accurately and fluently in English. Ironically, their difficulties mainly spring from the way they were taught, but still, the combination of bias and insecurity pushes teachers to adopt a “teacher tells the class about the language” approach, where most of the time is dedicated to using a coursebook to instill declarative knowledge about English grammar, pronunciation and vocabulary.

As to in-service training, often referred to as Continuous Professional Development (CPD), there are literally thousands of private commercial concerns offering courses in every aspect of ELT, making this another multi-billion dollar part of the ELT industry.

Who are the teacher educators (TEs)? Right at the top, we have figures such as David Nunan and Jack Richards, both successful academics who have, over the past 40+ years, given hundreds of university courses and hundreds of plenaries at international conferences. They have worked as consultants for national governments, written more than 20 books each covering various aspects of learning and teaching an L2, and both have also written more than a dozen series of General English coursebooks, some aimed specifically at the huge, expanding Chinese market. Both are multi-millionaires. Richards was always conservative, while Nunan only slowly grew to be so. In the 1980s at least, Nunan was an articulate, innovative scholar, as can be appreciated in some of the articles in his Learner-Centered English Language Education collection. Nunan also supervised the PhD dissertations of many who went on to make innovative contributions to theories of SLA and to ELT practices. I attended courses given by Nunan which pushed me towards a TBLT approach, and I was impressed with his scholarship and his unfailing willingness to shoot the breeze with his students.

Whatever their academic records, both Richards and Nunan made significant contributions to the new generation of coursebooks in the late 1990s, when publishers responded to new conditions with a multi-million-pound revamp that ushered in the ‘global coursebook’ – a new, glossier, multi-component package aimed at the global market, but often carefully tweaked for more local teaching contexts. This “advance” effectively put an end to any version of CLT worth the name. To my knowledge, in the last 20 years neither Richards nor Nunan has given any courses on, or made any serious attempt to promote an interest in, the mounting evidence from SLA research which I outlined above. As a result, I think they are partly responsible for the reactionary, commercially-driven character of current SLTE.

The majority of today’s most successful TEs are, like Richards and Nunan, coursebook writers. Alas, they have to content themselves with six-figure incomes – the money coming from their coursebook series is no longer enough to make them millionaires. This is thanks to publishers’ new business plans, where an overseeing editor designs the coursebook series and its components, and then farms out the work to the “lucky winners” chosen to do the real work for scraps. In exactly the same way as most employers in our neoliberal world treat their employees, the editor commissions “independent collaborators” to write various bits of the coursebooks, following strict editorial guidelines, for a set fee – and that’s all they see of the pie. While Richards and Nunan, like Mr. and Mrs. Soars of Headway fame, have already banked millions of dollars from sales of their coursebooks, and the royalties still roll in, more recent TE coursebook writers are less fortunate. They make a small fraction of the money that used to be made from a best-selling coursebook series, and they have to fight in a much more competitive market than their predecessors when trying to boost their incomes by writing supplementary materials and “How to Teach” books. Like their predecessors, they earn further fees from a wide range of CPD offerings, from conference plenaries to presentations, workshops and short courses on “How to improve your teaching” all over the planet, sold to the highest bidder.

Two examples of today’s TEs are Jeremy Harmer and Hugh Dellar. Harmer has published more than 30 books on ELT, and made a few rather unsuccessful attempts to get into the coursebook market. He is often referred to as “El Maestro”. His seminal work, The Practice of English Language Teaching, now in its 5th edition, has sold millions of copies and is required reading for teachers doing not just CELTA but also DELTA courses. It is also listed in the bibliographies of most Masters courses in TESOL / TEFL offered by universities around the world. The book is 550 pages long, yet just one small chapter is devoted to language learning – a chapter devoted to classroom seating arrangements is longer! The chapter on language learning misrepresents the work of most of the leading scholars of SLA, including Krashen, Pienemann, Gass, Long and N. Ellis. Suffice it to say, in summary, that Harmer has done very little to inform teachers about the matters discussed in Part 1 of this essay.

Dellar is the co-author of the Outcomes and Innovations series of coursebooks, and also of one of the books in the Roadmap series. His Teaching Lexically book, co-written with Walkley, offers by far the worst summary of how people learn an L2 that I’ve ever read. I’ve written a post reviewing the book, so let me just say here that the “explanation” of L2 learning it gives is ridiculous. It paves the way for an approach to ELT that is remarkable for its emphasis on teaching students about the language. No other teacher educator today insists as much as Dellar does on the importance of explicit learning.

Let’s look briefly at a few more prominent teacher educators.

Gianfranco Conti

Conti’s “MARS-EARS” framework for L2 teaching attempts to justify an “explain first and practice later” approach to teaching L2s. Conti has built himself into a brand: he spends enormous effort promoting that brand, and he tours the world promoting himself and his method. Conti and Smith are co-authors of a book on memory which badly misrepresents research findings and blatantly promotes the “MARS-EARS” framework. See my posts on the book (Memory and Teaching) and on Conti’s approach (Genius) for a fuller discussion.

Jason Anderson

I’ve discussed Anderson’s work in a few posts – put “Anderson” in the Search box on the right. Common to all of it is a defense of coursebook-driven ELT. Anderson cherry-picks SLA research findings, and his work shows little depth or critical acumen.

Rachel Roberts

Roberts is a quintessential example of a TE. She now concentrates on wellness training, but she has a history of teaching teachers that spans decades. In all that long career – described by the British Council as “illustrious” – she has never attached the slightest importance to SLA research. Examine her work and you’ll search in vain for any serious attention to how people learn an L2.

Tyson Seburn

Seburn was until recently the coordinator of the IATEFL Teacher Development Special Interest Group. In my opinion, he’s a good example of all that’s wrong with the way teacher educators see their jobs. Like Roberts, Seburn has never shown any interest in SLA research or in its implications for ELT. For Seburn, teacher development is primarily about identity, about “how I came to be who I am”, “how to be the best person I can be”, and all that stuff.

Scott Thornbury

Here’s the exception, the star who shines in the dull, lackluster TE firmament. I’ve done a few posts criticising Thornbury’s work – his Natural Grammar, his ill-informed criticisms of Chomsky, his wild attempts to describe and promote emergentist theories – but he remains a splendid beacon, shining through the fog, demanding change. He knows his stuff (mostly!) about SLA, and he’s a brilliant speaker, the best performer on the big conference stage since the wonderful John Fanselow. Thornbury has never published a coursebook series; indeed, he’s a leading critic of them, the one who coined the term “Grammar McNuggets”, which so acutely captures the way that coursebooks chop up, sanitise and process the life out of the English language.

Thornbury, along with his co-author Meddings, is the man behind Dogme, an approach to ELT that rejects the use of synthetic syllabuses and gives full recognition to the research findings outlined in Part 1 above. Thornbury gets it: he understands what research tells us about how people learn an L2, he recognizes that we learn by doing, and he strives to implement a radical alternative to ELT. He sponsors the Hands Up project, he goes wherever he’s invited to talk to teachers about Dogme, and he somehow makes few enemies among those who work so hard to maintain the status quo. He’s my hero, as he is for tens of thousands of forward-looking teachers.

The Three Neils

Neil McMillan is the founder of the SLB cooperative. There’s an important political dimension to his work – involvement in the local community, social change, teachers’ rights – but when it comes to teaching, he walks the talk, implementing a strong version of TBLT. I’m proud to have worked with him on courses for teachers interested in TBLT, and I’m looking forward to further projects with him, where we further explore how ELT can respond to local needs.

Neil Anderson and Neil McCutcheon are the co-authors of Activities for Task-based Learning. They start from a well-considered appreciation of SLA findings. They show a sensitive appreciation for the contexts in which teachers have to work, and they propose a variety of practical ways in which teachers can move towards a new, better way of doing their jobs. They’re pragmatists, realists you could say, but there’s an undeniably progressive tone to their work, and I’m sure that we’ll hear more from them soon. They’re inspiring, they give me hope.


ELT is a huge, multi-billion dollar industry. It’s not surprising that commercial interests shape the way it’s done. But it’s inefficacious: most students of English as an L2 fail to achieve communicative competence. To be clear: most students of English as an L2 leave the courses they’ve done without the ability to use English well enough to cope easily with the demands they meet when they try to use English in the real world. They’ve been cheated. They’ve been led through a succession of courses where they’ve been taught about the language and denied sufficient opportunities to use the language in ways that help them develop communicative competence.

Leading teacher trainers have a vested interest in protecting the inefficacious model of coursebook-driven ELT – they write coursebooks, after all. The way towards a more efficacious model of ELT depends on dismantling the current established paradigm, which is based on the CEFR scale of language proficiency. Learning English as a second language has very little to do with the imagined progression from A1 to C2 enshrined in the CEFR, and thus very little to do with the coursebook series which adopt the same daft idea of L2 learning.

ELT must change. It must recognize that learning English as an L2 is mostly done by using it, not by being told about it. Teacher trainers today, with a few exceptions, stand in the way of change.


Anderson, N. & McCutcheon, N. (2021). Activities for Task-Based Learning. Delta.

Carroll, J. B. (1966). The contribution of psychological theory and educational research to the teaching of foreign languages. In A. Valdman (Ed.), Trends in Language Teaching. McGraw-Hill, 93–106.

Dellar, H. & Walkley, A. (2017). Teaching Lexically: Principles and Practice. Delta.

Doughty, C. J. (2003). Instructed SLA: Constraints, compensation, and enhancement. In C. J. Doughty & M. H. Long (Eds.), The Handbook of Second Language Acquisition. Blackwell, 256–310.

Harmer, J. (2015). The Practice of English Language Teaching (5th ed.). Pearson.

Hatch, E. (Ed.). (1978). Second Language Acquisition: A Book of Readings. Newbury House.

Long, M. (2007). Problems in SLA. Lawrence Erlbaum.

Meddings, L. & Thornbury, S. (2009). Teaching Unplugged. Delta.

Mishan, F. (2021). The global ELT coursebook: A case of Cinderella’s slipper? Language Teaching, 1–16.

Whong, M., Gil, K. H., & Marsden, H. (2014). Beyond paradigm: The ‘what’ and the ‘how’ of classroom research. Second Language Research, 30(4), 551–568.

A Review of Part 3 of “After Whiteness”

Part 3 of After Whiteness by Gerald, Ramjattan & Stillar is now free to view on the Language Magazine website. It’s as bullshit-rich and content-poor as the first two parts: another mightily righteous mini-sermon which has the authors standing on the same flimsy pedestal (a rickety construction of parroted bits of Marxism, punitive moral dictums on racism and straw-man arguments) in order to preach to the choir. It’s about as edifying as a Chick tract. I’ll give a summary of the article and then comment.

The article has 5 sections.


Part 1 looked at pedagogical ways of challenging Whiteness. Part 2 “re-imagined” training and labor in English language teaching. Part 3 will look at ideas for “how the broader ELT industry could evolve” if “Whiteness” were “successfully decentered”.

Action Research as a Goal

A post-Whiteness ELT, we’re told, should be part of a post-Whiteness world in which ELT practitioners “strive for some micro- or meso-level changes in their contexts to combat Whiteness“. The only information offered about these new world “micro- or meso-level changes” is about pronunciation teaching, where in the post-Whiteness world, teachers would pay attention to “how their students’ racialization in society can shape external perceptions of their intelligibility and how these perceptions have material consequences”. One “material consequence” is alluded to:   

white “foreign-accented” job applicants are typically perceived as more intelligible/employable than their racialized counterparts, thereby suggesting that there are racial hierarchies when it comes to assessments of employability in relation to speech accent (Hosoda and Stone-Romero, 2010).  

To challenge these inequalities, “teachers need to use their pedagogy”.  For example,

teachers and students could engage in some sort of action research where they interrogate and challenge local employers’ aversion to hiring racialized “foreign-accented” applicants, which has the potential to substantively shift hiring policies in students’ communities.

“some sort of action research”? Really?

The Un-Canon of Lived Experience

This section suggests that “the canon” of ideas about English should be “removed, but not replaced”. This involves using “extensive student-generated input” to “dismantle linguistic and racialized hierarchies within the conceptualization of English”. Students can be asked to note how their neighbors and relatives use English and share the data with classmates as part of an “epistemological shift”, aimed at overcoming the idea of the “ownership of English”. Widdowson (1994) (sic) is cited to support the claim that standardized English is the property of White native speakers from the global North who shape the language as they see fit. Such a “White supremacist, capitalist notion of language” must be replaced by the view that English belongs to nobody: it is a community resource. This illuminating example is offered:

… when we see the word prepone, a word in so-called Indian English meaning to move an event ahead of schedule (Widdowson, 1994), it is important to remember that this is not a “made-up” word but rather a concise and useful antonym for postpone. If you were teaching students who needed to interact with Indian English users, why would you not want to teach such an innovative word?

That last sentence is the funniest example of a rhetorical question I’ve seen for quite some time!

Teaching the Perceiving Subjects

The section begins by redressing the deficit which results from “idealizing Whiteness and the ideologies that descend from it” (sic) through the imposition of standardized English. While teaching students different Englishes might help redress this deficit, the authors want to go much further. Why not, they ask, treat “minoritized varieties” as “the ideal”? They don’t explain what this radical proposal entails. What “minoritized varieties” would be included? How would these varieties form “the ideal”? What would it look like and sound like?

Moving quickly on, the authors ask the further question:

 “How might the White perceiving subject (Flores and Rosa, 2015) be taught to perceive more effectively?”

Again, they don’t explain what they’re talking about. What does the “White perceiving subject” refer to? Perhaps they assume that all readers are familiar with the Flores and Rosa (2015) article, or perhaps they’ve seen Flores’ helpful tweet:

The white listening subject is an ideological position that can be inhabited by any institutional actor regardless of their racial identity. It behooves all of us to be vigilant about how hegemonic modes of perception shape our interpretation of racialized language practices.

or perhaps they’ve read Rosa’s (2017) follow-up, where he explains that

the linguistic interpretations of white listening subjects are part of a broader, racialized semiotics of white perceiving subjects.

Anyway, let’s take it they mean that white subjects (whoever they are – noting that they’re not restricted to people with “white” features) should try to empathize with “racialized” people. Returning to the teaching of pronunciation, the authors suggest that teachers should be given time to practice listening to different Englishes so that they gain a certain level of experience with the population they might want to work with.  And this somehow demonstrates that until ELT practitioners are “freed from the monolingual cage they’re in”, so long as “raciolinguistic ideologies” are in place, the “racialized languagers” will always fail.


The authors admit that what they sketch out in their three articles is “something of a dream”, but they believe it can become reality “if we take the leap to a world that doesn’t yet exist”. Their ideas are born of love, not hatred. Their goal is to replace the “harmful, oppressive and, at heart, ineffective” practices which keep “racialized learners and languagers in their place below the dominant group”.

What does this view of how the broader ELT industry could evolve if “Whiteness” were successfully “decentered” amount to?   

The first section, on Action Research, doesn’t make any sense. In a post-Whiteness world where Whiteness has been swept away, surely there’s no longer any need for teachers to “strive for some micro- or meso-level changes in their contexts to combat Whiteness”, or to fight against job adverts that discriminate against NNSs. Apart from this incongruity, the only content in this section is the lame, undeveloped suggestion that teachers and students engage in “some sort of action research” aimed at challenging employers’ prejudice against “foreign-accented” applicants.

The “Un-Canon of Lived Experience” section proposes that the English language belongs to nobody: it’s a community resource. Apart from the bizarre example of promoting the use of the word “prepone”, this is little more than a motherhood statement until it’s properly developed. The authors assert that before we get to the hallowed post-Whiteness society, we must sweep away “Whiteness ideologies” which adopt a “White supremacist, capitalist notion of language”, and yet nowhere in their three-part series do they make any attempt to unpack the constructs of “ideology” or “capitalism” so as to explain what they mean when they say that language is a capitalist notion. Even less do they show any understanding of Marxism, or any other radical literature which makes coherent proposals for how capitalism can be overthrown and how that might lead to radical changes in education.

The “Teaching the Perceiving Subjects” section proposes that ELT should replace the teaching of standardized English with teaching in which “minoritized varieties” are used as “the ideal”. I’ve already suggested above that this is an empty proposal. Until the vague idea of making “minoritized varieties” form “the ideal” for English is properly outlined and incorporated into some minimum suggestions for new syllabuses, materials, pedagogic procedures and assessment tools, it’s no more than hand-waving rhetoric, typical of the lazy, faux-academic posturing which pervades the After Whiteness articles.

The re-education programme for “White perceiving subjects” doesn’t explain who the “subjects” are, and it doesn’t explain how they are to be re-educated; it sounds a bit scary to me, a bit too close to the views of Stalin, Mao and others determined to stamp out “wrong thinking”. Still, as usual, we’re not told what’s involved, except for the perfectly reasonable suggestion that teachers should be more aware of, and sympathetic to, different Englishes.

The three-part After Whiteness series fails to present a coherent, evidence-supported argument. Students of instructed SLA will find absolutely nothing of interest here, unless they want to deconstruct the text so as to reveal the awful extent of its empty noise. Likewise, radical teachers looking for ways to challenge the commodification of education, fight racial discrimination, and move beyond the reactionary views of English and the offensive stereotyping which permeate ELT materials and practices, will find nothing of practical use here. They should look, instead, to the increasing number of radical ELT groups and blogs that offer much better-informed political analyses and far more helpful practical support. In stark contrast to Gerald, Ramjattan & Stillar, such groups and individuals not only produce clear, coherent and cohesive texts, they also DO things – practical things that make a difference and push change in the ELT industry. The SLB Cooperative; ELT Advocacy, Ireland; the Gorillas Workers Collective; the Hands Up Project; the Part & Parcel project; the Teachers as Workers group; the online blogs, social media engagement and published work of Steve Brown, Neil McMillan, Rose Bard, Jessica MacKay, Ljiljana Havran, Paul Walsh, Scott Thornbury, Cathy Doughty, David Block and Pau Bori – these are just a few counterexamples that highlight the feebleness of the dire, unedifying dross dished up in the After Whiteness articles.

Re-visiting Krashen

The first 2021 issue of Foreign Language Annals has a special section devoted to a discussion of “Krashen forty years later”. The lead article, by Lichtman and VanPatten, asks “Was Krashen right?” and concludes that yes, mostly he was.

Lichtman and VanPatten look at 3 issues:

  • The Acquisition-Learning Distinction,
  • The Natural Order Hypothesis,
  • The Input Hypothesis.

And they argue that “these ideas persist today as the following constructs:

  • implicit versus explicit learning,
  • ordered development,
  • a central role for communicatively embedded input in all theories of second language acquisition”.

The following updates to Krashen’s work are offered:

1. The Acquisition/Learning Distinction

The complex and abstract mental representation of language is mainly built up through implicit learning processes as learners attempt to comprehend messages directed to them in the language. Explicit learning plays a more minor role in the language acquisition process, contributing to metalinguistic knowledge rather than mental representation of language.

2. The Natural Order Hypothesis

This is replaced with the ‘Ordered Development Hypothesis’:

The evolution of the learner’s linguistic system occurs in ordered and predictable ways, and is largely impervious to outside influence such as instruction and explicit practice.

3. The Input Hypothesis

The principal data for the acquisition of language is found in the communicatively embedded comprehensible input that learners receive. Comprehension precedes production in the acquisition process.

Pedagogic Implications

Finally, the authors suggest 2 pedagogic implications:

1. Learners need exposure to communicatively embedded input in order for language to grow in their heads. …Learners should be actively engaged in trying to comprehend language and interpret meaning from the outset.

2. The explicit teaching, learning, and testing of textbook grammar rules and grammatical forms should be minimized, as it does not lead directly or even indirectly to the development of mental representation that underlies language use. Instructors need to understand that the explicit learning of surface features and rules of language leads to explicit knowledge of the same, but that this explicit knowledge plays little to no role in language acquisition as normally defined.   


Note the clear teaching implications, particularly this: the explicit learning of grammar rules leads to explicit knowledge which plays “little to no role” in language acquisition.

What reasons and evidence do the authors give to support their arguments? They draw on more than 50 years of research into SLA by those who focus on the psychological process of language learning – on what goes on in the mind of a language learner. They demonstrate that we learn languages in a way that differs from the way we learn other subjects like geography or biology. The difference between declarative and procedural knowledge is fundamental to an understanding of language learning. The more we learn about the psychological process of language learning, the more we appreciate the distinction between learning about an L2 and learning how to use it for communicative purposes.

All the evidence of SLA research refutes the current approach to ELT which is based on the false assumption that learners need to have the L2 explained to them, bit by bit, before they can practice using it, bit by bit. All the evidence suggests that language is not a subject in the curriculum best treated as an object of study. Rather, learning an L2 is best done by engaging learners in using it, allowing learners to slowly work out for themselves, through implicit development of their interlanguages, how the L2 works, albeit with timely teacher interventions that can speed up the process.     

Translanguaging: A Summary  

Translanguaging is baloney. There’s almost nothing in all the dross published that you should pay attention to. It’s a passing fad, a blip, a mistake, a soon to be forgotten episode in the history of ELT and applied linguistics.

Translanguaging, as presented by Garcia, Flores, Rosa, Li Wei and others, is an incoherent political dogma, i.e., ‘a principle or set of principles laid down by an authority as incontrovertibly true’. There’s no way you can challenge translanguaging: you either accept it or get branded as a racist, or a reactionary, or, what it really comes to, an unbeliever. When you read the works of its “top scholars”, you’re bombarded with jargon and obscurantist prose that disguises a disgraceful lack of command of the matters dealt with. These people demonstrate an abysmal lack of understanding of Marx, or Freire, or Foucault, for example, or even of Halliday. They demonstrate an ignorance of philosophy, of political thought, of the philosophy of science, and even of linguistics, for God’s sake. They’re imposters! They talk of colonialism, capitalism and neoliberalism as if the very use of the words is evidence enough that they know what the words mean. They nowhere – I repeat, nowhere – give any coherent account of their political stance. I bet they don’t know Gramsci from granola.

Furthermore, they give few signs of any understanding of how people learn languages, or of how ELT is currently organised and structured. Perhaps worst of all, they show a general ignorance of what’s actually going on in ELT classrooms; they contribute little of practical use to progressive teaching practice; and they are mostly silent when it comes to support for grassroots actions by teachers to challenge their bosses. They’re theorists, seemingly unconnected to the “praxis” they claim to champion. What, one has a right to ask, has translanguaging ever done to promote real change in the lives of those who work in the ELT industry?

And that’s the top echelons – that’s the established academics! Go down a few hundred steps in the pecking order and take a look at what the academic wannabes like Ramjattan, Stillar, Gerald and Vas Bauler are doing. They don’t publish much in academic journals, but they’re busy on Twitter and other social media channels. I invite you to go to Twitter and see what they have to say. They delight their thousands of followers with a nauseating flow of “us-versus-them, we’re-right-they’re-wrong” tweets, plus blatantly self-promotional bits of junk about how their eagerly-awaited book is coming along, and polite requests for money to help them keep writing. This is where you’ll see translanguaging at its rawest: thousands of people, all in a bubble, all convinced of their righteousness, all “liking” the baloney churned out by their scribes – a motley crew of puffed-up imposters.