Teacher Trainers in ELT

This blog is dedicated to improving the quality of teacher training and development in ELT.


The Teacher Trainers 

The most influential ELT teacher trainers are those who publish “How to teach” books and articles, have online blogs and a big presence on social media, give presentations at ELT conferences, and travel around the world giving workshops and teacher training & development courses. Among them are: Jeremy Harmer, Penny Ur, Nicky Hockly, Adrian Underhill, Hugh Dellar, Sandy Millin, David Deubelbeiss, Jim Scrivener, Willy Cardoso, Peter Medgyes, Mario Saraceni, Dat Bao, Tom Farrell, Tamas Kiss, Richard Watson-Todd, David Hill, Brian Tomlinson, Rod Bolitho, Adi Rajan, Chris Farrell, Marisa Constantinides, Vicki Hollett, Scott Thornbury, and Lizzie Pinard. I appreciate that this is a rather “British” list, and I’d be interested to hear suggestions about who else should be included. Apart from these individuals, the Teacher Development Special Interest Groups (TD SIGs) in TESOL and IATEFL also have some influence.

What’s the problem? 

Most current teacher trainers and TD groups pay too little attention to the question “What are we doing?”, and the follow-up question “Is what we’re doing effective?”. The assumption that students will learn what they’re taught is left unchallenged, and trainers concentrate either on coping with the trials and tribulations of being a language teacher (keeping fresh, avoiding burn-out, growing professionally and personally) or on improving classroom practice. As to the latter, they look at new ways to present grammar structures and vocabulary, better ways to check comprehension of what’s been presented, more imaginative ways to use the whiteboard to summarise it, and more engaging activities to practise it. A good example of this is Adrian Underhill and Jim Scrivener’s “Demand High” project, which leaves unquestioned the well-established framework for ELT and concentrates on doing the same things better. In all this, those responsible for teacher development simply assume that current ELT practice efficiently facilitates language learning. But does it? Does the present model of ELT actually deliver the goods, and is making small, incremental changes to it the best way to bring about improvements? To put it another way, is current ELT practice efficacious, and is current TD leading to significant improvement? Are teachers making the most effective use of their time? Are they maximising their students’ chances of reaching their goals?

As Bill VanPatten argued in his plenary at the BAAL 2018 conference, language teaching can only be effective if it comes from an understanding of how people learn languages. In 1967, Pit Corder was the first to suggest that the only way to make progress in language teaching is to start from knowledge about how people actually learn languages. Then, in 1972, Larry Selinker suggested that instruction on formal properties of language has a negligible impact (if any) on real development in the learner. Next, in 1983, Mike Long again raised the issue of whether instruction on formal properties of language makes a difference to acquisition. Since these important publications, hundreds of empirical studies have been published on everything from the effects of instruction to the effects of error correction and feedback. This research in turn has resulted in meta-analyses and overviews that can be used to measure the impact of instruction on SLA. All the research indicates that the current, deeply entrenched approach to ELT, where most classroom time is dedicated to explicit instruction, vastly over-estimates the efficacy of such instruction.

So in order to answer the question “Is what we’re doing effective?”, we need periodically to revisit questions about how people learn languages. Most teachers are aware that we learn our first language(s) unconsciously and that explicit learning about the language plays a minor role, but they don’t know much about how people learn an L2. In particular, few teachers know that the consensus among SLA scholars is that implicit learning through using the target language for relevant, communicative purposes is far more important than explicit instruction about the language. Here are just four examples from the literature:

1. Doughty (2003) concludes her chapter on instructed SLA by saying:

In sum, the findings of a pervasive implicit mode of learning, and the limited role of explicit learning in improving performance in complex control tasks, point to a default mode for SLA that is fundamentally implicit, and to the need to avoid declarative knowledge when designing L2 pedagogical procedures.

2. Nick Ellis (2005) says:

the bulk of language acquisition is implicit learning from usage. Most knowledge is tacit knowledge; most learning is implicit; the vast majority of our cognitive processing is unconscious.

3. Whong, Gil and Marsden’s (2014) review of a wide body of studies in SLA concludes:

“Implicit learning is more basic and more important than explicit learning, and superior. Access to implicit knowledge is automatic and fast, and is what underlies listening comprehension, spontaneous speech, and fluency. It is the result of deeper processing and is more durable as a result, and it obviates the need for explicit knowledge, freeing up attentional resources for a speaker to focus on message content”.

4. ZhaoHong and Nassaji (2018) review 35 years of instructed SLA research and, citing the latest meta-analysis, say:

On the relative effectiveness of explicit vs. implicit instruction, Kang et al. reported no significant difference in short-term effects but a significant difference in longer-term effects with implicit instruction outperforming explicit instruction.

Despite lots of other disagreements among themselves, the vast majority of SLA scholars agree on this crucial matter. The evidence from research into instructed SLA gives massive support to the claim that concentrating on activities which help implicit knowledge (by developing the learners’ ability to make meaning in the L2, through exposure to comprehensible input, participation in discourse, and implicit or explicit feedback) leads to far greater gains in interlanguage development than concentrating on the presentation and practice of pre-selected bits and pieces of language.

One of the reasons why so many teachers are unaware of the crucial importance of implicit learning is that so few teacher trainers talk about it. Teacher trainers don’t tell their trainees about the research findings on interlanguage development, or that language learning is not a matter of assimilating knowledge bit by bit; or that the characteristics of working memory constrain rote learning; or that by varying different factors in tasks we can significantly affect the outcomes. And there’s a great deal more we know about language learning that teacher trainers don’t pass on to trainees, even though it has important implications for everything in ELT: from syllabus design to the use of the whiteboard, from methodological principles to the use of IT, from materials design to assessment.

We know that in the not so distant past, generations of school children learnt foreign languages for 7 or 8 years, and the vast majority of them left school without the ability to maintain an elementary conversational exchange in the L2. Things have improved only to the extent that teachers have been informed about, and encouraged to critically evaluate, what we know about language learning, constantly experimenting with different ways of engaging their students in communicative activities. To the extent that teachers continue to spend most of the time talking to their students about the language, those improvements have been minimal. So why do so many teacher trainers ignore all this? Why is all this knowledge not properly disseminated?

Most teacher trainers, including Penny Ur (see below), say that, whatever its faults, coursebook-driven ELT is practical, and that alternatives such as TBLT are not. Ur actually goes as far as to say that there’s no research evidence to support the view that TBLT is a viable alternative to coursebooks. Such an assertion is contradicted by the evidence. In a recent statistical meta-analysis by Bryfonski and McKay (2017) of 52 evaluations of program-level implementations of TBLT in real classroom settings, “results revealed an overall positive and strong effect (d = 0.93) for TBLT implementation on a variety of learning outcomes” in a variety of settings, including parts of the Middle East and East Asia, where many have flatly stated that TBLT could never work for “cultural” reasons, and “three-hours-a-week” primary and secondary foreign language settings, where the same opinion is widely voiced. So there are alternatives to the coursebook approach, but teacher trainers too often dismiss them out of hand, or simply ignore them.

How many TD courses today include a sizeable component devoted to the subject of language learning, where different theories are properly discussed so as to reveal the methodological principles that inform teaching practice?  Or, more bluntly: how many TD courses give serious attention to examining the complex nature of language learning, which is likely to lead teachers to seriously question the efficacy of basing teaching on the presentation and practice of a succession of bits of language? Today’s TD efforts don’t encourage teachers to take a critical view of what they’re doing, or to base their teaching on what we know about how people learn an L2. Too many teacher trainers base their approach to ELT on personal experience, and on the prevalent “received wisdom” about what and how to teach. For thirty years now, ELT orthodoxy has required teachers to use a coursebook to guide students through a “General English” course which implements a grammar-based, synthetic syllabus through a PPP methodology. During these courses, a great deal of time is taken up by the teacher talking about the language, and much of the rest of the time is devoted to activities which are supposed to develop “the 4 skills”, often in isolation. There is good reason to think that this is a hopelessly inefficient way to teach English as an L2, and yet, it goes virtually unchallenged.


The published work of most of the influential teacher trainers demonstrates a poor grasp of what’s involved in language learning, and little appetite to discuss it. Penny Ur is a good example. In her books on how to teach English as an L2, Ur spends very little time discussing the question of how people learn an L2, or encouraging teachers to critically evaluate the theoretical assumptions which underpin her practical teaching tips. The latest edition of Ur’s widely recommended A Course in Language Teaching includes a new sub-section where precisely half a page is devoted to theories of SLA. For the rest of the 300 pages, Ur expects readers to take her word for it when she says, as if she knew, that the findings of applied linguistics research have very limited relevance to teachers’ jobs. Nowhere in any of her books, articles or presentations does Ur attempt to seriously describe and evaluate evidence and arguments from academics whose work challenges her approach, and nowhere does she encourage teachers to do so. How can we expect teachers to be well-informed, critically acute professionals in the world of education if their training is restricted to instruction in classroom skills, and their on-going professional development gives them no opportunities to consider theories of language, theories of language learning, and theories of teaching and education? Teaching English as an L2 is more art than science; there’s no “best way”, no “magic bullet”, no “one size fits all”. But while there’s still so much more to discover, we now know enough about the psychological process of language learning to know that some types of teaching are very unlikely to help, and that other types are more likely to do so. Teacher trainers have a duty to know about this stuff and to discuss it with their trainees.

Scholarly Criticism? Where?  

Reading the published work of leading ELT trainers is a depressing affair; few texts used for the purpose of training teachers to work in school or adult education demonstrate such poor scholarship as that found in Harmer’s The Practice of English Language Teaching, Ur’s A Course in Language Teaching, or Dellar and Walkley’s Teaching Lexically, for example. Why are these books so widely recommended? Where is the critical evaluation of them? Why does nobody complain about the poor argumentation and the lack of attention to research findings which affect ELT? Alas, these books typify the general “practical” nature of TD programmes in ELT, and their reluctance to engage in any kind of critical reflection on theory and practice. Go through the recommended reading for most TD courses and you’ll find few texts informed by scholarly criticism. Look at the content of TD courses and you’ll be hard pushed to find a course which includes a component devoted to a critical evaluation of research findings on language learning and ELT classroom practice.

There is a general “craft” culture in ELT which rather frowns on scholarship and seeks to promote the view that teachers have little to learn from academics. Teacher trainers are, in my opinion, partly responsible for this culture. While it’s unreasonable to expect all teachers to be well informed about research findings regarding language learning, syllabus design, assessment, and so on, it is surely entirely reasonable to expect the top teacher trainers to be so. I suggest that teacher trainers have a duty to lead discussions, informed by relevant scholarly texts, which question common sense assumptions about the English language, how people learn languages, how languages are taught, and the aims of education. Furthermore, they should do far more to encourage their trainees to constantly challenge received opinion and orthodox ELT practices. This, surely, is the best way to help teachers enjoy their jobs, be more effective, and identify the weaknesses of current ELT practice.

My intention in this blog is to point out the weaknesses I see in the works of some influential ELT teacher trainers and invite them to respond. They may, of course, respond anywhere they like, in any way they like, but the easier it is for all of us to read what they say and join in the conversation, the better. I hope this will raise awareness of the huge problem currently facing ELT: it is in the hands of those who have more interest in the commercialisation and commodification of education than in improving the real efficacy of ELT. Teacher trainers do little to halt this slide, or to defend the core principles of liberal education which Long so succinctly discusses in Chapter 4 of his book SLA and Task-Based Language Teaching.

The Questions

I invite teacher trainers to answer the following questions:


  1. What is your view of the English language? How do you transmit this view to teachers?
  2. How do you think people learn an L2? How do you explain language learning to teachers?
  3. What types of syllabus do you discuss with teachers? Which type do you recommend to them?
  4. What materials do you recommend?
  5. What methodological principles do you discuss with teachers? Which do you recommend to them?



Bryfonski, L., & McKay, T. H. (2017). TBLT implementation and evaluation: A meta-analysis. Language Teaching Research.

Dellar, H. and Walkley, A. (2016) Teaching Lexically. Delta Publishing.

Doughty, C. (2003) Instructed SLA. In Doughty, C. and Long, M. (eds.) The Handbook of Second Language Acquisition, pp. 256–310. Oxford: Blackwell.

Long, M. (2015) Second Language Acquisition and Task-Based Language Teaching. Oxford, Wiley.

Ur, P. A Course in Language Teaching. Cambridge: Cambridge University Press.

Whong, M., Gil, K.H. and Marsden, H. (2014) Beyond paradigm: The ‘what’ and the ‘how’ of classroom research. Second Language Research, 30(4), 551-568.

ZhaoHong, H. and Nassaji, H. (2018) Introduction: A snapshot of thirty-five years of instructed second language acquisition. Language Teaching Research, in press.

Smith and Conti on Memory and SLA

In a post on a previous blog I voiced my criticisms of Dr Conti and his “Roll Up! Roll Up! This-Is-How-To-Do-It” sales pitch to the MLT world. This post is devoted to a criticism of the new book, Memory, by Smith and Conti.

The book gives an account that serves mainly to support the authors’ view of L2 teaching, and it seriously misrepresents the little we know about the part memory plays in SLA. If we want to teach efficaciously, we need to rely on better sources than a book that misleads teachers by misrepresenting research, with the underlying aim of promoting the awful, over-prescriptive “MARS and EARS” (or whatever it is) methodology that they so relentlessly try to sell teachers.

Below is Steve Smith’s promotional video.

Before we start: WM = working memory, often referred to as short-term memory; LTM = long-term memory.

The main points of this video are:

Cognitive load: “Once human brains get overloaded they can’t rehearse enough material in their WM for it to go into LTM. We want language to pass into LTM”. Note that WM is assumed to be concerned with not just remembering, but processing the input. This vital processing component of WM is not properly described and discussed here, or in the book. How is input processed in WM? Are logical inferences made? If so, how? And how do the claimed results of processing go into LTM? There are various models of how this might work, but none of them is properly considered in the book. Just as one example, a coherent explanation is given by Carroll, following Jackendoff, who suggests that a failure in parsing is what triggers the processing.

WM: When we’re learning language and we’re getting input, everything is processed by WM. So if we understand how WM functions, there’s more chance of that knowledge passing into LTM. Note that no description of WM or how it interacts with LTM is given. The book fails to explain either the components of WM or its interaction with LTM. Nor does it give any coherent account of the process of second language learning.

Comprehensible input: “Messages that people can understand. Anything we can do to make the input understandable will reduce cognitive load”; “If you’re a relaxed, happy learner, you’ll remember more”.

Declarative versus procedural knowledge: definitions are given, but Smith says nothing about the various “interface” positions taken by SLA scholars, and nothing to justify the position he and Conti adopt.

This promotional video does nothing more than promote a book which has nothing to recommend it. The video is, perhaps, an interesting study of inter-personal relationships, and is certainly a good example of pseudo-academic chat among purveyors of snake oil.

A More Considered View

Below, I use the paper by Fabienne et al. (2000) and an unnamed dissertation (UKDiss, n.a.), without proper quotation marks, with the sole aim of indicating what’s involved in a proper discussion of the issues that Smith refers to. First, here are some extracts from Fabienne et al. with my own additions:

Working memory (WM) refers to a limited capacity system responsible for the temporary storage and processing of information while cognitive tasks are performed. The multi-component model proposed by Alan Baddeley and Graham Hitch (Baddeley & Hitch, 1974; Baddeley, 1986) represents the most extensively investigated and the best articulated theoretical account of working memory. It consists of a modality-free controlling central executive which is aided by two slave systems ensuring temporary maintenance of verbal and visuospatial information: the phonological loop (PL) (composed of a phonological store and an articulatory rehearsal system) and the visuospatial sketchpad. The PL has remained the most studied aspect of WM.

Some aspects of Baddeley’s working memory model have recently been questioned, especially the relationships between working memory and long-term memory. According to Baddeley (1996), working memory is viewed as a gateway between sensory input and long-term memory. In particular, working memory is considered to be closely involved in the learning of novel information. In this perspective, a vast amount of data has suggested that the long-term acquisition of phonological forms of new words requires the integrity of the phonological store (e.g. Barisnikov, Van der Linden, & Poncelet, 1996). Several studies have led researchers to question this “gateway” view, especially by demonstrating the existence of long-term memory effects in working memory (span) tasks.

Logie (1996) suggests that working memory operates not as a gateway between sensory input and long-term memory but as a workspace. In this view, the storage components of working memory (the phonological loop and the visuospatial sketchpad) are not input buffers; rather, they serve as temporary buffers for information that has yet to be processed or is about to be rehearsed overtly. Thus, information that has been recently presented to the senses will activate the corresponding traces in long-term memory (visual, phonological, semantic, etc.), which then become available for temporary activation in the different components of working memory. This model explains the intervention of long-term memory in span tasks by suggesting that performance depending on the phonological loop would be increased if semantic and visual information were simultaneously available to the other components of working memory.

In conclusion, there exist different contrasted conceptions of the relationships between working memory and long-term memory, as well as between working memory and language processing.

Contrary to Baddeley’s view (as well as Logie’s adaptation), R.C. Martin and Romani (1994) suggested that verbal working memory is not a specialized subsystem dedicated to short-term memory storage and separate from the language system, but rather draws on the operation and storage capacities of a subset of components involved in language processing.

Next, here are extracts from the dissertation, UKDiss (n.a.).

Working memory was first proposed by Baddeley and Hitch (1974) as a temporary store for incoming information, where this information could also interact with long-term memory (e.g. for the purposes of language comprehension) and then be transferred to long-term stores. This model consisted of three subsystems: the phonological loop (PL), the visuospatial sketchpad (VSS) and the supervising central executive. These subsystems have informational limits, and these limits vary between individuals. Working memory comprises a storage component and a processing component (Baddeley, 2000).

Figure 2 – Baddeley Model of Working Memory from Baddeley (1986)

The PL component is critical to SLA and is said to comprise a phonological store and an articulatory control process. For acoustic information to enter the phonological store, it must first be encoded by the auditory system, an ability called phonological coding ability. Once encoded, the phonological store can hold 1-2 seconds of it, while the articulatory control process allows this information to be ‘refreshed’ and stay in the store longer through repetition. This system is also referred to as Phonological Short-Term Memory (PSTM).

The VSS is also involved in the learning and reading of written language. The model also includes the overseeing Central Executive (CE), which coordinates between the slave systems and controls information going in and coming out of them from long term memory.

Phonological Loop and Vocabulary Acquisition

The first evidence of the working memory system being involved in language acquisition came from studying patient PV (Baddeley et al., 1988). Patient PV had suffered a left hemisphere stroke that resulted in impaired short-term memory. There was no impaired visual memory but an impaired phonological loop; the patient could only recall 2-3 digits compared to the average of 7 ± 2 digits. The first experiment showed that the patient’s ability to learn word pairs in her native language was intact, whereas the second showed that she was not able to learn pairings between a word in her native language and a word in a foreign language that she had studied previously in the experiment. These results suggested that learning new words in the native language involves semantic learning, whereas the PL plays a key role in SLA.

These findings were expanded on by Papagno and Vallar (1992), who looked specifically at the link between PSTM and the learning of novel vocabulary. The 24 subjects were read two lists of paired words where the stimuli were of the same syllable length but either phonologically dissimilar or similar (e.g. ‘volpe and segno’ compared to ‘tetto and berba’). The other two lists, while matched for syllable length, paired a word with a non-word (a word generated by altering an existing word by a single letter, e.g. tetto and zibro). The lists that were phonologically similar took longer to learn, for both non-word lists and word lists. A further experiment disrupted the articulatory control process (ACP) by preventing rehearsal, preventing the information from being stored in the phonological loop. Learning for the non-word lists was dramatically reduced, but not for actual word lists. This suggests that for novel foreign words PSTM is important, but learners use other cognitive tools to learn words of their native language. The third experiment used lists with 2- or 4-syllable-long words and non-words. They found that the longer non-words had significantly lower rates of recall compared to comparably long words. They hypothesised that this was because the ACP was impaired, so the non-words could not be repeated. A strength of this study was the variety of methods used to look at the phonological loop; an acknowledged limitation was that they could not control for the use of memory strategies or differences in attentional abilities. Nevertheless, from this evidence it is clear that, at least initially, the rate of vocabulary learning is linked to the strength of the phonological loop.

PSTM and the Acquisition of Grammar

A language’s grammar is the set of rules that allows the construction of words into meaningful sentences. Ellis (1996a) argues that after a sufficient bank of L2 phonological labels has been acquired, the same abstraction processes that have tuned to the L2 phonological system are able to tune the grammatical system. The system becomes attuned to L2 word order and associations after building a lexical foundation. It naturally follows, then, that the phonological loop would also be involved in the acquisition of grammar.

To understand how grammar is forged from WM, grammar rules across many languages need to be studied, both familiar and unfamiliar to the individual. This would help us elucidate how the language learning system abstracts rules from PSTM.

Limitations to the Role of PSTM in Second Language Acquisition

Results from studies investigating the link between PSTM and SLA are not always consistent, and therefore there are loud critics of the theory that WM is the gateway into language acquisition. For example, Juffs (2004) argues that the role of WM in SLA is overstated and that it should be considered within a larger toolkit of cognitive abilities that are used to acquire languages.

The Role of the Central Executive in Second Language Acquisition

Having considered SLA in terms of the phonological loop, a slave system, it is now time to consider the master: the central executive. Research into the CE has been neglected compared to the rest of the WM model due to its complex nature. Despite this lack of research, as the CE coordinates the slave systems of WM, it must be involved in language acquisition. There is still great discussion over whether it works as one entity or itself has overlapping subsystems.

The Central Executive and Previous Language Experience

It is likely that working memory is critical to second language acquisition but that different components have different functions at different points. Initially, it may be that it is the phonological short-term memory that is responsible for successful language learning, but to attain higher levels of proficiency this is then supported by the central executive. However, it is also possible that when considering SLA, it could be a completely different cognitive ability outside of WM that is underlying late stage SLA success.

The Central Executive and General Intelligence

It has long been argued by many critics that working memory and intelligence may actually be different measures of the same cognitive abilities. Consequently, this would mean that the central executive’s role in SLA is the same as that of general intelligence (Jensen, 1998; Kyllonen, 2002; Stauffer et al., 1996). It could be that it is in fact general intelligence that is at the heart of successful language acquisition, and that it mediates this through the central executive. However, there are loud voices in this debate who argue that while the two are very significantly correlated (estimates range from .995 in Stauffer et al. (1996) to .479 in Ackerman et al. (2005)), they are separate constructs entirely (Unsworth and Spillers, 2010).


Conclusion 1: PSTM is Involved and Important in SLA

In conclusion, from the evidence analysed in this dissertation, the PSTM is certainly central to successful and effective initial SLA. The evidence from meta-analyses in particular is convincing (Li, 2015, p. 20; Linck et al., 2014) in showing the strength and robustness of this relationship, especially in the early stages of SLA.

However, there is doubt, and rightly so, that it is the sole ‘gateway to second language acquisition’ that it has long been heralded to be. As Juffs (2004) rightly points out, the evidence is inconclusive, but this may be linked to the wide range of methodologies used to study working memory. Another problem is that tasks of working memory often include a combination of storage and processing demands. Linck et al. (2014) found from their meta-analysis that the processing component of WM was most strongly correlated with L2 outcomes. Further research should look at a wider array of measures that capture other cognitive abilities, and consider the roles that specific components of working memory play, in order to address the gap in research around the central executive.

Conclusion 2: The Central Executive is Increasingly Important in Later Stages of SLA

Therefore, with this gap in research in mind, the second conclusion is that WM may influence, and be related to, other cognitive abilities needed for SLA, and that this influence is exerted through the CE. Linck et al. (2014) found that high-level attainment was indeed related to working memory (namely PSTM) but also to associative learning and implicit learning, in a sample of 522 participants. Having previously considered how WM relates to other factors such as fluid intelligence and attention, this suggests that SLA is the coordinated effort of a network of different nodes that are important at different stages of the process. This coordination is likely facilitated by the central executive component of working memory. For example, it seems that PSTM is initially very important in establishing the foundations of the L2, but as these are built, long-term representations become more important in the high-proficiency learner later in SLA. The interaction of incoming phonological information and long-term memory is orchestrated by the CE, and the data suggest that it plays a key role in SLA.

Moreover, Skehan (1986) used cluster analysis to demonstrate different successful profiles of language aptitude with varying abilities. For example, one group of learners had high linguistic analysis ability and average memory, while another had good memories but average linguistic analytic ability. The last group had average aptitudes for SLA but were still successful. This suggests that, as well as different abilities contributing to successful SLA, varying combinations of these abilities can still lead to success.


There are important discussions going on about the roles of working memory and long-term memory in the SLA process. All sorts of issues are involved, and none of them is properly considered in this book. Serious consideration of the dynamic relationship between working memory, long-term memory and SLA must start by defining the constructs involved. Once we sort out whether or not WM is a gate-keeper, and how WM and LTM work together, we must then reconsider the constructs of “input” and “intake” in SLA. As I’ve explored in other posts, particularly those on Schmidt’s Noticing hypothesis and Carroll’s work, my hunch is that Carroll is right when she says:

The learner initially parses an L2 using L1 parsing procedures and when this inevitably leads to failure, acquisition mechanisms are triggered and i-learning begins. New parsing procedures for L2 are created and compete with L1 procedures and only win out when their activation threshold has become sufficiently low. These new inferential procedures, adapted from proposals by Holland et al. (1986), are created within the constraints imposed by the particular level at which failure has taken place. This means that a failure to parse in PS [Phonological Structures], for example, will trigger i-learning that works with PS representations that are currently active in the parse and it does so entirely in terms of innately given PS representations and constraints, hence the ‘autonomous’ characterisation of AIT (Holland et al. 1986, Carroll 2001: 241–2).


Collette, F., Van der Linden, M. and Poncelet, M. (2000) Working memory, long-term memory and language processing: Issues and future directions. Brain and Language, 71, 46-51.

UKDiss (n.d.) The Role of Working Memory in Second Language Acquisition. Downloaded 15 April 2021 from https://ukdiss.com/examples/0285120.php

Heart and Parcel

Heart and Parcel was founded in 2015 by Clare Courtney and Karolina Koścień with the aim of supporting people learning English in their local communities and “forging connections by developing English language and communication skills through the medium of food”. They run a variety of projects “which use food and cooking as a way for participants to gather together, connect and to share their past stories, experiences and lives with others, whilst practising and developing their English language skills”.

An Example

One of their most popular courses is “From Home to Home”, an online course which teaches English through a series of cooking classes. The “Main Class”, Cooking and English, is given every week, followed two days later by a “Post-class Discussion Class”, where students practice their English with other students and qualified teachers. “Extra Study” is provided by “Homework” (vocabulary and grammar exercises on the food and recipe from the main class) and a “WhatsApp Discussion Group” (where students share recipes, videos and photos of food, and communicate in English with other students and qualified teachers).

More Good Stuff

Additionally, Heart and Parcel run fundraising public food events such as supper clubs, markets, catering, and private workshops. They also publish a collaborative cookbook “working with participants to share recipes and stories from their communities, cultures and lives”.

And it doesn’t stop there. They explore “learning opportunities through food” by looking at budgeting, healthy meals, using food for social change, employability and community cohesion through communal eating. As Clare told me, “Most importantly, we have found that building connections and networking has been wholly beneficial for the learners, being exposed to different settings, contexts with us (our learners can volunteer with us to do catering events, market stall cookalongs, presentations) thus creating further opportunities to meet people who are more within their interests, line of work or study”.  They are also committed to “learner progression through to paid positions in our project and the opportunity to teach and to develop the skills of newer learners who enter our programme.”

I won’t bother to say why I think this is such an inspirational project because I think it speaks for itself. It will be described and discussed in the forthcoming book on ELT by Mike Long and me, which I’ve mentioned a few times already. The best thing you can do is visit the website.

My thanks to Clare for taking the time to tell me about the thinking behind the project and how it’s developing.

The effects of multimodal input on second language learning

There’s a sudden buzz in SLA research – reports of studies on multimodal input abound, and a special issue of Studies in Second Language Acquisition (42, 3, 2020) is a good example. Another is the special issue of The Language Learning Journal (47, 2019). Below is a quick summary of the Introduction to the SSLA special issue. I’ve done little more than pick out bits of the text and strip them of the references, which are, of course, essential in giving support to the claims made. If you don’t have access to the journal, get in touch with me for any of the articles you want to read.


Mayer’s (2014) cognitive theory of multimedia learning states that learning is better when information is processed in spoken as well as written mode, because learners make mental connections between the aural and visual information, provided there is temporal proximity. Examples in the domain of language learning are:

  • storybooks with pictures read aloud,
  • audiovisual input,
  • subtitled audiovisual input,
  • captioned audiovisual input,
  • glossed audiovisual input.

What these types of input have in common is the combination of pictorial information (static or dynamic) and verbal input (spoken and/or written). Most of these input types combine not two but three sources of input:

  1. pictorial information,
  2. written verbal information in captions or subtitles, or in written text, and
  3. aural verbal input.

It could be argued that language learners might experience cognitive overload when engaging with both pictorial and written information in addition to aural input. However, eye-tracking research has demonstrated that language learners are able to process both pictorial and written verbal information on the condition that they are familiar with the script of the foreign language.

In addition to imagery, there are other advantages inherent in multimodal input, and audiovisual input in particular. Learners need fewer words to understand TV programs than books: Webb and Rodgers (2009a, 2009b) estimated that knowledge of the 3,000 most frequent word families, plus proper nouns, is needed to reach 95% coverage of the input. The lexical coverage figures for TV viewing have recently been found to be even lower, so the lexical demands are not as high as for reading (which requires knowledge of the 4,000 most frequent word families for adequate comprehension and 8,000 word families for detailed comprehension). Rodgers and Webb (2011) also established that words are repeated more often in TV programs than in reading, especially in related TV programs, which is beneficial for vocabulary learning. Another advantage is the wide availability of audiovisual input via the Internet and streaming platforms, which can easily provide language learners with large amounts of authentic language input (Webb, 2015). Finally, language learners are motivated to watch L2 television, as has been well documented in surveys on language learners’ engagement with the L2 outside of school.
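The coverage figures mentioned above come from a simple computation: the proportion of a text’s running words that fall within the most frequent N word families. As a rough sketch of that arithmetic, with an invented miniature frequency list and transcript (not Webb and Rodgers’s data):

```python
from collections import Counter

# Toy lexical-coverage calculation: what share of the running tokens in a
# "transcript" fall inside a known high-frequency word list?
frequency_list = {"the", "a", "you", "be", "to", "know", "what", "do"}  # stands in for "top N families"
transcript = "you know what I mean do you want to do it the way I do".lower().split()

tokens = Counter(transcript)
known = sum(n for word, n in tokens.items() if word in frequency_list)
coverage = known / sum(tokens.values())
print(f"coverage: {coverage:.0%}")  # 9 of 15 tokens are known -> 60%
```

Corpus studies like Webb and Rodgers’s do exactly this at scale, with word families rather than raw word forms, to find the N at which coverage crosses the 95% or 98% comprehension thresholds.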


Previous research into language learning from multimodal input has focused on three main areas: comprehension, vocabulary learning, and, to a lesser extent, grammar learning. A consistent finding in this area is that audiovisual input is beneficial for comprehension, in particular when learners have access to captions. Captions assist comprehension by helping to break down speech into words and thus facilitating listening and reading comprehension. Crucially, a unique support offered to learners’ comprehension by multimodal input is imagery. Research into audiovisual input has shown that it can work as a compensatory mechanism especially for low-proficiency learners.

The bulk of research into multimodal input has focused on vocabulary learning. A seminal study of the effect of TV viewing on vocabulary learning is Neuman and Koskinen’s (1992) study; they were among the first to stress the potential of audiovisual input for vocabulary learning. It was not until 2009 that the field of SLA started to pay more attention to audiovisual input. Two key studies were the corpus studies by Webb and Rodgers (2009a, 2009b), which showed the lexical demands of different types of audiovisual input. They argued that, in addition to reading, audiovisual input may also be a valuable source of input for language learners. Since then, the field of SLA has witnessed a steady increase in the number of studies investigating vocabulary learning from audiovisual input. While most research into audiovisual input has focused on the efficacy of captions, fewer studies have focused on noncaptioned and nonsubtitled audiovisual input. Research has also moved from using short, educational clips to using full-length TV programs. Finally, in addition to studying the effectiveness of multimodal input for vocabulary learning, researchers have also started to study language learners’ processing of multimodal input (e.g., looking patterns on captions or pictures) by means of eye-tracking. Together, there seems to be robust evidence that language learners can indeed pick up unfamiliar words from multimodal input and that the provision of captions has the potential to increase the learning gains.

Research into the potential of multimodal input has been gaining traction, but the number of studies is still limited and mainly confined to vocabulary learning. Now that research into multimodal input is starting to broaden its focus to different aspects of learning as well as its research techniques, the present issue provides an up-to-date account of research in this area with a view to include innovative work and a range of approaches.

The special issue pursues new avenues in research into multimodal input by focusing on pronunciation, perception and segmentation skills, grammar, multiword units, and comprehension. In addition, it extends previous eye-tracking research by investigating the effects of underresearched pedagogic interventions on learners’ processing of target items, target structures, and text. The studies nicely complement each other in their research methodologies and participant profiles. The special issue comprises six empirical studies and one concluding commentary, which between them cover:

  • Different types of input (TV viewing with and without L1 or L2 subtitles, reading-while-listening, reading, listening);
  • Different types of captioning (unenhanced, enhanced, no captioning);
  • Different components of language learning (single words, formulaic sequences, comprehension, grammar, pronunciation);
  • Different mediating learner- and item-related factors (e.g., working memory, prior vocabulary knowledge, frequency of occurrence);
  • Different learning conditions (incidental learning, intentional learning, experimental and classroom-based) and time conditions (short video clips, full-length TV programs, extensive viewing);
  • Different research tools (eye-tracking, productive and receptive vocabulary tests, comprehension tests).

I should say that I’ve excluded the parts on grammar learning. Here’s an extract:

Research into grammar learning through multimodal input is very scarce. More recent studies involving captions and grammar in longer treatments have provided evidence of positive benefits for L2 grammar development in adults, especially when captions are textually enhanced. However, results have not been similarly positive for all target structures, suggesting the influence of other factors such as the structure-specific saliency of a grammar token.

This sudden surge of interest in multimodal input is obviously, in part anyway, a response to the growth of on-line teaching forced on us by the Covid-19 pandemic. To me, it looks like a very promising development, particularly as a possible answer to the question of how to tackle the need to encourage inductive “chunk” learning.

Problems in SLA: Is Emergentism the answer to them?

Mike Long’s (2007) book Problems in SLA is divided into three parts: Theory, Research, and Practice.

Part One

In chapter 1, “Second Language Acquisition Theories”, Long reviews some of the many approaches to theory construction in SLA and suggests that the plethora of SLA theories obstructs progress. In chapter 2, Long suggests that culling is required, and he uses Laudan’s “problem-solving” framework (e.g., Laudan, 1996) as the basis for an evaluation process. Briefly, theories can be evaluated by asking how many empirical problems they explain, giving priority to problems of greater significance, or weight. Long suggests that among the weightiest problems in SLA are age differences, individual variation, cross-linguistic influence, autonomous interlanguage syntax, and interlanguage variation.

Part Two 

Chapter 3 deals with “Age Differences and the Sensitive Periods Controversy in SLA”. Why do the vast majority of adults fail to achieve native-like proficiency in a second language? Long argues that maturational constraints, or “sensitive periods”, explain this problem. Chapter 4 deals with recasts. As we know, recasts are a controversial issue, but they play an important role in Long’s focus on form. Long gives his usual careful review of the literature on research so far and concludes that recasts facilitate acquisition “without interrupting the flow of conversation and participants’ focus on message content” (p. 94).

Part Three

Chapter 5, “Texts, Tasks and the Advanced Learner”, discusses Long’s version of TBLT. Long claims that his TBLT is superior to “the traditional grammatical syllabus and accompanying methodology, or what I call ‘focus on forms’” (p. 121) because it respects, rather than contradicts, robust findings in SLA. Long gives particular attention to the methodological principles of “focus on form” (reactive attention to form while attention is on communication) and “elaborated input” (use elaborated rather than simplified texts). Finally, chapter 6, “SLA: Breaking the Siege”, responds to three “broad accusations made against SLA research in recent years”. The charges are “sociolinguistic naiveté, modernism, and irrelevance for language teaching”. Long finishes with suggestions on how the siege might be broken.


The book packs a powerful punch. The references section is impressive (as usual); chapters 3, 4, and 5 are still very informative; and chapters 1, 2, and 6 are still a cogently argued case for a critical rationalist approach to SLA research and its application to ELT. A slight niggle is that Long’s discussion of theory construction and evaluation in chapters 1 and 2 is not entirely consistent with the rest of the book. There’s a possible conflict between chapters 3 and 4  – the claim that SLA is maturationally constrained (a view usually associated with “nativist” theories) sits uneasily with the claims made for recasts – and the absence of any mention of the interaction hypothesis adds a bit more doubt about exactly what Long himself regards as the best theory of SLA. Such doubts are dealt with in his (2015) book Second Language Acquisition and Task-Based Language Teaching.

Chapter 3 describes “A Cognitive-Interactionist Theory of Instructed Second Language Acquisition (ISLA)”. Note that this is a theory of Instructed SLA, where, Long says, “necessity and sufficiency are less important than efficiency. Provision of negative feedback, for example, might eventually turn out not to be a relevant factor in a theory of SLA, as argued persuasively by Schwartz (1993), but its empirical track record, to date, as a facilitator of rate and, arguably, level of ultimate attainment makes it a legitimate component – in fact, a key component – of a theory of ISLA”. Long claims that his “embryonic” theory addresses empirical problems concerning (i) success and failure in adult SLA, (ii) processes in IL development, and (iii) effects and non-effects of instruction. The explanation is based on an emergentist, or usage-based (UB), theory of language acquisition:

A plausible usage-based account of (L1 and L2) language acquisition (see, e.g., N.C. Ellis 2007a,b, 2008c, 2012; Goldberg & Casenhiser 2008; Robinson & Ellis 2008; Tomasello 2003), with implicit learning playing a major role, begins with initially chunk-learned constructions being acquired during receptive or productive communication, the greater processability of the more frequent ones suggesting a strong role for associative learning from usage. Based on their frequency in the constructions, exemplar-based regularities and prototypical morphological, syntactic, and other patterns – [Noun stem-PL], [Base verb form-Past], [Adj Noun], [Aux Adv Verb], and so on – are then induced and abstracted away from the original chunk-learned cases, forming the basis for attraction, i.e., recognition of the same rule-like patterns in new cases (feed-fed, lead-led, sink-sank-sunk, drink-drank-drunk, etc.), and for creative language use (Long, 2015, pp 48-49).
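The frequency-driven “chunk” learning the quote describes can be caricatured, very crudely, by counting adjacent word pairs in input and ranking them by frequency. This sketch is mine, not Ellis’s or Long’s model, and the mini-corpus is invented; it shows only the bare associative mechanism being claimed, not whether that mechanism can do the real work:

```python
from collections import Counter

# Crude caricature of frequency-driven "chunk" extraction: count adjacent
# word pairs in the input and rank them. Invented mini-corpus.
corpus = (
    "the boy fed the dog . the girl fed the cat . "
    "the dog drank the milk . the cat drank the milk ."
).split()

# Skip pairs that straddle a sentence boundary.
bigrams = Counter(
    (w1, w2) for w1, w2 in zip(corpus, corpus[1:]) if "." not in (w1, w2)
)
for chunk, n in bigrams.most_common(3):
    print(" ".join(chunk), n)
```

The most frequent pairs (“fed the”, “the dog”, “the cat”) surface first, which is the usage-based intuition; Gregg’s objection, discussed below, is that nothing in such counts supplies categories like SUBJECT or VERB.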

I personally don’t find Ellis’ usage-based account plausible, and I still can’t quite get used to the fact that Long went along with it. I console myself with the fact that Long didn’t join the Douglas Fir group, and that he retained his commitment to the importance of sensitive periods and interlanguage development. Furthermore, warts and all, I think Long’s book on TBLT is the best book on ELT ever written. Having said all that, I want to go back to Long’s concerns about theory construction in SLA and suggest that they don’t justify his siding with Nick Ellis “and the UB (not UG) hordes”.

Laudan’s aim was to reply to criticism of Popper’s “naïve” falsification criterion. He tried to improve on the work of Lakatos (who had the same aim of defending Popper’s falsification criterion) by suggesting, firstly, that science is to do with problem-solving, and secondly, that science makes progress by evolving research traditions. This concern with research traditions is at the heart of Laudan’s endeavour, and I don’t think Long sufficiently recognizes its importance. Laudan talks about research traditions in science; Long wants to talk about theories of SLA. In my opinion, Laudan gives a poor account of research traditions in science, and Long makes poor use of Laudan’s criteria for theory evaluation.

Laudan says that the overall problem-solving effectiveness of a theory is determined by assessing the number and importance of empirical problems which the theory solves and deducting therefrom the number and importance of the anomalies and conceptual problems which the theory generates (Laudan, 1978: 68). In a later work, Laudan (1996) develops his “problem-solving” approach and offers a taxonomy.  He suggests, first, that we separate empirical from conceptual problems, and that as far as empirical problems are concerned, we distinguish between “potential problems, solved problems and anomalous problems.” ‘Potential problems’ constitute what we take to be the case about the world, but for which there is as yet no explanation. ‘Solved problems’ are that class of putatively germane claims about the world which have been solved by some viable theory or another.  ‘Anomalous problems’ are actual problems which rival theories solve but which are not solved by the theory in question (Laudan, 1996: 79). As for conceptual problems, Laudan lists four problems that can affect any theory.
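Laudan’s calculation can be made concrete, though only naively, with a toy formalization: effectiveness equals the weighted sum of problems solved minus the weighted sums of anomalies and conceptual problems. The problem labels and weights below are my own invented numbers, not anything Laudan supplies, which is precisely the difficulty:

```python
# Naive formalization of Laudan's criterion (invented weights, not Laudan's):
# effectiveness = (weighted problems solved)
#                 - (weighted anomalies) - (weighted conceptual problems)
def effectiveness(solved, anomalies, conceptual):
    return sum(solved.values()) - sum(anomalies.values()) - sum(conceptual.values())

theory_x = effectiveness(
    solved={"age effects": 3, "interlanguage variation": 2},
    anomalies={"individual variation": 2},
    conceptual={"ill-defined constructs": 1},
)
print(theory_x)  # 3 + 2 - 2 - 1 = 2
```

The arithmetic is trivial; everything hangs on how the problems are enumerated and the weights assigned, and Laudan gives no procedure for either.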

Laudan claims that this “taxonomy” helps in the relative assessment of rival theories, while remaining faithful to the view that many different theories in a given domain might well have different things to offer the research effort. Laudan argues that it is rational to choose the most progressive research tradition, where “most progressive” means the maximum problem-solving effectiveness. Note first that Laudan refers to the most progressive research tradition, not theory. But the main problem is how we assess the problem-solving effectiveness of rival research traditions. In the end, we will be forced to compare different theories belonging to different research traditions, and then, how does one count the number of empirical problems solved by a theory? For example, is the “problem of the poverty of the stimulus” to be counted as one problem or several? In principle the number of problems could be infinite. And how are we to assign different weightings to theories? How much weight should we give to Schmidt’s Noticing Hypothesis, and how much to Long’s Interaction Hypothesis, for example? Laudan’s inability to suggest how we might go about enumerating the seven types of problems in his taxonomy that are dealt with by any given research tradition (itself not a clearly-defined term), or how these problems might then be weighted, seems a fatal weakness in his account.

Even if we ignore this weakness, I don’t think Long makes a persuasive case for the UB research tradition he favours. In the field of linguistics, the nativist, UG-led research tradition has an impressive record; I can’t think of any way that the UB theories of N.C. Ellis, Goldberg & Casenhiser, Robinson & Ellis, and Tomasello can be made to score higher than the UG-based theories of Chomsky (1959), White (1989), Carroll (2001), and Hawkins (2001), for example. I’ve argued elsewhere in this blog against the emergentist view, usually citing Eubank and Gregg (2002) and Gregg (2003). Let me just summarise one point Gregg makes here.

For emergentists, SLA is a matter of associative learning: on the basis of sufficiently frequent pairings of two elements in the environment, one abstracts to a general association between the two elements. The environment provides all the necessary cues for these associations to form. Gregg (2003) gives this example from Ellis: ‘in the input sentence “The boy loves the parrots,” the cues are: preverbal positioning (boy before loves), verb agreement morphology (loves agrees in number with boy rather than parrots), sentence initial positioning, and the use of the article the’ (1998: 653). Gregg asks: in what sense are these ‘cues’ cues, and in what sense does the environment provide them? The environment can only provide perceptual information, for example, the sounds of the utterance and the order in which they are made. Thus, in order for ‘boy before loves’ to be a cue that subject comes before verb, the learner must already have the concepts SUBJECT and VERB. According to Ellis, if SUBJECT is one of the learner’s concepts, that concept must emerge from the input. But how can it? How can the learner come to know about subjects or agreement in English? What ‘cues’ are there in the environment for us to learn the concept SUBJECT, so that later on we can use that concept to abstract SVO from other input sentences? As Gregg (2003, p. 120) puts it:

Not only is it unclear how ‘preverbal position’ could be associated with ‘clausal subject’ or ‘agent of verb’, it is also not clear that these should be associated (Gibson, 1992): For instance, in sentences like ‘The mother of the boy loves the parrots’ or ‘The policeman who followed the boy loves the parrots,’ ‘the boy’ is preverbal but is neither subject nor agent. In short, there is no reason to think that ‘comes before the verb’ is going to be useful information for a learner or a hearer, in the absence of knowledge of syntactic structure. But once again, the emergentist owes us an explanation of how syntactic structure can be induced from perceptual information in the input.

Likewise, it does not make sense to say that learners “notice” formal aspects of the language from the input – grammar cannot, by definition, be “noticed” from perceptual information in the environment.

I don’t doubt that Mike would have made short work of these criticisms had I managed to put them to him. I recently asked him if we could discuss Chapter 3 on Skype, but he was already too ill. While there are, in my opinion, almost insurmountable problems for an empiricist, usage-based theory of language learning to overcome, and while it follows that I don’t think Long resolves them, in his (2015) book SLA & TBLT Long uses Laudan’s “Problems and Explanations” framework to address four problems, rather than presenting any full theory of SLA. He does so with his usual scholarship, and he is absolutely clear about the most important issue facing us when it comes to designing courses of English for speakers of other languages: learning a new language is “far too large and too complex a task to be handled explicitly” …. “implicit learning remains the default learning mechanism for adults”. You can see this quote in context in the post Mike Long: Reply to Carroll’s comments on the Interaction Hypothesis.


Carroll, S. (2001). Input and evidence: The raw material of second language acquisition. Amsterdam: Benjamins.

Chomsky, N. (1959). Review of B. F. Skinner, Verbal behavior. Language, 35, 26–58.

Eubank, L. and Gregg, K. R. (2002) News Flash: Hume Still Dead. Studies in Second Language Acquisition, 24, 237–247.

Gregg, K. R. (2003) The state of emergentism in second language acquisition. Second Language Research, 19(2), 95–128.

Hawkins, R. (2001). Second language syntax: A generative introduction. Oxford: Blackwell.

Laudan, L. (1978) Progress and its problems: Towards a theory of scientific growth. University of California Press.

Laudan, L. (1996)  Beyond positivism and relativism: Theory, method, and evidence. Oxford and New York: Westview Press.

White, L. (1989). Universal Grammar and second language acquisition. Amsterdam: Benjamins.

Mike Long

Mike died on Sunday morning. He will be greatly missed by the applied linguistics academic community, by the anarcho-syndicalist movement, by his wide circle of friends all over the world, and by the hundreds, including me, who owe their academic careers to his generous help.

Over a period of more than forty years, Mike had a massive influence on developments in psycholinguistics, instructed SLA, and TBLT. His CV is testament to his contributions to the field; it includes a huge volume of published material, progressive editorial work, and teaching that changed the lives of so many. Mike was a brilliant, meticulous, scrupulously honest academic, who had a lifelong commitment to “L’education Integrale”, as he called it in his book SLA and Task-based Language Teaching. He begins Chapter 4: “Education of all kinds, not just TBLT … serves either to preserve or challenge the status quo, and so is a political act, whether teachers and learners realize it or not”. Mike combined the highest standards of intellectual rigor with a sustained fight against the status quo. He was a regular contributor to the Anarcho-Syndicalist Review, joined campaigns and picket lines fighting for the rights of intellectual workers, and made many donations to support the anarchist movement.

Mike was a dangerous man to sit next to, in a conference plenary, a committee meeting, a seminar, wherever eruptions of laughter were frowned on. His stage-whispered asides were often just too funny to keep the laughter in, and even in restaurants, I was once asked by waiters to keep the noise down, as Mike, crying with laughter himself, told one of his funny stories. He was a delight to be with, and I’m glad that he was so fond of Cataluña, which he visited as often as he could. He and his partner Cathy Doughty called their son Jordi, his favorite football team was FC Barcelona, he loved Priorat wines, and we often went up to the cemetery in Montjuic to pay our respects to Buenaventura Durruti and his fallen comrades.

Among the many projects he gave his support to, Mike helped Neil McMillan and me put together a teacher education course on TBLT. He thoroughly approved of the SLB Cooperative, so he happily gave us advice and materials, recorded presentations, and took part in webinars with participants on the course.

Mike and I were in the middle of writing a book about the ELT industry when he got the sudden news of his illness. He spent the last few months of his life working on the book, and, helped by Cathy, we got nine of the fourteen chapters more or less done. Cathy and I will now try to finish it.

Mike was the best teacher I ever had, and a wonderful, generous friend. I’ll miss him terribly.

P.S. There’s a great website here in honour of Mike: IN MEMORY | Mike Long (wixsite.com)

Life on Twitter

I closed my Twitter account a few months ago because I felt Twitter was mostly a waste of time. An exchange with JB Gerard, where he accused me of racism, was the trigger I needed. I opened another account under the name of Benny28908382 to watch what was happening. On Feb 18th, I “broke cover” and joined in the discussions. I replied to a tweet by ELT’s super salesman, Dr Gianfranco Conti (international keynote speaker, professional development provider, winner of the 2015 Times Educational Supplement egg and spoon race, etc., etc.), who made one of his typically crass pronouncements about SLA. Here it is, with what followed.

Part One


Note the slide into personal attack, and I recognise that I provoked it. But my remarks were not meant as an attack on Tim; I was talking to Tim as a scholar, and I referred to what I saw as his lack of scholarship on this occasion. My interest was in questioning Conti’s confident edict, and I was surprised at Tim’s responses. Neither of us gave the best expression of our views, but anyway, we hadn’t, till the last bit, lost sight of the preposterous claim of Dr. Conti. We were talking about different theories of SLA. I find Tim an interesting man to talk to, or at least, I did, and I was certainly not treating him as an idiot. I see (or rather, I saw) Tim as someone who needs encouragement to study more. It would be, IMHO, a waste of talent if Tim satisfied himself with what he’s learned so far about the fashionable usage-based view of SLA, and failed to delve deeper. My suggestion that he needed to read more was honestly well-intentioned, but I accept that three of my replies were a bit harsh.

On we go.

Part 2 


Note how Tim dodges the issue. “I’m more talking about using an L2 than learning it”, he says. We were talking about learning an L2, were we not? And note that Tim’s final tweet shows a poor understanding of unconscious learning. Well, never mind; tweets are dashed off, and we can’t expect the same rigour found in texts where the author has time to express themselves more clearly.

The Fallout

The above exchange led to this, the following day:


See what’s happened? Tim discovers that Benny is Geoff, and that’s enough for him to turn things into an attack on Geoff, all discussion of SLA long forgotten. The real content, what little there was of it, has been lost. Never mind that Conti’s original statement is wrong; never mind that Tim can’t explain working memory, never mind that efficacious ELT is at stake. No. The important thing now, for Tim, is to show that Geoff’s opinions are as nothing compared to the offence he gives to good people like himself, he the perfect representative of the good folk who make up the wonderful community of Twitter ELT.

Tim, cute, charming Tim, the darling of the rainbow warriors, the apotheosis of the young whelp and mediocre dancer, responds to “being treated like an idiot” by a typical Twitter hatchet job that shows a shameful disregard for the truth. In a way that would make hardened Daily Mail journalists cringe, he concocts a story aimed at discrediting me. Tim quite falsely states that I closed my Twitter account because I’d been revealed as a racist and that I later opened a sockpuppet account with the intention of being abrasive to people I don’t like. He accuses me of malpractice, and of manipulating social media in my fiendish drive to continue making “incredibly rude”, “obnoxious” attacks on carefully-selected targets.

Tim’s response to our exchange is that of a preening, self-righteous prig. He brings the dregs with him. Among messages of support, Hugh Dellar, with his usual, ironic disdain for context, suggests that the real cause of my criticisms of him is repressed lust, while Stirling Bannock (not his real name) vows never to read a word I say. May they all be happy together.

Dellar on Grammar

Dellar’s latest blog post is Part Nine of his views on ELT. It’s called Part Nine: the vast majority of mistakes really aren’t to do with grammar!

I’ll summarise it and then suggest that:

1. Most mistakes in the oral and written production of students of English as an L2 are to do with grammar

2. Dellar’s view of how people learn English as an L2 is badly-informed and incoherent

3. Dellar’s approach to teaching English as an L2 is mistaken.

Summary of Dellar’s blog post

When he was younger, Dellar believed that the root cause of student error was essentially grammatical. It took him “quite some time” to realise that since students only did tasks that focused on the production of grammatical structures, it was unsurprising that their errors were grammatical. Dellar comments that “to extrapolate out from such experiences and to then believe that mistakes are mostly down to grammar is a fallacy of the highest order”.

To become more aware of the real issues that students face when learning English, Dellar says that teachers need to change tack and focus on tasks which require the production of language outside the narrow confines of what are essentially grammar drills of varying kinds.  Unfortunately, during these “freer slots”, teachers still pick up on grammar. “This is what we’re most trained to focus on, and the way most of us are still trained to perceive error, and old habits die hard”.

Dellar then discusses how he and his co-author and colleague, Andrew Walkley, started using Vocaroo (an online audio recorder) to record fifty chunks / collocations and send the link to all their students. “They’d then write them down as best they could, like a dictation; we’d send the original list and students would then write examples of how they think they might actually use each item – or hear each being used. These were emailed over and we’d correct them, comment on them, etc.”.  To their dismay, Dellar and Walkley found that words that they felt they had “explained well, given extra examples of, nailed, as it were”, would come back “half digested, or garbled, or in utterly alien contexts with bizarre co-text”.

Dellar explains these disappointing results as follows:

What is really going on is that the new language is somehow slowly getting welded awkwardly onto the old; meanings in the broadest sense are largely understood, but contexts of use not yet clearly grasped.

He goes on:

This should not surprise, of course. The fact that students have encountered new items in class, seen them once or twice or even three times in some kind of context, possibly translated them and more or less grasped their meanings is simply evidence of the fact that they’ve not yet been primed anywhere near sufficiently. For fluent users who’ve grasped new items, there’s been encounter after encounter after encounter, with item and with co-text in context; for learners, this process has only just begun, and as a result the odds of priming from L1 being brought over when it comes to using the new items creatively is very high indeed.

It also tempers the expectation one should have of the power and value of correction. I’m under no illusion that the detailed comments and extensive correction / recasting I carry out on student efforts (see below) will somehow magically result in correct and fluent use henceforth. Rather, I see my work here simply as further efforts to prime and to draw attention to glitches, misconceptions, perennial misuses and so on; in short, I am merely a condensed and rather more focused part of the priming process.

What else you realise is the sheer futility of trying to explain much error through the filter of grammar. Take the first sentence shown below – The area has been deserted after a huge flooding 3 years ago. What’s a dogged grammar hound to do here? Point out that if we’re using AFTER when talking about something that happened three years ago, we’d generally use the past simple, so if we want to use the present perfect, it’d be better to use SINCE? If we’re talking about flooding, it’s usually uncountable and thus kill the A? Even if you were to do this, you’d still be left with: The area has been deserted since huge flooding three years ago, which still sounds very stilted and forced. Often, the only real solution to the morass of oddness these sentences throw one into is rather severe reworking, with options sometimes given, questions sometimes asked, and explanations often proffered.

Dellar concludes that when we’re teaching new vocabulary, we need to pay careful attention to “how well we’re priming students”. Limiting instruction and feedback to single ways of saying things, or short ungrammaticalised chunks / collocations gives students little chance of “really coming to terms with the ways in which new items are typically used with previously learned grammar and vocabulary, or the kinds of (often fairly limited) contexts in which items are used”.

Dellar finishes his blog post with this:

Any of you who ever have to deal with student writing as they prepare to do degrees or Master’s in English, where all the kinds of issues seen above are compounded with serious discoursal and structural issues, spelling problems, paragraphing anomalies, and so on will know what I mean when I claim that prevention is infinitely preferable to cure.

And that the medicine needed really isn’t all that much to do with grammar as we know it!


Let’s start with language errors made by L2 learners. Dellar ignores the work done by researchers on this subject.

We can begin with contrastive analysis research, notably Fries (1945), which suggested that errors are the result of transfer from the L1. Then came research in the 60s which showed that errors were not simply explained by L1 transfer; the same errors were commonly made by all language learners, regardless of their L1. Corder’s (1967) seminal work argued that errors were indications of learners’ attempts to figure out an underlying rule-governed system. Corder distinguished between errors and mistakes: mistakes are slips of the tongue and not systematic, whereas errors are indications of an as yet non-native-like, but nevertheless systematic, rule-based grammar. Here, Corder is suggesting that learning an L2 is a cognitive process, not a mindless (sic – for behaviourists, the construct of mind is anathema) process of responding to stimuli from the environment, where learners work with their own ideas about the L2, which slowly approximate to a native speaker model. This “interlanguage development” theory received its first full expression in Selinker’s (1972) paper, which argues that L2 learners develop their own autonomous mental grammar with its own internal organising principles. Selinker uses the word “grammar”, as do all applied linguistic scholars, to refer to the system and structure of a language, concentrating on syntax, but including morphology, phonology and semantics.

Dellar claims that “the vast majority of mistakes really aren’t to do with grammar!”. He is, quite simply, wrong, as thousands of studies attest. Errors in the output of learners of English as an L2 are usually categorized in terms of lexical, grammatical, phrasing, and pragmatic errors, with punctuation added when looking at written texts. There is not a single study that I know of on this subject which doesn’t give grammatical errors as the most frequent type of error. Here’s an example.

MacDonald (2016) found from an examination of written texts in English of Spanish university students that grammar errors made up the majority of errors.

Dellar’s view of language learning

I’ve dealt with Dellar’s view of language learning in a separate post, so let me focus here on his use of “priming” as an explanation of how people learn an L2. There are, at the moment, two rival (and incompatible) views of second language acquisition (SLA). The first is that it’s a cognitive process involving the development of interlanguage, helped by innate knowledge of how language works. The second is that learning an L2 is the same as learning anything else, including the L1: it’s a learning process caused by responding to stimuli in the environment. This is a modern version of behaviourism and it’s motivated by a modern type of empiricism: language use emerges from social interaction, and only very basic statistical operations in the mind, based on the power law principle, are enough to explain how people learn an L2. These usage-based theories come in various forms and are referred to under the umbrella term “emergentism”. I think the best exponent of this view is Nick Ellis.

Priming is mostly associated with emergentist theories of SLA; it stresses frequency effects. But it’s complicated. How does priming occur? Is it unconscious? Is Schmidt’s “Noticing” construct compatible with the construct of priming? In his blog post, Dellar says:

What is really going on is that the new language is somehow slowly getting welded awkwardly onto the old

While that’s not what anybody who argues for an interlanguage development view of SLA would claim (new language doesn’t get “awkwardly welded onto the old”), it sounds as if Dellar is suggesting that learners do develop an increasingly sophisticated model of the target language. If he is, then this clashes with his insistence that priming is what explains SLA. Dellar says

The fact that students have encountered new items in class, seen them once or twice or even three times in some kind of context, possibly translated them and more or less grasped their meanings is simply evidence of the fact that they’ve not yet been primed anywhere near sufficiently. For fluent users who’ve grasped new items, there’s been encounter after encounter after encounter, with item and with co-text in context; for learners, this process has only just begun, and as a result the odds of priming from L1 being brought over when it comes to using the new items creatively is very high indeed.

First, pace Dellar, “encounter after encounter after encounter, with item and with co-text in context” is not a necessary condition for learning a language, as the daily inventive output of English users makes clear. Millions of times a day, fluent users of English as an L2 use combinations of items that they’ve NEVER encountered before, not even once. If Dellar wants to adopt a strictly “priming”, usage-based view of SLA, then he has to explain this. As Eubank and Gregg (2002) say: “… it is precisely because rules have a deductive structure that one can have instantaneous learning… With the English past tense rule, one can instantly determine the past tense form of “zoop” without any prior experience of that verb… If all we know is that John zoops wugs, then we know instantaneously that John zoops, that he might have zooped yesterday and may zoop tomorrow, that he is a wug-zooper who engages in wug-zooping, that whereas John zoops, two wug-zoopers zoop, that if he’s a Canadian wug-zooper he’s either a Canadian or a zooper of Canadian wugs (or both), etc.  We know all this without learning it, without even knowing what “wug” and “zoop” mean”.
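Eubank and Gregg’s point about the deductive structure of rules can be sketched in a few lines of code. This is a toy illustration of my own, not anything from their paper: the rule functions below are drastic simplifications (they ignore irregular verbs and spelling complications), but they show how a rule, once known, applies instantly to a verb like “zoop” that has never been encountered before.

```python
# Toy illustration: a deductive rule generalises instantly to novel items.
# No "priming" or repeated exposure to "zoop" is needed.

def past_tense(verb: str) -> str:
    """Apply the regular English past-tense rule to a verb stem."""
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"

def third_person(verb: str) -> str:
    """Apply the regular third-person singular -s rule."""
    return verb + "s"

def agent_noun(verb: str) -> str:
    """Derive an agent noun (zoop -> zooper) by the regular -er rule."""
    return verb + "er"

# The rules apply at once to a verb we have never seen:
print(past_tense("zoop"))    # zooped
print(third_person("zoop"))  # zoops
print(agent_noun("zoop"))    # zooper
```

The point of the sketch is simply that knowledge of the rule, not frequency of encounter, does the work here: the functions produce correct forms for “zoop” on the very first call.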

Second, Dellar wants to explain failure to learn “new items” of the L2 by appeal to insufficient priming. But that is not how a great many scholars (including Eubank and Gregg, of course, and a legion of others) would explain it, and it’s not how those in the emergentist camp would explain it either. Dellar says that learning depends on priming, without explaining what priming refers to. Elsewhere, Dellar has said that he uses the construct “priming” to refer to lexical priming, not structural or syntactic priming, and that he bases himself on Hoey’s 2005 book. Hoey says that priming amounts to this: “every time we use a word, and every time we encounter it anew, the experience either reinforces the priming by confirming an existing association between the word and its co-texts and contexts, or it weakens the priming, if we encounter a word in unfamiliar contexts” (Hoey, 2005).  Note that there is absolutely no way that such a statement can be tested by appeal to empirical evidence; Hoey’s theory is circular. Until the construct of “priming” is operationally defined in such a way that statements about it are open to empirical refutation, it remains a mysterious construct that people like Dellar can use as they want. Furthermore, Dellar fails to explain how his insistence that ELT should focus on the explicit teaching of lexical chunks can be reconciled with Hoey’s insistence that lexical priming is a psycholinguistic phenomenon that refers to implicit, unconscious learning.


Which brings us to the third matter: Dellar’s approach to teaching. We get a glimpse of it when he talks of “the sheer futility” of explaining error “through the filter of grammar”. Using the example of a student who wrote

The area has been deserted after a huge flooding 3 years ago

he asks “What’s a dogged grammar hound to do here?” and proceeds to lampoon the advice such grammar hounds might offer. He concludes that their answer

The area has been deserted since huge flooding three years ago

“still sounds very stilted and forced”, and he suggests that the text needs “rather severe reworking”, no doubt so as to include some of his beloved lexical chunks. Well, “The area has been deserted since huge flooding three years ago” sounds OK to me, and reading Dellar’s own work is enough to raise serious questions about his ability to judge the coherence and cohesion of written texts. In any case, I think most students would benefit more from the recast Dellar thinks the grammar hounds would arrive at than from Dellar’s own feedback, as evidenced in the examples he provides. What, one wonders, is the effect on a student of that kind of feedback? How does such severe reworking get welded on to the student’s current model of English? Dellar pours scorn on conventional grammar teaching, but his attempts to incorporate his own “bottom-up grammar” into his lexical approach are bewildering – see this recording.

Dellar’s preoccupation with the importance of lexical chunks informs his view of ELT. “Don’t teach grammar, teach lexical chunks” is the message. Rather than appreciate the fact that language learning is essentially a matter of implicit learning, and that any type of synthetic syllabus, be it grammar based or lexical chunk based, is fatally flawed, Dellar insists, like nobody else in the commercial field of ELT, that explicit teaching (of lexical chunks in context) should drive language learning. He talks about the problems he had in his attempts to teach students 50 lexical chunks a week, but what did he learn? Not that it’s an impossible task to teach learners the tens of thousands of lexical chunks native speakers use, nor even that there are principled ways of reducing the number. No, all he learnt was that the lexical chunks need to be embedded in context.

Dellar’s lexical chunks, served up every few days on his website, now number well over 200. What informs inclusion in this motley collection? And how are they all to be sufficiently “primed” so as to form part of the learner’s procedural knowledge of English?

For a fuller assessment of Dellar’s views of ELT, see separate posts here and here.

Finally, what about Dellar’s conclusion?

Any of you who ever have to deal with student writing as they prepare to do degrees or Master’s in English, where all the kinds of issues seen above are compounded with serious discoursal and structural issues, spelling problems, paragraphing anomalies, and so on will know what I mean when I claim that prevention is infinitely preferable to cure.

And that the medicine needed really isn’t all that much to do with grammar as we know it!

But is prevention better than cure when it comes to ELT?  Should teachers strive to prevent their students from making mistakes, rather than helping them to learn from mistakes? And, in the unlikely event that you reply “Yes, they should”, then what’s the preventive medicine? Learning by heart fifty randomly selected lexical chunks, along with contexts, every week?

References

Corder, S. P. (1967). The Significance of Learners’ Errors. International Review of Applied Linguistics in Language Teaching, 5, 161-170.

Eubank, L., & Gregg, K. (2002). News Flash—Hume Still Dead. Studies in Second Language Acquisition, 24(2), 237-247.

Fries, C. C. (1945). Teaching and Learning English as a Foreign Language. Ann Arbor: University of Michigan Press.

Hoey, M. (2005). Lexical Priming: A New Theory of Words and Language. Oxford: OUP.

MacDonald, P. (2016). “We All Make Mistakes!” Analysing an Error-coded Corpus of Spanish University Students’ Written English. Complutense Journal of English Studies, 24, 103-129.

Selinker, L. (1972). Interlanguage. International Review of Applied Linguistics in Language Teaching, 10, 209-241.


Radical ELT Part One

Recently I appealed for help in writing the final chapter of a book Mike Long and I are doing on ELT. The chapter is called Radical ELT: Signs of struggle: Towards an alternative organization of ELT and I’d like to thank all those who have been in touch. I’ve had replies from lots of radicals, all doing great things to challenge the interlocking publishing, teaching, teacher-training, and testing hydra that makes up the current $200 billion ELT industry, an industry whose prime motivation, profit, leads inevitably to the commodification of education, with disastrous consequences for almost everybody concerned. Woops! I should have said, perhaps, that they’re all making significant contributions to on-going attempts to change ELT practice in such a way that students and teachers benefit.

In this post, the first of a series dedicated to radicals working in ELT, I’d like to highlight the work being done by Nick Bilbrough.

The Hands Up Project  (Click the link to go to their website)

Nick Bilbrough is the founder and main mover of this project, which aims to help kids in Gaza and the Occupied West Bank learn English. Five years ago, using simple video conferencing tools, he started connecting online to a small group of children in a library in Beit Hanoun, Gaza for weekly storytelling sessions. Now, to quote from the website, “the Hands Up Project works with over thirty different groups in Gaza. More than 500 kids a week now connect to volunteers around the world who work in collaboration with the local teacher to tell stories to each other, to play games and to do other activities to help bring the English that the children are learning to life”.

Just last week, Nick organized an online session where the winners of the “Toothbrush and other plays” competition were announced. I joined the 100+ people online for the event, and I have to say it was incredibly moving to watch so many kids from all over the world taking part, all of them doing their bit to support their friends in Gaza and the occupied West Bank. The authors and actors all had their say; kids from Brazil did a play written by kids in Gaza; the solidarity and human warmth of everybody involved was truly inspirational.

By far the most important thing about the Hands Up project is its brave political stance, its support for Palestinians and all those who are marginalized by the policies of the Israeli government. We should all speak out against the long-standing abuses of human rights, the illegal expansion of territory, and the apartheid policies of the Israeli government which are, shamefully, condoned by the US and UK governments, among so many others.

When I talked to Nick on the phone, he said he agreed with Scott Thornbury’s views of ELT, his dismissal of coursebooks, his emphasis on communicative practice. (Scott, by the way, is a trustee of the Hands Up project and has done a lot of work for them, “Invaluable! Nobody else could have done it”, Nick said.) I asked him why he called himself a radical. “Because I’m trying to give a voice to those without a voice” he said. “Empowerment” was a word he used a lot. And he didn’t know a better way of empowering learners than by storytelling and putting on plays.

Now, I don’t want to claim Nick as a trophy, a signed-up supporter of TBLT (although I reckon he’d be pleased to be counted as a signed-up supporter of Dogme), but it’s important to note that Nick and all those working on the Hands Up project reject current ELT practice. They care little for the CEFR, they care less for a PPP approach, and they care absolutely nothing for coursebooks. They DO English. They involve their students in storytelling, in shooting the breeze, and in the collaborative work of writing and putting on plays. That’s the focus of their work, of their classes, which together make up a coherent, exciting, alternative syllabus (sic). It would now be possible for Nick, a highly qualified and experienced teacher of English as an L2, with lots of potential commercial backers, to organize more conventional English courses, using coursebooks with all their bells and whistles, gladly donated by a savvy publisher. But Nick’s a radical, and so are all those in his growing team.

Can online teaching be a force for change?

Covid-19 has forced teachers of English as an L2 to switch to online platforms, Zoom being the most popular. One of the results has been a lot of discussion among teachers on social media about how best to adapt their practice to the new environment. Not surprisingly, most of the discussion is about technical issues, about the mechanics of how “usual”, “normal” classroom-based teaching practices can best be transferred to Zoom sessions. But I find it encouraging to see that a significant part of the discussion is about frustration at the unsatisfactory level and quality of student participation.  And from that, I dare to suggest, as an unintended consequence, come questions about the efficacy of coursebook-driven ELT.

In a normal classroom session, the teacher is in the same place with a group of students, and the fact that the teacher talks most of the time; leads the students through a set of activities which mostly involve them in working, often with their heads in the coursebook, on bits of the language; and gets them to engage in real communication among themselves for very short periods of time, goes unnoticed and unremarked. It’s normal! Still, everybody’s together, there’s often a good, shared atmosphere, and the skillful teacher moves around the students, checking and encouraging, making the classroom session friendly, purposeful, and well-structured.

But the online version of the same session is more likely to fall flat, and the lack of real communication among the group is thrown into stark relief. Typically, it’s the “production” part of the PPP methodology in online classes that doesn’t work, and, I suggest, that’s hardly surprising. If you use coursebooks in an online environment, their basic focus on talking ABOUT the language, of studying the language as an object, is magnified. Students perhaps feel more keenly that they’re here to study the language, to be told stuff, to learn that particular bit of the book.

The alternative is to use the online environment to talk IN the language and to organize the classes so that genuine communication among the group is the predominant ingredient of each session. Let’s suppose that the class is about job interviews. In a coursebook, this is, let’s say, Unit 3. The “Lead In” activity might be

“Have you ever been for a job interview? In pairs, talk about: What job was it? Who interviewed you? What happened?”.

The problem is, that activity is one of ten, it’s allotted 10 minutes, after which the REAL FOCUS of the lesson – selected bits of vocabulary and a grammar point (perhaps the present perfect) – is then developed through a series of activities, most of which involve students studying the language. How do you organize that on Zoom? Well, you use breakout groups for group discussion, but that takes up a small proportion of the total time, and it’s rightly perceived as peripheral to the “real” job of learning.

In contrast, in a TBLT course, a series of lessons deals with job interviews, if this is identified as a need for those doing the course. In the first lesson, we concentrate on relevant input, and simple productive tasks. In subsequent lessons, we get students to talk to each other about various parts of a job interview, slowly leading towards getting them to do a simulation of such an interview. Every lesson is organized around their using the language, talking to each other, where the teacher gives help with vocabulary and grammar reactively, when it’s needed.

In a Dogme course, if the students expressed an interest in job interviews, the focus of the lesson would be their discussion of the topic. There would be no pre-planned focus on particular grammar points or other formal aspects of the language, and MOST of classroom time would be devoted to communicative activities.

If you emphasise learning by doing, as you do in TBLT and Dogme approaches, then you prioritise student participation, and you make it clear that that’s what you expect students to do. As a result, the online Zoom sessions are much more likely to be perceived by students as events where talking to each other is the main point.

There is no doubt that interest in alternatives to coursebook-driven ELT has grown dramatically as a result of the rise in online teaching, which has inspired teachers to take a fresh look at what they’re doing. Why is teaching English online with a coursebook such a drag? Because the vital, ameliorating effects of teachers working their magic in a classroom can’t rescue it.

On the other hand, TBLT, where tasks, not linguistic items, are the units of analysis for syllabus design, lends itself to online teaching, because tasks naturally involve students more – they demand active student participation, as they do in the classroom too, of course. Tasks, not presentations, are the natural organizing principle for a Zoom session; Zoom is not the obvious home of a PPP methodology and the mentality it encourages. Tasks can be organized on Zoom in such a way that they naturally lead to the kinds of interaction required for language learning – learning by doing.

Likewise, Dogme, in its rejection of coursebooks, its brave, inspirational insistence on the core values of communicative language teaching, offers an enticing alternative to Zoom sessions devoted to teaching “McNuggets”.

So, when we discuss online teaching, let’s not just discuss adapting what we do to an online platform. Let’s discuss radical change. Change from global, commodified, coursebook-driven ELT to local responses to local needs. There’s no doubt that Dogme and TBLT are leading the way, and I hope more teachers will join the rising numbers informing themselves about these exciting alternatives which might just, at last, threaten the thirty-year-old hegemony of coursebooks and the related paraphernalia of the CEFR, high stakes proficiency exams, CELTA, and all that.

An Appeal for help

Mike Long and I are writing a book on ELT. The final chapter is Radical ELT: Signs of struggle: Towards an alternative organization of ELT.

I would be grateful for suggestions on what to include in the chapter. We want to highlight the work of any group or individual fighting the current ELT industry, working towards a fairer, more efficacious way of organizing ELT. Please would anybody involved in alternative approaches to syllabuses, materials, testing and teacher education get in touch. The Hands Up project, Mosaic Education, cooperatives like SLB, are just three examples of what we consider to be promising, progressive initiatives.

Please use the Comments section.