Theoretical Constructs in SLA

Given the recent discussion on Twitter of Native Speakers, where I defended the construct as a useful one for SLA research, below is a short piece I wrote for Robinson, P. (ed.) (2013) The Encyclopedia of SLA. London: Routledge. In the Twitter thread I wrote, I criticised those in the sociolinguistics field who adopt a relativist epistemology. The post I did on the work of Adrian Holliday might be of interest.

1. Introduction
Theoretical constructs in SLA include such terms as interlanguage, variable competence, motivation, and noticing. These constructs are used in the service of theories which attempt to explain phenomena, and thus, in order to understand how the term “theoretical construct” is used in SLA, we must first understand the terms “theory” and “phenomena”.

A theory is an attempt to answer a question, usually a “Why” or “How” question. The “Critical Period” theory (see Birdsong, 1999) attempts to answer the question “Why do most L2 learners not achieve native-like competence?” The Processability Theory (Pienemann, 1998) attempts to answer the question “How do L2 learners go through stages of development?” In posing the question that a theory seeks to answer, we refer to “phenomena”: the things that we isolate, define, and then attempt to explain in our theory. In the case of theories of SLA, key phenomena are transfer, staged development, systematicity, variability and incompleteness. (See Towell and Hawkins, 1994: 15.)

A clear distinction must be made between phenomena and observational data. Theories attempt to explain phenomena, and observational data are used to support and test those theories. The important difference between data and phenomena is that the phenomena are what we want to explain, and thus, they are seen as the result of the interaction between some manageably small number of causal factors, instances of which can be found in different situations. By contrast, any type of causal factor can play a part in the production of data, and the characteristics of these data depend on the peculiarities of the experimental design, or data-gathering procedures, employed. As Bogen and Woodward put it: “Data are idiosyncratic to particular experimental contexts, and typically cannot occur outside those contexts, whereas phenomena have stable, repeatable characteristics which will be detectable by means of different procedures, which may yield quite different kinds of data” (Bogen and Woodward, 1988: 317). A failure to appreciate this distinction often leads to poorly-defined theoretical constructs, as we shall see below.

While researchers in some fields deal with such observable phenomena as bones, tides, and sun spots, others deal with non-observable phenomena such as love, genes, hallucinations, gravity and language competence. Non-observable phenomena have to be studied indirectly, which is where theoretical constructs come in. First we name the non-observable phenomena, we give them labels and then we make constructs. With regard to the non-observable phenomena listed above (love, genes, hallucinations, gravity and language competence), examples of constructs are romantic love, hereditary genes, schizophrenia, the bends, and the Language Acquisition Device. Thus, theoretical constructs are one remove from the original labelling, and they are, as their name implies, packed full of theory; they are, that is, proto-typical theories in themselves, a further invention of ours, an invention made in our attempt to pin down the non-observable phenomena that we want to examine so that the theories which they embody can be scrutinised. It should also be noted that there is a certain ambiguity in the terms “theoretical construct” and “phenomenon”. The “two-step” process of naming a phenomenon and then a construct outlined above is not always so clear: for Chomsky (Chomsky, 1986), “linguistic competence” is the phenomenon he wants to explain, but to many it has all the hallmarks of a theoretical construct.

Constructs are not the same as definitions; while a definition attempts to clearly distinguish the thing defined from everything else, a construct attempts to lay the ground for an explanation. Thus, for example, while a dictionary defines motivation in such a way that motivation is distinguishable from desire or compulsion, Gardner (1985) attempts to explain why some learners do better than others, and he uses the construct of motivation to do so, in such a way that his construct takes on its own meaning, and allows others in the field to test the claims he makes. A construct defines something in a special way: it is a term used in an attempt to solve a problem, indeed, it is often a term that in itself suggests the answer to the problem. Constructs can be everyday parlance (like “noticing” and “competence”) and they can also be new words (like “interlanguage”), but, in all cases, constructs are “theory-laden” to the maximum: their job is to support a hypothesis, or, better still, a full-blown theory. In short, then, the job of a construct is to help define and then solve a problem.

2. Criteria for assessing theoretical constructs used in theories of SLA

There is a lively debate among scholars about the best way to study and understand the various phenomena associated with SLA. Those in the rationalist camp insist that an external world exists independently of our perceptions of it, and that it is possible to study different phenomena in this world, to make meaningful statements about them, and to improve our knowledge of them by appeal to logic and empirical observation. Those in the relativist camp claim that there are a multiplicity of realities, all of which are social constructs. Science, for the relativists, is just one type of social construction, a particular kind of language game which has no more claim to objective truth than any other. This article rejects the relativist view and, based largely on Popper’s “Critical Rationalist” approach (Popper, 1972), takes the view that the various current theories of SLA, and the theoretical constructs embedded in them, are not all equally valid, but rather, that they can be critically assessed by using the following criteria (adapted from Jordan, 2004):

1. Theories should be coherent, cohesive, expressed in the clearest possible terms, and consistent. There should be no internal contradictions in theories, and no circularity due to badly-defined terms.
2. Theories should have empirical content. Having empirical content means that the propositions and hypotheses proposed in a theory should be expressed in such a way that they are capable of being subjected to tests, based on evidence observable by the senses, which support or refute them. These tests should be capable of replication, as a way of ensuring the empirical nature of the evidence and the validity of the research methods employed. For example, the claim “Students hate maths because maths is difficult” has empirical content only when the terms “students”, “maths”, “hate” and “difficult” are defined in such a way that the claim can be tested by appeal to observable facts. The operational definition of terms, and crucially, of theoretical constructs, is the best way of ensuring that hypotheses and theories have empirical content.
3. Theories should be fruitful. “Fruitful” in Kuhn’s sense (see Kuhn, 1962:148): they should make daring and surprising predictions, and solve persistent problems in their domain.

Note that the theory-laden nature of constructs is no argument for a relativist approach: we invent constructs, as we invent theories, but we invent them, precisely, in a way that allows them to be subjected to empirical tests. The constructs can be anything we like: in order to explain a given problem, we are free to make any claim we like, in any terms we choose, but the litmus test is the clarity and testability of these claims and the terms we use to make them. Given its pivotal status, a theoretical construct should be stated in such a way that we all know unequivocally what is being talked about, and it should be defined in such a way that it lays itself open to principled investigation, empirical and otherwise. In the rest of this article, a number of theoretical constructs will be examined and evaluated in terms of the criteria outlined above.

3. Krashen’s Monitor Model

The Monitor Model (see Krashen, 1985) is described elsewhere, so let us here concentrate on the deficiencies of the theoretical constructs employed. In brief, Krashen’s constructs fail to meet the requirements of the first two criteria listed above: Krashen’s use of key theoretical constructs such as “acquisition and learning”, and “subconscious and conscious” is vague, confusing, and not always consistent. More fundamentally, we never find out what exactly “comprehensible input”, the key theoretical construct in the model, means. Furthermore, in conflict with the second criterion listed above, there is no way of subjecting the set of hypotheses that Krashen proposes to empirical tests. The Acquisition-Learning hypothesis gives no evidence to support the claim that two distinct systems exist, nor any means of determining whether they are, or are not, separate. Similarly, there is no way of testing the Monitor hypothesis: since the Monitor is nowhere properly defined as an operational construct, there is no way to determine whether the Monitor is in operation or not, and it is thus impossible to determine the validity of the extremely strong claims made for it. The Input Hypothesis is equally mysterious and incapable of being tested: the levels of knowledge are nowhere defined and so it is impossible to know whether i + 1 is present in input, and, if it is, whether or not the learner moves on to the next level as a result. Thus, the first three hypotheses (Acquisition-Learning, the Monitor, and Natural Order) make up a circular and vacuous argument: the Monitor accounts for discrepancies in the natural order, the learning-acquisition distinction justifies the use of the Monitor, and so on.

In summary, Krashen’s key theoretical constructs are ill-defined and circular, so that the set is incoherent. This incoherence means that Krashen’s theory has such serious faults that it is not really a theory at all. While Krashen’s work may be seen as satisfying the third criterion on our list, and while it is extremely popular among EFL/ESL teachers (even among those who, in their daily practice, ignore Krashen’s clear implication that grammar teaching is largely a waste of time), the fact remains that his series of hypotheses is built on sand. A much better example of a theoretical construct put to good use is Schmidt’s Noticing, which we will now examine.

4. Schmidt’s Noticing Hypothesis

Schmidt’s Noticing hypothesis (see Schmidt, 1990) is described elsewhere. Essentially, Schmidt attempts to do away with the “terminological vagueness” of the term “consciousness” by examining three senses of the term: consciousness as awareness, consciousness as intention, and consciousness as knowledge. Consciousness and awareness are often equated, but Schmidt distinguishes between three levels: Perception, Noticing and Understanding. The second level, Noticing, is the key to Schmidt’s eventual hypothesis. The importance of Schmidt’s work is that it clarifies the confusion surrounding the use of many terms used in psycholinguistics (not least Krashen’s “acquisition/learning” dichotomy) and, furthermore, it develops one crucial part of a general processing theory of the development of interlanguage grammar.

Our second evaluation criterion requires that theoretical constructs are defined in such a way as to ensure that hypotheses have empirical content, and thus we must ask: what exactly does Schmidt’s concept of noticing refer to, and how can we be sure when it is, and is not, being used by L2 learners? In his 1990 paper, Schmidt claims that noticing can be operationally defined as “the availability for verbal report”, “subject to various conditions”. He adds that these conditions are discussed at length in the verbal report literature, but he does not discuss the issue of operationalisation any further. Schmidt’s 2001 paper gives various sources of evidence of noticing, and points out their limitations. These sources include learner production (but how do we identify what has been noticed?), learner reports in diaries (but diaries span months, while cognitive processing of L2 input takes place in seconds, and making diaries requires not just noticing but also reflexive self-awareness), and think-aloud protocols (but we cannot assume that the protocols identify all the examples of target features that were noticed).

Schmidt argues that the best test of noticing is that proposed by Cheesman and Merikle (1986), who distinguish between the objective and subjective thresholds of perception. The clearest evidence that something has exceeded the subjective threshold and been noticed is a concurrent verbal report, since nothing can be verbally reported other than the current contents of awareness. Schmidt adds that “after the fact recall” is also good evidence that something was noticed, providing that prior knowledge and guessing can be controlled. For example, if beginner level students of Spanish are presented with a series of Spanish utterances containing unfamiliar verb forms, and are then asked to recall immediately afterwards the forms that occurred in each utterance, and can do so, that is good evidence that they noticed them. On the other hand, it is not safe to assume that failure to do so means that they did not notice. It seems that it is easier to confirm that a particular form has not been noticed than that it has: failure to achieve above-chance performance in a forced-choice recognition test is a much better indication that the subjective threshold has not been exceeded and that noticing did not take place.
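To make “above-chance performance” a little more concrete, here is a minimal sketch of how one might check a learner’s score on a forced-choice recognition test against pure guessing. It is only an illustration: the source does not say which statistical test should be used, so the exact one-sided binomial test, the function names, and the 0.05 cut-off are all my assumptions.

```python
from math import comb


def chance_tail_probability(correct: int, trials: int, n_alternatives: int) -> float:
    """Probability of getting at least `correct` items right by guessing alone.

    Assumes (for illustration) that every item offers `n_alternatives`
    equally likely options, so the chance success rate is 1 / n_alternatives.
    """
    if trials < 1:
        raise ValueError("trials must be at least 1")
    if not 0 <= correct <= trials:
        raise ValueError("correct must lie between 0 and trials")
    if n_alternatives < 2:
        raise ValueError("a forced-choice item needs at least 2 alternatives")
    p = 1.0 / n_alternatives
    # Exact upper tail of the binomial distribution: P(X >= correct).
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(correct, trials + 1)
    )


def above_chance(correct: int, trials: int, n_alternatives: int,
                 alpha: float = 0.05) -> bool:
    """True if the score would be unlikely (tail probability < alpha) under guessing."""
    return chance_tail_probability(correct, trials, n_alternatives) < alpha
```

With these assumptions, a learner who recognises 16 of 20 forms on a two-alternative test has a tail probability of about 0.006 and counts as above chance, while 11 of 20 (about 0.41) does not. The sketch only quantifies “chance”; it leaves untouched the interpretive problems with positive and negative results discussed above.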

Schmidt goes on to claim that the noticing hypothesis could be falsified by demonstrating the existence of subliminal learning, either by showing positive priming of unattended and unnoticed novel stimuli, or by showing learning in dual task studies in which central processing capacity is exhausted by the primary task. The problem in this case is that, in positive priming studies, one can never really be sure that subjects did not allocate any attention to what they could not later report, and similarly, in dual task experiments, one cannot be sure that no attention is devoted to the secondary task. In conclusion, it seems that Schmidt’s noticing hypothesis rests on a construct that still has difficulty measuring up to the second criterion of our list; it is by no means easy to properly identify when noticing has and has not occurred. Despite this limitation, however, Schmidt’s hypothesis is still a good example of the type of approach recommended by the list. Its strongest virtues are its rigour and its fruitfulness. Schmidt argues that attention as a psychological construct refers to a variety of mechanisms or subsystems (including alertness, orientation, detection within selective attention, facilitation, and inhibition) which control information processing and behaviour when existing skills and routines are inadequate. Hence, learning in the sense of establishing new or modified knowledge, memory, skills and routines is “largely, perhaps exclusively a side effect of attended processing” (Schmidt, 2001: 25). This is a daring and surprising claim, with similar predictive ability, and it contradicts Krashen’s claim that conscious learning is of extremely limited use.

5. Variationist approaches

An account of these approaches is given elsewhere. In brief, variable competence, or variationist, approaches use the key theoretical construct of “variable competence”, or, as Tarone calls it, “capability”. Tarone (1988) argues that “capability” underlies performance, and that this capability consists of heterogeneous “knowledge” which varies according to various factors. Thus, there is no homogeneous competence underlying performance but a variable “capacity” which underlies specific instances of language performance. Ellis (1987) uses the construct of “variable rules” to explain the observed variability of L2 learners’ performance: learners, by successively noticing forms in the input which are in conflict with the original representation of a grammatical rule, acquire more and more versions of the original rule. This leads to either “free variation” (where forms alternate in all environments at random) or “systematic variation” (where one variant appears regularly in one linguistic context, and another variant in another context).

The root of the problem of the variable competence model is the weakness of its theoretical constructs. The underlying “variable competence” construct used by Tarone and Ellis is nowhere clearly defined, and is, in fact, simply asserted to “explain” a certain amount of learner behaviour. As Gregg (1990: 368) argues, Tarone and Ellis offer a description of language use and behaviour, which they confuse with an explanation of the acquisition of grammatical knowledge. By abandoning the idea of a homogeneous underlying competence, Gregg says, we are stuck at the surface level of the performance data, and, consequently, any research project can only deal with the data in terms of the particular situation it encounters, describing the conditions under which the experiment took place. The positing of any variable rule at work would need to be followed up by an endless number of further research projects looking at different situations in which the rule is said to operate, each of which is condemned to uniqueness, no generalisation about some underlying cause being possible.

At the centre of the variable competence model are variable rules. Gregg argues cogently that such variability cannot become a theoretical construct used in attempts to explain how people acquire linguistic knowledge. In order to turn the idea of variable rules from an analytical tool into a theoretical construct, Tarone and Ellis would have to grant psychological reality to the variable rules (which in principle they seem to do, although no example of a variable rule is given) and then explain how these rules are internalised, so as to become part of the L2 learner’s grammatical knowledge of the target language (which they fail to do). The variable competence model, according to Gregg, confuses descriptions of the varying use of forms with an explanation of the acquisition of linguistic knowledge. The forms (and their variations) which L2 learners produce are not, indeed cannot be, direct evidence of any underlying competence – or capacity. By erasing the distinction between competence and performance “the variabilist is committed to the unprincipled collection of an uncontrolled mass of data” (Gregg 1990: 378).

As we have seen, a theory must explain phenomena, not describe data. In contradiction to this, and to criteria 1 and 2 in our list, the arguments of Ellis and Tarone are confused and circular; in the end what Ellis and Tarone are actually doing is gathering data without having properly formulated the problem they are trying to solve, i.e. without having defined the phenomenon they wish to explain. Ellis claims that his theory constitutes an “ethnographic, descriptive” approach to SLA theory construction, but he does not answer the question: How does one go from studying the everyday rituals and practices of a particular group of second language learners through descriptions of their behaviour to a theory that offers a general explanation for some identified phenomenon concerning the behaviour of L2 learners?

Variable Competence theories exemplify what happens when the distinction between phenomena, data and theoretical constructs is confused. In contrast, Chomsky’s UG theory, despite its shifting ground and its contentious connection to SLA, is probably the best example of a theory where these distinctions are crystal clear. For Chomsky, “competence” refers to underlying linguistic (grammatical) knowledge, and “performance” refers to the actual day to day use of language, which is influenced by an enormous variety of factors, including limitations of memory, stress, tiredness, etc. Chomsky argues that while performance data is important, it is not the object of study (it is, precisely, the data): linguistic competence is the phenomenon that he wants to examine. Chomsky’s distinction between performance and competence exactly fits his theory of language and first language acquisition: competence is a well-defined phenomenon which is explained by appeal to the theoretical construct of the Language Acquisition Device. Chomsky describes the rules that make up linguistic competence and then invites other researchers to subject the theory that all languages obey these rules to further empirical tests.

6. Aptitude

Why is anybody good at anything? Well, they have an aptitude for it: they’re “natural” piano players, or carpenters, or whatever. This is obviously no explanation at all, although, of course, it contains a beguiling element of truth. To say that SLA is (partly) explained by an aptitude for learning a second language is to beg the question: What is aptitude for SLA? Attempts to explain the role of aptitude in SLA illustrate the difficulty of “pinning down” the phenomenon that we seek to explain. If aptitude is to be claimed as a causal factor that helps to explain SLA, then aptitude must be defined in such a way that it can be identified in L2 learners and then related to their performance.

Robinson (2007) uses aptitude as a construct that is composed of different cognitive abilities. His “Aptitude Complex Hypothesis” claims that different classroom settings draw on certain combinations of cognitive abilities, and that, depending on the classroom activities, students with certain cognitive abilities will do better than others. Robinson adds the “Ability Differentiation Hypothesis”, which claims that some L2 learners have different abilities than others, and that it is important to match these learners to instructional conditions which favor their strengths in aptitude complexes. In terms of classroom practice, these hypotheses might well be fruitful, but they do not address the question of how aptitude explains SLA.

One example of identifying aptitude in L2 learners is the CANAL-F theory of foreign language aptitude, which grounds aptitude in “the triarchic theory of human intelligence” and argues that “one of the central abilities required in FL acquisition is the ability to cope with novelty and ambiguity” (Grigorenko, Sternberg and Ehrman, 2000: 392). However successfully the test might predict learners’ ability, the theory fails to explain aptitude in any causal way. The theory of human intelligence that the CANAL-F theory is grounded in fails to illuminate the description given of FL ability; we do not get beyond a limiting of the domain in which the general ability to cope with novelty and ambiguity operates. The individual differences in foreign language learners’ ability are explained by suggesting that some are better at coping with novelty and ambiguity than others. Thus, whatever construct validity might be claimed for CANAL-F, and however well the test might predict ability, it leaves the question of what precisely aptitude at foreign language learning is, and how it contributes to SLA, unanswered.

How, then, can aptitude explain differential success in a causal way? Even if aptitude can be properly defined and measured without falling into the familiar trap of being circular (those who do well at language aptitude tests have an aptitude for language learning), how can we step outside the reference of aptitude and establish more than a simple correlation? What is needed is a theoretical construct.

7. Conclusion

The history of science throws up many examples of theories that began without any adequate description of what was being explained. Darwin’s theory of evolution by natural selection (the young born to any species compete for survival, and those young that survive to reproduce tend to embody favourable natural variations which are passed on by heredity) lacked any formal description of the theoretical construct “variation”, or any explanation of the origin of variations, or how they passed between generations. It was not until Mendel’s theories and the birth of modern genetics in the early 20th century that this deficiency was dealt with. But, and here is the point, dealt with it was: we now have constructs that pin down what “variation” refers to in the Darwinian theory, and the theory is stronger for them (i.e. more testable). Theories progress by defining their terms more clearly and by making their predictions more open to empirical testing.

Theoretical constructs lie at the heart of attempts to explain the phenomena of SLA. Observation must be in the service of theory: we do not start with data, we start with clearly-defined phenomena and theoretical constructs that help us articulate the solution to a problem, and we then use empirical data to test that tentative solution. Those working in the field of psycholinguistics are making progress thanks to their reliance on a rationalist methodology which gives priority to the need for clarity and empirical content. If sociolinguistics is to offer better explanations, the terms used to describe social factors must be defined in such a way that it becomes possible to do empirically-based studies that confirm or challenge those explanations. All those who attempt to explain SLA must make their theoretical constructs clear, and improve their definitions and research methodology in order to better pin down the slippery concepts that they work with.


Birdsong, D. (ed) (1999) Second Language Acquisition and the Critical Period Hypothesis. Mahwah, NJ: Lawrence Erlbaum Associates.

Bogen, J. and Woodward, J. (1988) Saving the phenomena. Philosophical Review 97: 303-52.

Chomsky, N. (1986) Knowledge of Language: Its Nature, Origin and Use. New York: Praeger.

Ellis, R. (1987) Interlanguage variability in narrative discourse: style-shifting in the use of the past tense. Studies in Second Language Acquisition 9, 1-20.

Gardner, R. C. (1985) Social psychology and second language learning: the role of attitudes and motivation. London: Edward Arnold.

Gregg, K. R. (1990) The Variable Competence Model of second language acquisition and why it isn’t. Applied Linguistics 11, 1. 364-83.

Grigorenko, E., Sternberg, R., and Ehrman, M. (2000) “A Theory-Based Approach to the Measurement of Foreign Language Learning Ability: The Canal-F Theory and Test.” The Modern Language Journal 84, iii, 390-405.

Jordan, G. (2004) Theory Construction in SLA. Amsterdam: John Benjamins.

Kuhn, T. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Krashen, S. (1985) The Input Hypothesis: Issues and Implications. New York: Longman.

Pienemann, M. (1998) Language Processing and Second Language Development: Processability Theory. Amsterdam: John Benjamins.

Popper, K. R. (1972) Objective Knowledge. Oxford: Oxford University Press.

Schmidt, R. (1990) The role of consciousness in second language learning. Applied Linguistics 11, 129-58.

Schmidt, R. (2001) Attention. In Robinson, P. (ed.) Cognition and Second Language Instruction. Cambridge: Cambridge University Press, 3-32.

Tarone, E. (1988) Variation in interlanguage. London: Edward Arnold.

Towell, R. and Hawkins, R. (1994) Approaches to second language acquisition. Clevedon: Multilingual Matters.

Summer Reading

Frustrated by online searches for new books to read in the garden (no way am I going near a Spanish beach this summer!) I looked through my own collection and picked the ones listed below. I’m sure many of you will have read some of them, but I hope one or two might tickle your fancy.

The Doll Factory is set in London, 1850, the year of the Great Exhibition. This is Victorian London, a truly awful place for most of its inhabitants. It tells the story of Iris, struggling to survive, an aspiring artist who meets a member of the Pre-Raphaelite group and agrees to model for him. Silas, a Dickensian bad guy if ever there was one, is the spanner in the works. It’s a great story and it’s beautifully written. Paula Hawkins, author of The Girl on the Train, describes it as “A sharp, scary, gorgeously evocative tale of love, art and delusion”. I bought it at an airport without much expectation and was blown away by the opening pages that draw you in to a wonderfully described environment and its main characters. A superb debut novel.     

There are five novels in Edward St. Aubyn’s Patrick Melrose series. I read the first one, Never Mind, when it came out and just couldn’t believe that anyone could write so well – it’s a masterpiece. Nobody else writes like this; St. Aubyn has to be one of the greatest stylists in the English language. St. Aubyn says that he wrote the novels as an act of investigative self-repair. The first novel tells of how, as a child, he was repeatedly sexually abused by his father while his mother turned a blind eye. This harrowing tale is told with quite extraordinary style; it is, IMHO, an unrivalled tour de force of literary elegance, sparkling in its wit and intelligence. The rest of the series recounts his father’s death, his loss of the huge family fortunes, and his eventual “redemption”. Hide the drugs while you read; if you felt hungry reading One Day in the Life of Ivan Denisovich, you’ll be sorely tempted to indulge in illicit substances while reading this. The books were recently made into a TV series starring Benedict Cumberbatch as Patrick. Fantastic as Cumberbatch’s acting is, you really must read the books.

Gompertz’s What Are You Looking At? gives a tremendously enjoyable history of modern art. It’s easy to read, mercifully free of all the precious, obscurantist stuff that art critics are famous for, and it tells its story with terrific anecdotes and illustrations. Highly recommended.

Russell’s Bird Lives is the best biography I’ve ever read (Mao: The Unknown Story by Jung Chang, and Mozart: A Life by Peter Gay, come close). It’s the biography of Charlie Parker and if you like jazz, you’ve almost certainly read it. It swings! Here’s one review:

“One of the very few jazz books that deserve to be called literature . . . perhaps the finest writing on jazz to be found anywhere. . . . Russell knows a lot about music, has a novelist’s eye for detail and a phonographic ear for jazz speech, and he swings a clean sports reporter’s style. He has poured these gifts into what must be the most exhaustively researched biography on a jazz musician ever published and miraculously catches the feel of a jazz performance, that impossible fusion of spontaneous freedom and total discipline. Those aware of Parker’s genius cannot do without this book.”  Grover Sales, Saturday Review

Greger’s How Not to Die is a must read. As someone who’s dedicated much of his life to drug abuse (not that I’ve ever got close to Edward St. Aubyn’s), I’m an unlikely fan of a book dedicated to clean living. But Greger doesn’t preach or tell you off – he just gives you facts about the damage modern meat and dairy produce do to us, and recommends that we adopt a vegan diet. The argument is compelling. Quite apart from how much we suffer, it’s evident that the planet we live on suffers even more than we do from our reliance on meat and dairy products. In our house now, we eat a small fraction of the meat and dairy stuff we used to: most of what we eat is unprocessed fruit and veg. “Stay away from processed foods” is the number one take away (geddit!). Yes, I know: beer, wine, vodka, cocaine, heroin and speed are all processed. I want to change to opium, but you wouldn’t believe how hard it is to find.

Stevick’s A Way and Ways is well described and commented on by Scott Thornbury in his S is for (Earl) Stevick post. The book looks at various innovative ways of doing ELT that were emerging at that time. It changed my life. It didn’t stop the drug abuse, but it stopped me teaching in the prescribed fashion that dominated ELT in the late 1970s and that now, in the 2020s, again dominates. Stevick was a key player in breaking the mold and ushering in a golden, alas short-lived, era of CLT. My good friend Mike Long (how we all miss him) used to get quickly riled up when I praised Stevick, and I now appreciate his concerns, but nevertheless, back in the early 1980s, Stevick was one of my gurus (Henry Widdowson – another of Mike Long’s bêtes noires – was another). When I worked at ESADE Idiomas in Barcelona, Earl was a regular visitor and I fondly remember hosting a lunch for Earl at our house in 1989. About a dozen of us sat around the table outside, shooting the breeze with one of the most charming and persuasive educationalists we’d ever meet. I’m going to spoil this now, but I can’t resist recounting what Earl said at that lunch. “Being quizzed by Geoff is like going to the dentist – it hurts, but it’s good for you”. Well, it got a laugh, and added a bit of spice, as if it were needed, to a lovely encounter with the great man. It’s never too late to read Stevick’s stuff for yourself. A Way and Ways is on my desk now, ready for its umpteenth reading this summer.

Finally, Coffield and Williamson’s From Exam Factories to Communities of Discovery, which I only bought recently, makes compelling reading. To paraphrase the blurb, it calls for educators to challenge the dominant market-led model of education and instead build a more democratic one, better able to face threats such as environmental damage; intensified global competition; corrosive social inequalities in and between nations in the world; and the need for a new, just and sustainable economic model. It shows how education policy has led to schools and universities becoming exam factories and further education colleges becoming skills factories. The authors propose an alternative future for education, which builds “communities of discovery” by realising the collective creativity of students and educators through democracy. Put it down from time to time, but its 80 pages are well worth slowly digesting.

I wish you all a great summer break; stay safe and happy reading.

The books

Macneal, E. (2019) The Doll Factory. Picador.

St. Aubyn, E. (2016) The Melrose novels (the 5 novels in one book). Picador.

Gompertz, W. (2012) What are you looking at? Penguin.

Russell, R. (1972) Bird Lives. Quartet Books.

Greger, M. (2016) How Not to Die. Macmillan.

Coffield, F. & Williamson, B. (2012) From Exam Factories to Communities of Discovery. Bedford Way Papers.


Dr. Gianfranco Conti (with largely unacknowledged help from his side kick Steve Smith) is the brilliant scholar and gifted educator responsible for the M.A.R.S.’ E.A.R.S. method for teaching modern languages. The enormous success of Dr. Conti’s method is due to a winning combination of factors:

  • its crystal clear, lock-step, Do-this-then-do-that-and-don’t-even-think-of-doing-anything-else methodology;
  • its catchy, easy to follow range of classroom activities, including Drill and Kill, Stultifying Sentence Stealer, Disappearing Time, Wake Me Up When It’s Over, Three Blind Mimes, and It’s All Nonsense;
  • the relentless promotion of both Dr. Conti himself and his methodology on platforms including YouTube, Facebook, and websites like The Language Gym.

Dr. Conti’s websites give tons (sic) of extraordinarily detailed information about every single aspect of his sparkling career, with special attention paid to his formidable combination of academic prowess and pedagogic acumen. Thousands of teachers applaud his work; stories abound of fans waiting patiently outside his house, often in pouring rain, hoping to catch a glimpse of their hero as he returns home after a gruelling day’s work at the chalk face. If all the “Thank you, Dr. Conti” testimonials Dr. Conti has received over the years were laid end to end, they would doubtless circumnavigate his wondrous head more than once.

Dr. Conti discusses his MARS-EARS sequence for implementing his Endlessly Repeated Instruction (ERI) method of teaching modern foreign languages (MFLs) in a number of posts, one of which is Patterns First. While the method remains faithful to the time-honoured tradition in mainstream language education of ignoring the tedious distinction made by academics between declarative and procedural knowledge, it stands head and shoulders above the usual PPP methodology by devoting no time at all to students creatively using the L2 for their own chosen communicative purposes. In a typical school term using Dr. Conti’s method, no student ever gets a speaking turn lasting for more than twenty seconds, except right at the end of the course, in the “Spontaneity” phase, where a single student was once recorded speaking, with only occasional interruptions from the teacher, for over one minute.

Thus, Dr. Conti ensures that “the kids”, as he lovingly calls them, having worked their way through masses of sentence building, drills (expansion work to push output) and carefully guided, focused production of target items, are well-prepared for their end of term exam which tests what they know about the bits of language they have been so thoroughly taught. They know what this sentence means:

The pen of my aunt is on the desk of my uncle.

They know how to pronounce each word. They know what’s wrong with the sentence

The pens of my aunt is on the desk of my uncle.

and with a bit of luck and enough time they know how to compose (“build” as Dr. Conti would say) this sentence

The wheel of my car is on the foot of my screaming daughter.

What they don’t know, of course, is how to use the L2 fluently in order to take part in spontaneous, real-time, communicative exchanges with other people about things that matter to them. But, as Dr. Conti likes to say, “Less in more”.

Luckily for us, Dr. Conti has recently recorded “an impromptu, unplanned, unscripted summary of the EPI philosophy and principles”.

This tour de force is worth playing over and over again. Among the many remarkable features of the talk, Dr. Conti’s ability to continually recover from contradictions in what he’s saying with all the aplomb of a truly professional salesperson, and his powerful use of the fingers of his two hands, deserve special praise. Note how, in his “Count The Ways” (TM) routine, while the left hand serves to indicate the number of elements involved in the current topic being discussed, the thumb and index finger of his right hand are used to show which among the elements he is focusing on.

Dr. Conti begins by offering a daring interpretation of emergentism. Majestically sweeping aside the finer points of Nick Ellis’ work on emergentism, Dr. Conti suggests that “basically” (a key term in Dr. Conti’s oeuvre) what Ellis is saying is that language learning consists of getting bombarded with lexical chunks that are all basically the same. Every single situation the learner finds themselves in is like “an attentional frame to a specific number of chunks”, and what Nick Ellis says (“borne out by research, by science”) is that by being bombarded with these chunks that are all basically the same, “a phenomenon called priming happens whereby you are basically primed by this exposure ….. to then at some stage produce them”.

OK? Got that? If you’ve read Nick Ellis’ stuff, you might not recognise Dr. Conti’s description, but rest assured, he’s got a PhD in SLA, so he must be right. Now here comes the truly original twist, the bit that really seals the authority of the maestro.

Having stripped emergentism to the bone, Conti goes on to combine its raw principles with the principles of skills acquisition theory! I mean, how audacious can you get! It goes like this. First you get primed (the basic principle of emergentism), and then, and I quote: the important theory which kicks in after that is that then you’re going to start producing those chunks and you’re going to become fluent through trial and error, through feedback and a lot of practice. So when you have the two theories combined, you have a powerful synergy.

What most mundane scholars see as two completely contradictory theories, Dr. Conti sees as synergy! After a bit of a detour, Dr. Conti returns to these two theories which provide the principles for his method. He repeats that emergentism (usage based theory) gets you primed through massive exposure, and then skills theory gets you to practice so that you reach automaticity. “These two theories are the main tenets of my approach”.

Wow! Isn’t that amazing? As you know, I’m sure, the basic tenet of skills based theory is that learning begins with declarative knowledge, which can then be turned into procedural knowledge through practice. The usual way to describe skills based theory is to say that when you start learning something, you do so through largely explicit processes, after which, through practice and exposure, you move into implicit processes. So you go from declarative knowledge to procedural knowledge and the automatisation this brings. Declarative knowledge involves explicit learning or processes; learners obtain rules explicitly and have some type of conscious awareness of those rules. Automatisation comes next: learners proceduralise their explicit knowledge, and through suitable practice and use, the behaviour becomes automatic.

But Dr. Conti is, of course, aware of the weaknesses of this theory.

  1. First, the lack of an operational definition undermines the various versions of skill acquisition theory that Conti has referred to: there is no agreed operational definition for the constructs “skill”, “practice”, or “automatization”. Partly as a result, but also because of methodological issues (see, for example, DeKeyser, 2007), the theory is under-researched; there is almost no empirical support for it.
  2. Second, millions of people who have emigrated to an English speaking country have learned English without any declarative or metalinguistic knowledge.
  3. Third, skill acquisition theory is in the “strong-interface” camp with regard to the vexed issue of the roles of explicit and implicit learning in SLA. It holds that explicit knowledge is transformed into implicit knowledge through the process of automatization as a result of practice. Many, including perhaps most famously Krashen, dispute this claim, and many more point to the fact that the theory does not take into account the role played by affective factors in the process of learning.  Practice, after all, does not always make perfect.
  4. Fourth, the practice emphasized in this theory is effective only for learning similar tasks: it doesn’t transfer to dissimilar tasks. Therefore, many claim that the theory disregards the role that creative thinking and behaviour play in SLA.
  5. Fifth, to suggest that the acquisition of all L2 features starts with declarative knowledge is to ignore the fact that a great deal of vocabulary and grammar acquisition in an L2 involves incidental learning where no declarative stage is involved.
  6. Sixth, and perhaps most importantly, skill acquisition theory fails to deal with the sequences of acquisition which have been the subject of hundreds of studies in the last 50 years, all of them supporting the construct of interlanguages.

How to deal with these weaknesses? Only Dr. Conti could hit on bringing emergentist theories to the rescue. Without for a moment revising his method, which so obviously relies almost completely on the explicit teaching of pre-selected items of the L2, Dr. Conti says that priming gets learners ready for the practice bits of skills based pedagogy! So what Dr. Conti has done – as Marx did to a more modest degree with Hegel – is to stand two theories on their heads in such a way that his EPI method rests magically on the principles of two contradictory theories, and the limitations of both theories are surmounted. Needless to say, such is the audacity of this dialectic leap that nobody in the emergentist or skills based theory camps agrees with it. Nick Ellis, for whom language learning is an essentially implicit process, would not easily recognise Dr. Conti’s account of emergentism, and he most certainly would not endorse the MARS EARS sequence. On the other hand, neither Anderson nor DeKeyser would have any truck with Dr. Conti’s seemingly incoherent account of skills based theory. It’s hard to exaggerate the originality of Dr. Conti’s account as outlined in this truly fascinating off the cuff lecture.

Syllabus Design

We must accept that even the genius that is Dr. Conti has his weak spots, and I think his approach to syllabus design (which he refers to under the broader umbrella of curriculum design) needs some attention.

In his post “The seed-planting technique ..”, Dr. Conti says:

effective teaching and learning cannot happen without effective curriculum design…… A well-designed language curriculum plans out effectively when, where and how each seed should be sown and the frequency and manner of its recycling with one objective in mind : that by the end of the academic year the course’s core language items are comprehended/produced effectively across all four language skills under real life conditions.

This amounts to what Breen (1987) calls a “Product” syllabus, what White (1988) calls a “Type A” syllabus and what Long (2011 and 2015) calls a “Synthetic” syllabus. The key characteristic of Conti’s “effective curriculum” is that, like all synthetic syllabuses, it concentrates on WHAT is to be learned. Dr. Conti’s syllabus specifies the content – he recommends concentrating on lexical chunks that can be used in the expression of communicative functions – “The Majestic 12” as he calls them. This content is presented and practiced in a pre-determined order, in such a way that planting “seeds” precedes the scheduled main presentation and subsequent recycling. Despite all Dr. Conti’s brilliant intellectual gymnastics, his syllabus assumes, first, that declarative knowledge is a necessary precursor to procedural knowledge, and second, that learners learn what teachers teach them, an assumption undermined by all the evidence from interlanguage studies. We know that learners, not teachers, have most control over their language development. As Long (2011) says:

Students do not – in fact, cannot – learn (as opposed to learn about) target forms and structures on demand, when and how a teacher or a coursebook decree that they should, but only when they are developmentally ready to do so. Instruction can facilitate development, but needs to be provided with respect for, and in harmony with, the learner’s powerful cognitive contribution to the acquisition process.

Even when presented with, and drilled in, target-language forms and structures, even when errors are routinely corrected, and even when the bits and pieces are “seeded” and recycled in various ways, learners’ acquisition of newly-presented forms and structures is rarely either categorical or complete, and it is thus futile to plan the curriculum of an academic year on the assumption that the course’s “core language items” will be “comprehended/produced effectively” by the end of the year. Acquisition of grammatical structures and sub-systems like negation or relative clause formation is typically gradual, incremental and slow, sometimes taking years to accomplish. Development of the L2 exhibits plateaus, occasional movement away from, not toward, the L2, and  U-shaped or zigzag trajectories rather than smooth, linear contours. No matter what the order or manner in which target-language structures and vocabulary are presented to them by teachers, learners analyze the input and come up with their own interim grammars, the product broadly conforming to developmental sequences observed in naturalistic settings. They master the structures in roughly the same manner and order whether learning in classrooms, on the street, or both. This led Pienemann to formulate his learnability hypothesis and teachability hypothesis: what is processable by students at any time determines what is learnable, and, thereby, what is teachable (Pienemann, 1984, 1989).


I hope you rushed quickly through the bit about syllabus design, and that your “take away” will be simple: such is Dr. Conti’s genius that, like Lewis Carroll’s Humpty Dumpty, he can say what he likes.

‘When I use a word,’ Humpty Dumpty said, in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’

‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’

‘The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.’ (Carroll, 2009).

There can, surely, be no question about who IS the master!


Breen, M. (1987) Learner contributions to task design. In C. Candlin and D. Murphy (eds.), Language Learning Tasks. Englewood Cliffs, N.J.: Prentice Hall. 23-46.

Carroll, L. (2009). Alice through the looking glass. Penguin.

DeKeyser, R. (2007) Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second language acquisition: An introduction (pp. 97-113). New Jersey: Lawrence Erlbaum Associates, Inc.

Long, M. (2011) “Language Teaching”. In Doughty, C. and Long, M. Handbook of Language Teaching. NY Routledge.

Long, M. (2015) SLA and TBLT. N.Y., Routledge.

Pienemann, M. (1984). Psychological constraints on the teachability of languages. Studies in Second Language Acquisition 6, 2, 186-214.

Pienemann, M. (1989). Is language teachable? Psycholinguistic experiments and hypotheses. Applied Linguistics 10, 1, 52-79.

White, R.V. (1988) The ELT Curriculum, Design, Innovation and Management.  Oxford: Basil Blackwell.