Jackendoff’s Representational Modularity Theory (Jackendoff, 1992) is a key component in Susanne Carroll’s Autonomous Induction Theory, as described in her book Input and Evidence (2001). Carroll’s book is too often neglected in the SLA literature, and I think that’s partly because it’s very demanding. Carroll goes into so much depth about how we learn; she covers so much ground in so much methodical detail; she’s so careful, so thorough, so aware of the complexities, that if you start reading her book without some previous understanding of linguistics, the philosophy of mind, and the history of SLA theories, you’ll find it very tough going. Even with some understanding of these matters, I myself find the book extremely challenging. Furthermore, the text is dense and often, in my opinion, over-elaborate; you have to be prepared to read the text slowly, and at the same time keep on reading while not at all sure where the argument’s going, in order to “get” what she’s saying.
One criterion for judging theories of SLA is an appeal to Occam’s Razor: ceteris paribus (all other things being equal), the theory with the simplest formulae, and the fewest basic types of entity postulated, is to be preferred for reasons of economy. Carroll’s theory scores badly here: it’s complicated! Her use of Jackendoff’s theory, and of the Induction Theory of Holland et al., means that her theory of SLA relies on a variety of formulae and entities, and thus it’s not “economical”. On the other hand, it’s one of the most complete theories of SLA on offer.
Over the years, I’ve spent weeks reading Carroll’s Input and Evidence, and now, while reading it yet again in “lockdown”, I’m only just starting to feel comfortable turning from one page to the next. But it’s worth it: it’s a classic; one of the best books on SLA ever, IMHO, and I hope to persuade you of its worth in what follows. I’m going to present The Autonomous Induction Theory (AIT) in an exploratory way, bit by bit, and I hope we’ll end up, eventually, with some clear account of AIT and what it has to say about second language learning, and its implications for teaching.
To the issues, then.
In the current debate between Chomsky’s UG theory and more recent Usage-based (UB) theories of language and language learning, most of those engaged in the debate see the two theories as mutually contradictory: one is right and the other is wrong. One says language is an abstract system of form-meaning mappings governed by a grammar (in Chomsky’s case a deep grammar common to all natural languages as described in the Principles and Parameters version of UG), and this knowledge is learned with the help of innate properties of the mind. The other says language should be described in terms of its communicative function; as Saussure put it “linguistic signs arise from the dynamic interactions of thought and sound – from patterns of usage”. The signs are form-meaning mappings; we amass a huge collection of them through usage; and we process them by using relatively simple, probabilistic algorithms based on frequency.
O’Grady (2005) has this to say:
The dispute over the nature of the acquisition device is really part of a much deeper disagreement over the nature of language itself. On the one hand, there are linguists who see language as a highly complex formal system that is best described by abstract rules that have no counterparts in other areas of cognition. (The requirement that sentences have a binary branching syntactic structure is one example of such a “rule.”) Not surprisingly, there is a strong tendency for these researchers to favor the view that the acquisition device is designed specifically for language. On the other hand, there are many linguists who think that language has to be understood in terms of its communicative function. According to these researchers, strategies that facilitate communication – not abstract formal rules – determine how language works. Because communication involves many different types of considerations (new versus old information, point of view, the status of speaker and addressee, the situation), this perspective tends to be associated with a bias toward a multipurpose acquisition device.
Susanne Carroll tries to take both views into account.
Carroll agrees with Gregg (1993) that any theory of SLA has to consist of two parts:
1) a property theory which describes WHAT is learned,
2) a transition theory which explains HOW that knowledge is learned.
As regards the property theory, it’s a theory of knowledge of language, describing the mental representations that make up a learner’s grammar – which consists of various classifications of all the components of language and how they work together. What is it that is represented in the learner’s knowledge of the L2? Chomsky’s UG theory is an example; Construction Grammar is another; the Competition Model of Bates & MacWhinney (1989, cited in Carroll, 2001) is another; while general knowledge representations, forms of rules of discourse, Gricean maxims, etc. are, I suppose, also candidates.
Transition theories of SLA explain how these knowledge states change over time. The changes in the learner’s knowledge, generally seen as progress towards a more complete knowledge of the target language, need to be explained by appeal to a causal mechanism by which one knowledge state develops into another.
Many of the most influential cognitive processing theories of SLA (Chaudron, 1985; Krashen, 1982; Sharwood Smith, 1986; Gass, 1997; Towell & Hawkins, 1994, all cited in Carroll, 2001) concentrate on a transition theory. They explain the process of L2 learning in terms of the development of interlanguages, while largely ignoring the property theory, which they sometimes, and usually vaguely, assume is dealt with by UG. New UB theories (e.g. Ellis, 2019; Tomasello, 2003) reject Chomsky’s UG property theory and rely on what Chomsky regards as performance data for a description of the language in terms of a Construction Grammar. More importantly, perhaps, their ‘transition theory’ makes a minimal appeal to the workings of the mind; they’re at pains to use quite simple general learning mechanisms to show how “associative” learning, acting on input from the environment, accounts for language learning.
Carroll bases her approach on the view that humans have a unique, innate capacity for language, and that language learning goes on in a modular mind. Here, I’ll leave discussions about the philosophy of mind to one side, but suffice it to say for now that ‘mind’ is a theoretical construct referring to a human being’s world of thought, feeling, attitude, belief and imagination. When we talk about the mind, we’re not talking about a physical part of the body (the brain), and when we talk about a modular mind, we’re not talking about well-located, separate parts of the brain.
Carroll rejects Fodor’s (1983) claim that the language faculty comprises a single language module in the mind’s architecture, and she sees Chomsky’s LAD as an inadequate description of the language faculty. Rather than accept that language learning is crucially explained by the workings of a “black box”, Carroll explores the mechanisms of mind more closely, and, following Jackendoff, suggests that the language faculty operates at different levels, and is made up of a chain of mental representations, with the lowest level interacting with physical stimuli, and the highest level interacting with conceptual representations. Processing goes on at each level of representation, and a detailed description of these representations explains how input is processed for parsing.
Carroll further distinguishes between processing for parsing and processing for learning, such that, in speech, for example, when the parsers fail to get the message, the learning mechanisms take over. Successful parsing means that the processors currently at the learner’s disposal are able to use existing rules which categorize and combine representations to understand the speech signal. When the rules are inadequate or missing, parsing breaks down; and in order to deal with this breakdown, the known rule that helps most in parsing the problematic item of input is selected and subsequently adapted or refined until parsing succeeds at that given level. As Sun (2008) summarises: “This procedure explains the process of acquisition, where the exact trigger for acquisition is parsing failure resulting from incomprehensible input”.
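For readers who find a toy model helpful, here’s a minimal sketch of that “parse first, learn only on failure” loop. Everything in it is my own invention for illustration: the `Rule` class, the word lists, and especially the “helpfulness” score (shared word endings), which is just a stand-in for whatever really makes one existing rule the best candidate for adaptation. It is not Carroll’s formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Toy categorisation rule: the items it can currently assign to a category."""
    category: str
    covers: frozenset


def parse(items, rules):
    """Try to categorise every item. Return (parsed_so_far, failed_item);
    failed_item is None when the whole input parses."""
    parsed = []
    for item in items:
        match = next((r for r in rules if item in r.covers), None)
        if match is None:
            return parsed, item  # parsing breaks down here
        parsed.append((item, match.category))
    return parsed, None


def shared_suffix(a, b):
    """Hypothetical helpfulness measure: length of the common word ending."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def process(items, rules):
    """Parse; on failure, pick the most helpful existing rule, adapt it so it
    covers the problematic item, and try again until parsing succeeds.
    Assumes at least one rule exists and every rule covers at least one item."""
    rules = list(rules)
    while True:
        parsed, failed = parse(items, rules)
        if failed is None:
            return parsed, rules
        best = max(rules, key=lambda r: max(shared_suffix(failed, w) for w in r.covers))
        rules[rules.index(best)] = Rule(best.category, best.covers | {failed})


rules = [
    Rule("NOUN", frozenset({"dog", "cat"})),
    Rule("VERB", frozenset({"walked", "jumped"})),
]
parsed, updated_rules = process(["dog", "barked"], rules)
```

With the original rules, “barked” can’t be parsed; the VERB rule is selected because “barked” shares its ending with “walked”, and it is widened until the parse goes through. The point of the sketch is only the control flow: no learning happens while parsing succeeds.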
Scholars from Krashen to Gass take ‘input’ and ‘intake’ as the first two necessary steps in the SLA process (Gass’s model suggests that input passes through the stages of “apperceived” and “comprehended” input before becoming ‘intake’), and ‘intake’ is regarded as the set of processed structures waiting to be incorporated into interlanguage grammar. The widely accepted view that in order for input to become intake it has to be ‘noticed’, as described by Schmidt in his influential 1990 paper, has since, as the result of criticism (see, for example, Truscott, 1998), been seriously modified so that it now approximates to Gass’s ‘apperception’ (see Schmidt, 2001, 2010), but it’s still widely seen as an important part of the SLA process.
Carroll, on the other hand, sees input as physical stimuli, and intake as a subset of these stimuli.
The view that input is comprehended speech is mistaken. Comprehending speech … happens as a consequence of a successful parse of the speech signal. Before one can successfully parse the L2, one must learn its grammatical properties. Krashen got it backwards! (Carroll, 2001, p. 78).
Referring not just to Krashen, but to all those who use the constructs ‘input’, ‘intake’ and ‘noticing’, Gregg (in a comment on one of my blog posts) makes the obvious, but crucial point: “You can’t notice grammar”! Grammar consists of things like nouns and verbs, which are, quite simply, not empirically observable things existing “out there” in the environment, waiting for alert, on-their-toes learners to notice them.
So, says Carroll, language learning requires the transformation of environmental stimuli into mental representations, and it’s these mental representations which must be the starting point for language learning. In order to understand speech, for example, properties of the acoustic signal have to be converted to intake; in other words, the auditory stimulus has to be converted into a mental representation. “Intake from the speech signal is not input to learning mechanisms, rather it is input to speech parsers. … Parsers encode the signal in various representational formats” (Carroll, 2001, p. 10).
We now need to look at Jackendoff’s (1992) Representational Modularity. Jackendoff presents a theory of mind which contrasts with Fodor’s modular theory (where the language faculty constitutes a single module which processes already formed linguistic representations) by proposing that particular types of representation are sets belonging to different modules. The language faculty has several autonomous representational systems, and information flows in limited ways from a conceptual system into the grammar via correspondence rules which connect the autonomous representational systems (Carroll, 2001, p. 121).
Jackendoff’s model has various cognitive faculties, each associated with a chain of levels of representation. The stimuli are the “lowest” level of representation, and “conceptual structures” are the “highest”. The chains intersect at various points allowing information encoded in one chain to influence the information encoded in another. This amounts to Jackendoff’s hypothesis of levels.
Here’s a partial model
Jackendoff proposes that, in regard to language learning, the mind has three representational modules: phonology, syntax, and semantics, and that it also has interface modules which, by defining correspondence rules between representational formats, allow them to pass information along from the lowest to the highest level. This is important for Carroll, because, as we’ll see, the different modules are autonomous and so there must be a translation processor for each set of correspondence rules linking one autonomous representation type to another.
What Carroll wants from Jackendoff is “a clear picture of the functional architecture of the mind” (Carroll, 2001, p. 126), on which to build her induction model. In Part 2, I’ll deal with the Induction bit, but we must finish Part 1 by looking at other parts of Jackendoff’s work.
In The Architecture of the Language Faculty, Jackendoff argues for the central part played in language by the lexicon. The lexicon is not part of one of his representational modules, but rather the central component of the interface between them. Lexical items include phonological, syntactic, and semantic content, and thus any lexical item is a set of three structures linked by correspondence rules. Furthermore, since lexical items are part of this general interface, there is no need to restrict them to word-sized elements – they can be affixes, single words, compound words, or even whole constructions, including MWUs, idioms, and so on. As Stevenson (1997) says: “Simply put, the claim is that what we call the lexicon is not a distinct entity but rather a subset of the interface relations between the three grammatical subsystems. … Jackendoff’s proposal thus has the potential to provide a uniform characterization of morphological, lexical, and phrase-level knowledge and processes, within a highly lexicalized framework.”
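If it helps to see the “set of three structures” idea written down, here’s a rough sketch. The class name, the field labels, and the three entries are all my own made-up placeholders (real phonological, syntactic, and semantic structures are far richer than strings); the only point is that an affix, a word, and an idiom can all have exactly the same shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalItem:
    """Toy lexical item: three linked structures, one per representational module."""
    phonology: str  # placeholder for a phonological structure
    syntax: str     # placeholder for a syntactic structure
    semantics: str  # placeholder for a conceptual structure


# Invented entries: an affix, a single word, and an idiom share one format.
LEXICON = [
    LexicalItem("-ed", "V-suffix", "PAST"),
    LexicalItem("dog", "N", "DOG"),
    LexicalItem("kick the bucket", "VP", "DIE"),
]


def activate(phonological_form, lexicon):
    """Look an item up by its phonology; its syntax and semantics come with it."""
    return [
        (item.syntax, item.semantics)
        for item in lexicon
        if item.phonology == phonological_form
    ]


idiom_structures = activate("kick the bucket", LEXICON)
```

Nothing here restricts an entry to word size, which is the feature of Jackendoff’s proposal that matters most for what follows.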
To bring this home, I offer two presentations by Jackendoff. In the first presentation, Jackendoff argues that lexis only – “linear grammar” – paved the way for modern languages. It’s eloquent, to say the very least.
The main argument is, of course, the importance of the lexicon, but I think this diagram is particularly interesting.
Never mind the details; just note that comprehending starts with perceptual stimuli and goes through various levels of representation from lowest to highest, while speaking starts with responding to stimuli actively and goes in the opposite direction.
In the second presentation, Jackendoff talks about mentalism and formalism. Please skip to Minute 49.
In this presentation Jackendoff argues that we should abandon the assumption made by generative grammar that lexicon and grammar are fundamentally different kinds of mental representations. If the lexicon gets progressively more and more rule-like, and you erase the line between words and rules, then you slide down a slippery slope which ends up with HPSG (Head-driven phrase structure grammar), Cognitive Grammar, and Construction Grammar, which, he says, is “not so bad”.
So, we may well ask, is Jackendoff a convert to UB theories? How can he be, if he bases his theory of Representational Modularity on the assumption of our possession of a modular mind? How can all this ‘mental representation’ stuff be reconciled with an empiricist view like N. Ellis’ which wants to explain language learning almost exclusively in terms of input from the environment? Part of the answer is, surely, that UB theory has a lot more mental stuff going on than it cares to recognise, but, in any case, I hope we can explore this further in Part 2, and I’d be very pleased if it leads to a lively discussion.
To summarise then, Jackendoff (2000) replaces Chomsky’s generative grammar with the view that syntax is only one of several generative components. Lexical items are not, pace Chomsky, inserted into initial syntactic derivations, and then interpreted through processes of derivations, but rather, speech signals are processed by the auditory-to-phonology interface module to create a phonological representation. After that, the phonology-to-syntax interface creates a syntactic structure, which is then, aided by the syntax-to-semantics interface module, converted into a propositional structure, i.e. meaning. Which is why, when a lexical item becomes activated, it not only activates its phonology, but it also activates its syntax and semantics and thus “establishes partial structures in those domains” (Jackendoff, 2000: 25). The same but reversed process takes place in language production.
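The chain of interface modules in that summary can be caricatured in a few lines of code. This is a toy sketch under heavy assumptions: the function names, the two-word lexicon, and the idea that each representation is a simple list are all mine, not Jackendoff’s. What it does show is the architecture: each interface only translates between two adjacent formats, and production runs the chain in the opposite direction.

```python
# Toy lexicon: phonological form -> (syntactic category, concept). Invented entries.
LEXICON = {"dogs": ("N", "DOG"), "bark": ("V", "BARK")}
CONCEPT_TO_FORM = {concept: form for form, (_, concept) in LEXICON.items()}


# Comprehension interfaces, lowest level to highest.
def auditory_to_phonology(signal):
    """Stand-in for speech perception: a string is cut into phonological forms."""
    return signal.lower().split()


def phonology_to_syntax(forms):
    """Pair each form with its category. An unknown form raises KeyError,
    i.e. the parse fails at this level."""
    return [(form, LEXICON[form][0]) for form in forms]


def syntax_to_semantics(tagged):
    """Map the syntactic structure onto concepts."""
    return [LEXICON[form][1] for form, _ in tagged]


# Production interfaces, highest level to lowest.
def semantics_to_syntax(concepts):
    return [(CONCEPT_TO_FORM[c], LEXICON[CONCEPT_TO_FORM[c]][0]) for c in concepts]


def syntax_to_phonology(tagged):
    return [form for form, _ in tagged]


def phonology_to_articulation(forms):
    return " ".join(forms)


def run_chain(representation, chain):
    """Pass a representation through each interface in turn."""
    for interface in chain:
        representation = interface(representation)
    return representation


COMPREHENSION = [auditory_to_phonology, phonology_to_syntax, syntax_to_semantics]
PRODUCTION = [semantics_to_syntax, syntax_to_phonology, phonology_to_articulation]

meaning = run_chain("Dogs bark", COMPREHENSION)
utterance = run_chain(meaning, PRODUCTION)
```

Note that no single function sees both the raw signal and the meaning; each module’s format stays autonomous, and only the interfaces connect them.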
What does Susanne Carroll make of it all? Can you make do with Netflix till the next exciting episode comes along?
Well, well. I hope you find this half as interesting as I do. Onward through the fog.
Carroll, S. (2001) Input and Evidence. Amsterdam: Benjamins.
Ellis, N. C. (2019). Essentials of a theory of language cognition. Modern Language Journal, 103.
Fodor, J. (1983) The Modularity of Mind. Cambridge, MA: MIT Press.
Gregg, K. R. (1993) Taking explanation seriously. Applied Linguistics, 14, 3.
Jackendoff, R. S. (1992) Languages of the Mind. Cambridge, MA: MIT Press.
O’Grady, W. (2005) How Children Learn Language. Cambridge, UK: Cambridge University Press.
Schmidt, R. (1990) The role of consciousness in second language learning. Applied Linguistics, 11, 129–58.
Schmidt, R. (2001) Attention. In P. Robinson (Ed.), Cognition and second language instruction (pp.3-32). Cambridge University Press.
Schmidt, R. (2010) Attention, awareness, and individual differences in language learning. In W. M. Chan, S. Chi, K. N. Cin, J. Istanto, M. Nagami, J.W. Sew, T. Suthiwan, & I. Walker, Proceedings of CLaSIC 2010, Singapore, December 2-4 (pp. 721-737). Singapore: National University of Singapore, Centre for Language Studies.
Stevenson, S. (1997) A Review of The Architecture of the Language Faculty. Computational Linguistics, 24, 4.
Sun, Y.A. (2008) Input Processing in Second Language Acquisition: A Discussion of Four Input Processing Models. Working Papers in TESOL & Applied Linguistics, Vol. 8, No. 1.
Tomasello, M. (2003). Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
Truscott, J. (1998) Noticing in second language acquisition: a critical review. Second Language Research, 14 (2), 103–135.