Emergentism is an umbrella term referring to a fast-growing range of usage-based theories of SLA which adopt “connectionist” and associative learning views, based on the premise that language emerges from communicative use. Many proponents of emergentism, not least the imaginative Larsen-Freeman, like to begin by pointing to the omnipresence of complex systems which emerge from the interaction of simple entities, forces and events. Examples are:
The chemical combination of two substances produces, as is well known, a third substance with properties different from those of either of the two substances separately, or both of them taken together. Not a trace of the properties of hydrogen or oxygen is observable in those of their compound, water. (Mill 1842, cited in O’Grady, 2021).
Bee hives, with their carefully arranged rows of perfect hexagons, far from providing evidence of geometrical ability in bees, actually provide evidence for emergence – The hexagonal shape maximizes the packing of the hive space and the volume of each cell and offers the most economical use of the wax resource… The bee doesn’t need to “know” anything about hexagons. (Elman, Bates, Johnson, Karmiloff-Smith, Parisi & Plunkett, 1996, cited in O’Grady, 2021).
Larsen-Freeman’s own favorite is a murmuration of starlings, as in the photo, above. In her plenary at the IATEFL 2016 conference, the eminent scholar seemed almost to float away herself, up into the rafters of the great hall, as she explained:
Instead of thinking about reifying and classifying and reducing, let’s turn to the concept of emergence – a central theme in complexity theory. Emergence is the idea that in a complex system different components interact and give rise to another pattern at another level of complexity.
A flock of birds part when approached by a predator and then they re-group. A new level of complexity arises, emerges, out of the interaction of the parts.
All birds take off and land together. They stay together as a kind of superorganism. They take off, they separate, they land, as if one.
You see how that pattern emerges from the interaction of the parts?
Personally, I fail to grasp the force of this putative supporting evidence for emergentism, which strikes me as unconvincing, not to say ridiculous. I find the associated claim that complex systems exhibit ‘higher-level’ properties which are neither explainable nor predictable from ‘lower-level’ physical properties, but which nevertheless have causal and hence explanatory efficacy, slightly less ridiculous, but still unconvincing, and surely hard to square with empiricist principles. So, moving quickly on, let’s look at emergentist theories of language learning. Note that the discussion is mostly of Nick Ellis’ theory of emergentism, which he applies to SLA.
What Any Theory of SLA Must Explain
Kevin Gregg (1993, 1996, 2000, 2003) insists that any theory of SLA should do two things: (1) describe what knowledge is acquired (a property theory describing what language consists of and how it’s organised), and (2) explain how that knowledge is acquired (a causal transition theory). Chomsky’s principles and parameters theory offers a very technical description of “Universal Grammar”, consisting of clear descriptions of grammar principles which make up the basic grammar of all natural languages, and the parameters which apply to particular languages. It describes what Chomsky calls “linguistic competence” and it has served as a fruitful property theory guiding research for more than 50 years. How is this knowledge acquired? Chomsky’s answer is contained in a transition theory that appeals to an innate representational system located in a module of the mind devoted to language, and to innate mechanisms which use that system to parse input from the environment, set parameters, and learn how the particular language works.
But UG has come under increasing criticism. Critics suggest that UG principles are too abstract, that Chomsky has more than once moved the goal posts, that the “Language Acquisition Device” is a biologically implausible “black box”, that the domain is too narrow, and that we now have better ways to explain the phenomena that UG theory tackles. Increasingly, emergentist theories are regarded as providing better explanations.
There is quite a collection of emergentist theories, but we can distinguish between emergentists who rely on associative learning, and those who believe that “achieving the explanatory goals of linguistics will require reference to more than just transitional probabilities” (O’Grady, 2008, p. 456). In this first post, I’ll concentrate on the first group, and refer mostly to the work of its leading figure, Nick Ellis. The reliance on associative learning leads to this group often being referred to as “empiricist emergentists”.
Empiricist emergentists insist that language learning can be satisfactorily explained by appeal to the rich input in the environment and simple learning processes based on frequency, without having to resort to abstract representations and an unobservable “Language Acquisition Device” in the mind.
Regarding the question of what knowledge is acquired, the emergentist case is summarised by Ellis & Wulff (2020, pp. 64-65):
The basic units of language representation are constructions. Constructions are pairings of form and meaning or function. Words like squirrel are constructions: a form — that is, a particular sequence of letters or sounds — is conventionally associated with a meaning (in the case of squirrel, something like “agile, bushy-tailed, tree-dwelling rodent that feeds on nuts and seeds”).
In Construction Grammar, constructions are wide-ranging. Morphemes, idiomatic expressions, and even abstract syntactic frames are constructions:
sentences like Nick gave the squirrel a nut, Steffi gave Nick a hug, or Bill baked Jessica a cake all have a particular form (Subject-Verb-Object-Object) that, regardless of the specific words that realize its form, share at least one stable aspect of meaning: something is being transferred (nuts, hugs, and cakes).
Furthermore, some constructions have no meaning – they serve more functional purposes;
passive constructions, for example, serve to shift what is in attentional focus by defocusing the agent of the action (compare an active sentence such as Bill baked Jessica a cake with its passive counterpart A cake was baked for Jessica).
Constructions can be simultaneously represented and stored in multiple forms and at various levels of abstraction (table + s = tables; [Noun] + (morpheme -s) = “plural things”). Ultimately, constructions blur the traditional distinction between lexicon and grammar. A sentence is not viewed as the application of grammatical rules to put a number of words obtained from the lexicon in the right order; a sentence is instead seen as a combination of constructions, some of which are simple and concrete while others are quite complex and abstract. For example, What did Nick give the squirrel? comprises the following constructions:
• Nick, squirrel, give, what, do constructions
• VP, NP constructions
• Subject-Verb-Object-Object construction
• Subject-Auxiliary inversion construction
We can therefore see the language knowledge of an adult as a huge warehouse of constructions.
As to language learning, it is not about learning abstract generalizations, but rather about inducing general associations from a huge collection of memories: specific, remembered linguistic experiences.
“The learner’s brain engages simple learning mechanisms in distributional analyses of the exemplars of a given form-meaning pair that take various characteristics of the exemplar into consideration, including how frequent it is, what kind of words and phrases and larger contexts it occurs with, and so on” (Ellis & Wulff, 2020, p. 66).
The “simple learning mechanisms” amount to associative learning. The constructions are learned through “the associative learning of cue-outcome contingencies” determined by factors relating to the form, the interpretation, the contingency of form and function, and learner attention. Language learning involves “the gradual strengthening of associations between co-occurring elements of the language”, and fluent language performance involves “the exploitation of this probabilistic knowledge” (Ellis, 2002, p. 173). Based on sufficiently frequent cues pairing two elements in the environment, the learner abstracts to a general association between the two elements.
Here’s how it works:
When a learner notices a word in the input for the first time, a memory is formed that binds its features into a unitary representation, such as the phonological sequence /wʌn/ or the orthographic sequence one. Alongside this representation, a so-called detector unit is added to the learner’s perceptual system. The job of the detector unit is to signal the word’s presence whenever its features are present in the input. Every detector unit has a set resting level of activation and some threshold level which, when exceeded, will cause the detector to fire. When the component features are present in the environment, they send activation to the detector that adds to its resting level, increasing it; if this increase is sufficient to bring the level above threshold, the detector fires. With each firing of the detector, the new resting level is slightly higher than the previous one—the detector is primed. This means it will need less activation from the environment in order to reach threshold and fire the next time. Priming events sum to lifespan-practice effects: features that occur frequently acquire chronically high resting levels. Their resting level of activation is heightened by the memory of repeated prior activations. Thus, our pattern-recognition units for higher-frequency words require less evidence from the sensory data before they reach the threshold necessary for firing. The same is true for the strength of the mappings from form to interpretation. Each time /wʌn/ is properly interpreted as one, the strength of this connection is incremented. Each time /wʌn/ signals won, this is tallied too, as are the less frequent occasions when it forewarns of wonderland. Thus, the strengths of form-meaning associations are summed over experience.
The resultant network of associations, a semantic network comprising the structured inventory of a speaker’s knowledge of language, is tuned such that the spread of activation upon hearing the formal cue /wʌn/ reflects prior probabilities of its different interpretations (Ellis & Wulff, 2020, p. 67).
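The mechanism described in this passage can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not Ellis & Wulff’s model: the class name Detector, every numeric value, and the rule that each firing closes a fixed fraction of the gap between resting level and threshold are invented for the example.

```python
from collections import Counter


class Detector:
    """Toy detector unit for one word form (all numbers are illustrative)."""

    def __init__(self, resting=0.2, threshold=1.0, priming=0.1):
        self.resting = resting      # resting level of activation
        self.threshold = threshold  # level that must be exceeded to fire
        self.priming = priming      # assumed fraction of the gap closed per firing
        self.meanings = Counter()   # tallies of form-to-interpretation mappings

    def present(self, activation, meaning=None):
        """Add input activation to the resting level; fire if above threshold."""
        fired = self.resting + activation > self.threshold
        if fired:
            # Each firing leaves the resting level slightly higher (priming).
            self.resting += self.priming * (self.threshold - self.resting)
            if meaning is not None:
                self.meanings[meaning] += 1
        return fired

    def evidence_needed(self):
        """Input activation still required to reach threshold."""
        return max(0.0, self.threshold - self.resting)

    def interpretation_probs(self):
        """Relative frequency of each interpretation tallied so far."""
        total = sum(self.meanings.values())
        return {m: n / total for m, n in self.meanings.items()}


# Hypothetical exposure: the form /wʌn/ heard ten times with three meanings.
one = Detector()
before = one.evidence_needed()
for meaning in ["one"] * 6 + ["won"] * 3 + ["wonderland"]:
    one.present(activation=0.9, meaning=meaning)
after = one.evidence_needed()
probs = one.interpretation_probs()

fresh_fires = Detector().present(activation=0.5)  # weak input, unprimed unit
primed_fires = one.present(activation=0.5)        # same weak input, primed unit
```

Run as written, the primed detector fires on a weak input that leaves a fresh detector silent, and the tallies give relative frequencies of 0.6, 0.3 and 0.1 for the three interpretations. Nothing in the sketch says where the features, the meanings, or the unit itself come from, which is the question the critics discussed later press.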
The authors add that additional factors need to be taken into account, and this one is particularly important:
… the relationship between frequency of usage and activation threshold is not linear but follows a curvilinear “power law of practice” whereby the effects of practice are greatest at early stages of learning, but eventually reach asymptote.
Evidence supporting this type of emergentist theory is said to be provided by computational models of associative learning processes in the form of connectionist networks. One example is Lewis & Elman’s (2001) demonstration that a Simple Recurrent Network (SRN) can, among other things, simulate the acquisition of agreement in English from data similar to the input available to children; the connectionist model reported in Ellis and Schmidt’s 1997 and 1998 papers is another.
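The shape of such a curve is easy to see with made-up numbers. The sketch below uses one common way of writing a power law of practice, T(n) = asymptote + gain * n^(-rate). The function name and all three parameter values are hypothetical, chosen only to show the shape, and the dependent variable here is processing time rather than activation threshold.

```python
def processing_time(n, asymptote=300.0, gain=700.0, rate=0.5):
    """Hypothetical processing time in ms after n encounters (n >= 1)."""
    return asymptote + gain * n ** (-rate)


# Improvement bought by one extra encounter, early versus late in learning.
early_gain = processing_time(1) - processing_time(2)
late_gain = processing_time(100) - processing_time(101)

# Times at increasing amounts of practice approach the asymptote.
times = [processing_time(n) for n in (1, 2, 10, 100, 1000)]
```

With these values the second encounter saves roughly 200 ms, while the 101st saves well under 1 ms: practice pays most at the start and then flattens out towards the asymptote.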
There have been various criticisms of the empiricist version of emergentism as championed by Ellis, and IMHO, the articles by Eubank & Gregg (2002), and Gregg (2003) remain the most acute. I’ll use them as the basis for what follows.
a) Linguistic knowledge
Regarding their description of the linguistic knowledge acquired, Gregg (2003) points out that emergentists have yet to agree on any detailed description of linguistic knowledge, or even on whether such knowledge exists. The doubt about whether or not there’s any such thing as linguistic knowledge is raised by extreme empiricists, such as the logical positivists and behaviourists discussed in my last post, and also the eliminativists involved in connectionist networks, who all insist that the only knowledge we have comes through the senses; representational knowledge of the sort required to explain linguistic competence is outlawed. Ellis and his colleagues don’t share the views of these extremists; they accept that linguistic representations – of some sort or other – are the basis of our language capacity, but they reject any innate representations, and therefore, they need to not just describe the organisation of the representations, but also to explain how the representations are learned from input from the environment.
O’Grady (2011) agrees with Gregg about the lack of consensus among emergentists as to what form linguistic knowledge takes; some talk of local associations and memorized chunks (Ellis 2002), others of a construction grammar (Goldberg 1999, Tomasello 2003), and others of computational routines (O’Grady 2001, 2005). Added to a lack of consensus is a lack of clarity and completeness. O’Grady’s discussion of Lewis & Elman’s (2001) Simple Recurrent Network (SRN), mentioned above, explains how it was able to mimic some aspects of language acquisition in children, including the identification of category-like classes of words, the formation of patterns not observed in the input, retreat from overgeneralizations, and the mastery of subject-verb agreement. However, O’Grady goes on to say that it raises the question of why the particular statistical regularities exploited by the SRN are in the input in the first place.
In other words, why does language have the particular properties that it does? Why, for example, are there languages (such as English) in which verbs agree only with subjects, but no language in which verbs agree only with direct objects?
Networks provide no answer to this sort of question. In fact, if presented with data in which verbs agree with direct objects rather than subjects, an SRN would no doubt “learn” just this sort of pattern, even though it is not found in any known human language.
There is clearly something missing here. Humans don’t just learn language; they shape it. Moreover, these two facts are surely related in some fundamental way, which is why hypotheses about how linguistic systems are acquired need to be embedded within a more comprehensive theory of why those systems (and therefore the input) have the particular properties that they do. There is, simply put, a need for an emergentist theory of grammar. (O’Grady, 2011, p. 4).
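O’Grady’s point about input-driven learners can be made concrete with a toy example. What follows is not an SRN and not Lewis & Elman’s model; it is a much cruder stand-in, a plain co-occurrence counter, and the two miniature “corpora”, the function names, and the labels sg, pl, V and V-s are all invented for the illustration.

```python
from collections import Counter, defaultdict


def train(corpus):
    """Count how often each verb form co-occurs with subject and object number."""
    counts = {"subject": defaultdict(Counter), "object": defaultdict(Counter)}
    for subj_num, verb_form, obj_num in corpus:
        counts["subject"][subj_num][verb_form] += 1
        counts["object"][obj_num][verb_form] += 1
    return counts


def agreement_strength(counts, role):
    """Share of tokens whose verb form is the majority form for that role's number."""
    hits = total = 0
    for forms in counts[role].values():
        hits += max(forms.values())
        total += sum(forms.values())
    return hits / total


numbers = ("sg", "pl")
# English-like input: the verb takes -s only when the SUBJECT is singular.
english = [(s, "V-s" if s == "sg" else "V", o) for s in numbers for o in numbers] * 25
# Unattested pattern: the verb takes -s only when the OBJECT is singular.
unattested = [(s, "V-s" if o == "sg" else "V", o) for s in numbers for o in numbers] * 25

eng = train(english)
odd = train(unattested)
eng_scores = (agreement_strength(eng, "subject"), agreement_strength(eng, "object"))
odd_scores = (agreement_strength(odd, "subject"), agreement_strength(odd, "object"))
```

The counter scores (1.0, 0.5) on the English-like data and (0.5, 1.0) on the unattested pattern. It picks up object agreement exactly as readily as subject agreement, so nothing in the learner itself explains why only one of the two patterns turns up in human languages.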
In conclusion, then, some leading emergentists themselves agree that emergentism has not, so far, offered any satisfactory description of the knowledge of the linguistic system that is required of a property theory. An unfinished construction grammar that is brought to bear on “a huge collection of memories, specific, remembered linguistic experiences”, seems to be as far as they’ve got.
Whatever the limitations of the emergentists’ sketchy account of linguistic knowledge might be, their explanation of the process of language learning (which is, after all, their main focus) seems to have more to recommend it, not least its simplicity. In the case of empiricist emergentists, the explanation relies on associative learning: learners make use of simple cognitive mechanisms to implicitly recognise frequently-occurring associations among elements of language found in the input. To repeat what was said above, the theory states that constructions are learned through the associative learning of cue-outcome contingencies. Associations between co-occurring elements of language found in the input are gradually strengthened by successive encounters, and, based on sufficiently frequent cues pairing these two elements, the learner abstracts to a general association between them. To this simplest of explanations, a few other elements are attached, not least the “power law of practice”. In his 2002 paper on frequency effects in language processing, Ellis cites Kirsner’s (1994) claim that the strong effects of word frequency on the speed and accuracy of lexical recognition are explained by the power law of learning,
which is generally used to describe the relationships between practice and performance in the acquisition of a wide range of cognitive skills. That is, the effects of practice are greatest at early stages of learning, but they eventually reach asymptote. We may not be counting the words as we listen or speak, but each time we process one there is a reduction in processing time that marks this practice increment, and thus the perceptual and motor systems become tuned by the experience of a particular language (Ellis, 2002, p. 152).
Eubank & Gregg (2002, p. 239) suggest that there are many areas of language learning which the emergentist explanation can’t explain. For example:
Ellis aptly points to infants’ ability to do statistical analyses of syllable frequency (Saffran et al., 1996); but of course those infants haven’t learned that ability. What needs to be shown is how infants uniformly manage this task: why they focus on syllable frequency (instead of some other information available in exposure), and how they know what a syllable is in the first place, given crosslinguistic variation. Much the same is true for other areas of linguistic import, e.g. the demonstration by Marcus et al. (1999) that infants can infer rules. And of course work by Crain, Gordon, and others (Crain, 1991; Gordon, 1985) shows early grammatical knowledge, in cases where input frequency could not possibly be appealed to. All of which is to say, for starters, that such claims as that “learners need to have processed sufficient exemplars.” (p.40) are either outright false, or else true only vacuously (if “sufficient” is taken to range from as low a figure as 1).
Eubank & Gregg (2002, p. 240) also question emergentist use of key constructs. For example:
The Competition Model, for instance, relies heavily on the frequency (and reliability) of so-called “cues”. The problem is that it is nowhere explained just what a cue is, or what could be a cue; which is to say that the concept is totally vacuous (Gibson, 1992). In the absence of any principled characterization of the class of possible cues, an explanation of acquisition that appeals to cue-frequency is doomed to arbitrariness and circularity. (The same goes, of course, for such claims as Ellis’s [p.54] that “the real stuff of language acquisition is the slow acquisition of form-function mappings,” in the absence of any criterion for what counts as a possible function and what counts as a possible form.)
In his (2003) article, Gregg has more to say about cues:
The question then arises, What is a cue, that the environment could provide it? Ellis, for example, says, ‘in the input sentence “The boy loves the parrots,” the cues are: preverbal positioning (boy before loves), verb agreement morphology (loves agrees in number with boy rather than parrots), sentence initial positioning and the use of the article the’ (1998: 653). In what sense are these ‘cues’ cues, and in what sense does the environment provide them? What the environment can provide, after all, is only perceptual information, for example, the sounds of the utterance and the order in which they are made. (Emphasis added.) So in order for ‘boy before loves’ to be a cue that subject comes before verb, the learner must already have the concepts subject and verb. But if subject is one of the learner’s concepts, on the emergentist view, he or she must have learned that; the concept subject must ‘emerge from learners’ lifetime analysis of the distributional characteristics of the language input,’ as Ellis (2002a: 144) puts it (Gregg, 2003, p. 120).
Gregg (2003) goes to some length to critique the connectionist model reported in Ellis and Schmidt’s 1997 and 1998 papers. The model was made to investigate “adult acquisition of second language morphology using an artificial second language in which frequency and regularity were factorially combined” (1997, p. 149). The experiment was designed to test “whether human morphological abilities can be understood in terms of associative processes” (1997, p. 145) and to show that “a basic principle of learning, the power law of practice, also generates frequency by regularity interactions” (1998, p. 309). The authors claimed that the network learned both the singular and plural forms for 20 nonce nouns, and also learned the ‘regular’ or ‘default’ plural prefix. In subsequent publications, Ellis claimed that the model gives strong support to the notion that acquisition of morphology is a result of simple associative learning principles and that the power law applies to the acquisition of morphosyntax. Gregg’s (2003) paper does a thorough job of refuting these claims.
Gregg begins by pointing out that connectionism itself is not a theory, but rather a method, “which in principle is neutral as to the kind of theory to which it is applied”. He goes on to point out the severe limitations of the Ellis and Schmidt experiment. In fact, the network didn’t learn the 20 nouns, or the 11 prefixes; it merely learned to associate the nouns with the prefixes (and with the pictures) – it started with the 11 prefixes, and was trained such that only one prefix was reinforced for any given word. Furthermore, the model was slyly given innate knowledge!
Although Ellis accepts that linguistic representations – of some sort or other – are the basis of our language capacity, he rejects the nativist view that the representations are innate, and therefore he needs to explain how the representations are acquired. In the Ellis & Schmidt model, the human subjects were given pictures and sounds to associate, and the network was given analogous input units to associate with output units. But, while the human participants in the experiment were shown two pictures and were left to infer plurality (rather than, say, duality or repetition or some other inappropriate concept), the network was given the concept of plurality free as one of the input nodes (and was given no other concept). (Emphasis added.) Gregg comments that while nativists who adopt a UG view of linguistic knowledge can easily claim that the concept of plurality is innate, Ellis cannot do so, and thus he must explain how the concept of plurality has been acquired, not just make it part of the model’s structure. So, says Gregg, the model is “fudging; innate knowledge has sneaked in the back door, as it were”. Gregg continues:
Not only that, but it seems safe to predict that the human subjects, having learned to associate the picture of an umbrella with the word ‘broil’, would also be able to go on to identify an actual umbrella as a ‘broil’, or a sculpture or a hologram of an umbrella as representations of a ‘broil’. In fact, no subject would infer that ‘broil’ means ‘picture of an umbrella’. And nor would any subject infer that ‘broil’ meant the one specific umbrella represented by the picture. But there is no reason whatever to think that the network can make similar inferences (Gregg, 2003, p. 114).
Emergentism and Instructed SLA
Ellis and others who are developing emergentist theories of SLA stress that, at least for monolingual adults, the process of SLA is significantly affected by the experience of learning one’s native language. Children learn their first language implicitly, through associative learning mechanisms acting on the input from the environment, and any subsequent learning of more languages is similar in this respect. However, monolingual adult L2 learners “suffer” from the successful early learning of their L1, because the success results in implicit input processing mechanisms being set for the L1, and the knock-on effect is that the entrenched L1 processing habits work against them, leading them to apply entrenched habits to an L2 where they do not apply. Ellis argues that the filtering of L2 input to L1-established attractors leads to adult learners failing to acquire certain parts of the L2, which are referred to as its “fragile” features (a term coined by Goldin-Meadow, 1982, 2003). Fragile features are non-salient – they pass unnoticed – and they are identified as being one or more of infrequent, irregular, non-syllabic, string-internal, semantically empty, and communicatively redundant.
Ellis (2017), supported by Long (2015), suggests that teachers should use explicit teaching to facilitate implicit learning, and that the principal aim of explicit teaching should be to help learners modify entrenched automatic L1 processing routines, so as to alter the way subsequent L2 input is processed implicitly. The teacher’s aim should be to help learners to consciously pay attention to a new form, or form–meaning connection, and to hold it in short-term memory long enough for it to be processed, rehearsed, and an initial representation stored in long-term memory. Nick Ellis (2017) calls this “re-setting the dial”: the new, better exemplar alters the way in which subsequent exemplars of the item in the input are handled by the default implicit learning process.
It’s interesting to see what Long (2015, p. 50) says in his major work on SLA and TBLT:
A plausible usage-based account of (L1 and L2) language acquisition (see, e.g., N.C. Ellis 2007a,b, 2008c, 2012; Goldberg & Casenhiser 2008; Robinson & Ellis 2008; Tomasello 2003), with implicit learning playing a major role, begins with initially chunk-learned constructions being acquired during receptive or productive communication, the greater processability of the more frequent ones suggesting a strong role for associative learning from usage. Based on their frequency in the constructions, exemplar-based regularities and prototypical morphological, syntactic, and other patterns – [Noun stem-PL], [Base verb form-Past], [Adj Noun], [Aux Adv Verb], and so on – are then induced and abstracted away from the original chunk-learned cases, forming the basis for attraction, i.e., recognition of the same rule-like patterns in new cases (feed-fed, lead-led, sink-sank-sunk, drink-drank-drunk, etc.), and for creative language use.
In sum, … while incidental and implicit learning remain the dominant, default processes, their reduced power in adults indicates an advantage, and possibly a necessity (still an open question), for facilitating intentional initial perception of new forms and form–meaning connections, with instruction (focus on form) important, among other reasons, for bringing new items to learners’ focal attention. Research may eventually show such “priming” of subsequent implicit processing of those forms in the input to be unnecessary. Even if that turns out to be the case, however, opportunities for intentional and explicit learning are likely to speed up acquisition and so become a legitimate component of a theory of ISLA, where efficiency, not necessity and sufficiency, is the criterion for inclusion.
It should be obvious from the discussion above that I’m persuaded by the criticisms of Eubank, Gregg, O’Grady (and many others!) to reject empiricist emergentism as a theory of SLA, and I confess to having felt surprised when I first read the quotation above. Never mind. What I think is interesting is that a different explanation of SLA – one which allows for innate knowledge, a “bootstrapping” view of the process of acquisition, and interlanguage development – has some important things in common with emergentism, which can be incorporated into a theory of ISLA (Instructed Second Language Acquisition). Such a theory needs to look more carefully at the effects of different syllabuses, materials and teacher interventions on students learning in different environments, in order to assess their efficacy, but I’m sure it will begin with the commonly accepted view among SLA scholars that, regardless of context, implicit learning drives SLA, and that explicit instruction can best be seen as a way of speeding up this implicit learning.
At the root of the problem of any empiricist account is the poverty of the stimulus argument. Gregg (2003, p. 101) summarises Laurence and Margolis’ (2001: 221) “lucid formulation” of it:
1. An indefinite number of alternative sets of principles are consistent with the regularities found in the primary linguistic data.
2. The correct set of principles need not be (and typically is not) in any pre-theoretic sense simpler or more natural than the alternatives.
3. The data that would be needed for choosing among those sets of principles are in many cases not the sort of data that are available to an empiricist learner.
4. So if children were empiricist learners they could not reliably arrive at the correct grammar for their language.
5. Children do reliably arrive at the correct grammar for their language.
6. Therefore children are not empiricist learners.
By adopting an associative learning model and an empiricist epistemology (where some kind of innate architecture is allowed, but not innate knowledge, and certainly not innate linguistic representations), emergentists have a very difficult job explaining how children come to have the linguistic knowledge they do. How can general conceptual representations acting on stimuli from the environment explain the representational system of language that children demonstrate? I don’t think they can.
In the next post, I’ll discuss William O’Grady’s version of emergentism.
References
Bates, E., Elman, J., Johnson, M., Karmiloff-Smith, A., Parisi, D., & Plunkett, K. (1998). Innateness and emergentism. In W. Bechtel & G. Graham (Eds.), A companion to cognitive science (pp. 590-601). Basil Blackwell.
Ellis, N. (2002). Frequency effects in language processing: A Review with Implications for Theories of Implicit and Explicit Language Acquisition. Studies in Second Language Acquisition, 24(2), 143-188.
Ellis, N. (2015). Implicit AND Explicit Language Learning: Their dynamic interface and complexity. In P. Rebuschat (Ed.), Implicit and explicit learning of languages (pp. 3-23). Amsterdam: John Benjamins.
Ellis, N., & Schmidt, R. (1997). Morphology and longer distance dependencies: Laboratory Research Illuminating the A in SLA. Studies in Second Language Acquisition, 19(2), 145-171.
Ellis, N., & Wulff, S. (2020). Usage-based approaches to L2 acquisition. In B. VanPatten, G. Keating, & S. Wulff (Eds.), Theories in Second Language Acquisition: An Introduction. Routledge.
Eubank, L., & Gregg, K. R. (2002). News Flash – Hume Still Dead. Studies in Second Language Acquisition, 24(2), 237-248.
Gregg, K. R. (1993). Taking explanation seriously; or, let a couple of flowers bloom. Applied Linguistics, 14(3), 276-294.
Gregg, K. R. (1996). The logical and developmental problems of second language acquisition. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of second language acquisition. Academic Press.
Gregg, K. R. (2000). A theory for every occasion: postmodernism and SLA. Second Language Research, 16(4), 34-59.
Gregg, K. R. (2001). Learnability and SLA theory. In P. Robinson (Ed.), Cognition and Second Language Instruction. CUP.
Gregg, K. R. (2003). The State of Emergentism in Second Language Acquisition. Second Language Research, 19(2), 95-128.
O’Grady, W., Lee, M., & Kwak, H. (2011). Emergentism and Second Language Acquisition. In W. Ritchie & T. Bhatia (Eds.), Handbook of Second Language Acquisition. Emerald Press.
O’Grady, W. (2011). Emergentism. In P. Hogan (Ed.), The Cambridge Encyclopedia of Language Sciences. Cambridge University Press.
Seidenberg, M., & MacDonald, M. (1999). A Probabilistic Constraints Approach to Language Acquisition and Processing. Cognitive Science, 23(4), 569-588.