Heart and Parcel

Heart and Parcel was founded in 2015 by Clare Courtney and Karolina Koścień with the aim of supporting people learning English in their local communities and “forging connections by developing English language and communication skills through the medium of food”. They run a variety of projects “which use food and cooking as a way for participants to gather together, connect and to share their past stories, experiences and lives with others, whilst practising and developing their English language skills”.

An Example

One of their most popular courses is “From Home to Home”, an online course which teaches English by giving a series of cooking classes. The “Main Class”, Cooking and English, is given every week, followed, two days later, by a “Post-class Discussion Class”, where students practice their English with other students and qualified teachers. “Extra Study” is provided by “Homework” (vocabulary and grammar exercises about the food and recipe from the main class) and a “WhatsApp Discussion Group” (share recipes, videos, photos of food and communicate in English with other students and qualified teachers).

More Good Stuff

Additionally, Heart and Parcel run fundraising public food events such as supper clubs, markets, catering, and private workshops. They also publish a collaborative cookbook “working with participants to share recipes and stories from their communities, cultures and lives”.

And it doesn’t stop there. They explore “learning opportunities through food” by looking at budgeting, healthy meals, using food for social change, employability and community cohesion through communal eating. As Clare told me, “Most importantly, we have found that building connections and networking has been wholly beneficial for the learners, being exposed to different settings, contexts with us (our learners can volunteer with us to do catering events, market stall cookalongs, presentations) thus creating further opportunities to meet people who are more within their interests, line of work or study”. They are also committed to “learner progression through to paid positions in our project and the opportunity to teach and to develop the skills of newer learners who enter our programme”.

I won’t bother to say why I think this is such an inspirational project because I think it speaks for itself. It will be described and discussed in the forthcoming book on ELT by Mike Long and me, which I’ve mentioned a few times already. The best thing you can do is visit the website.

My thanks to Clare for taking the time to tell me about the thinking behind the project and how it’s developing.

The effects of multimodal input on second language learning

There’s a sudden buzz in SLA research – reports of studies on multimodal input abound, and a special issue of Studies in Second Language Acquisition (42, 3, 2020) is a good example. Another is the special issue of The Language Learning Journal (47, 2019). Below is a quick summary of the Introduction to the SSLA special issue. I’ve done little more than pick out bits of the text and strip them of the references, which are, of course, essential in giving support to the claims made. If you don’t have access to the journal, get in touch with me for any of the articles you want to read.


Mayer’s (2014) cognitive theory of multimedia learning states that learning is better when information is processed in spoken as well as written mode, because learners make mental connections between the aural and visual information, provided there is temporal proximity. Examples in the domain of language learning are:

  • storybooks with pictures read aloud,
  • audiovisual input,
  • subtitled audiovisual input,
  • captioned audiovisual input,
  • glossed audiovisual input.

What these types of input have in common is the combination of pictorial information (static or dynamic) and verbal input (spoken and/or written). Most of these input types combine not two but three sources of input:

  1. pictorial information,
  2. written verbal information in captions or subtitles, or in written text, and
  3. aural verbal input.

It could be argued that language learners might experience cognitive overload when engaging with both pictorial and written information in addition to aural input. However, eye-tracking research has demonstrated that language learners are able to process both pictorial and written verbal information on the condition that they are familiar with the script of the foreign language.

In addition to imagery, there are other advantages inherent in multimodal input, and in audiovisual input in particular. Learners need fewer words to understand TV programs than to understand books. Webb and Rodgers (2009a, 2009b) proposed that knowledge of the 3,000 most frequent word families plus proper nouns is needed to reach 95% coverage of the input. However, the lexical coverage figures for TV viewing have recently been found to be lower, so the lexical demands are not as high as for reading (knowledge of the 4,000 most frequent word families for adequate comprehension and 8,000 word families for detailed comprehension). Rodgers and Webb (2011) also established that words are repeated more often in TV programs than in reading, especially in related TV programs, which is beneficial for vocabulary learning. Another advantage is the wide availability of audiovisual input via the Internet and streaming platforms. It can thus easily provide language learners with large amounts of authentic language input (Webb, 2015). Finally, language learners are motivated to watch L2 television, as has been well documented in surveys on language learners’ engagement with the L2 outside school.
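For readers unfamiliar with the term, “lexical coverage” is simply the proportion of running words in a text that a learner already knows. The sketch below is only an illustration of that arithmetic, not the method used in the corpus studies: it counts individual word forms rather than word families, it ignores proper nouns, and the function names, the tokenizer and the tiny word list are all made up for the example.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_coverage(text, known_words):
    """Return the share of running words in `text` found in `known_words`.

    Assumption: `known_words` is a set of lowercase word forms. Real coverage
    studies group forms into word families (e.g. watch, watches, watched),
    which this sketch does not attempt.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    known = sum(1 for token in tokens if token in known_words)
    return known / len(tokens)


# Hypothetical data: a one-line "transcript" and a learner's known words.
transcript = "The cat sat on the mat"
known = {"the", "cat", "on"}

coverage = lexical_coverage(transcript, known)
print(f"Coverage: {coverage:.1%}")
print("Reaches the 95% threshold:", coverage >= 0.95)
```

With a real frequency list in place of the toy set, the same calculation shows how many of the most frequent words are needed before coverage of a given transcript reaches 95%.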


Previous research into language learning from multimodal input has focused on three main areas: comprehension, vocabulary learning, and, to a lesser extent, grammar learning. A consistent finding in this area is that audiovisual input is beneficial for comprehension, in particular when learners have access to captions. Captions assist comprehension by helping to break down speech into words, thus facilitating listening and reading comprehension. Crucially, a unique support offered to learners’ comprehension by multimodal input is imagery. Research into audiovisual input has shown that imagery can work as a compensatory mechanism, especially for low-proficiency learners.

The bulk of research into multimodal input has focused on vocabulary learning. A seminal study on the effect of TV viewing on vocabulary learning is Neuman and Koskinen’s 1992 study. They were among the first to stress the potential of audiovisual input for vocabulary learning. It was not until 2009 that the field of SLA started to pay more attention to audiovisual input. Two key studies were the corpus studies by Webb and Rodgers (2009a, 2009b), which showed the lexical demands of different types of audiovisual input. They argued that in addition to reading, audiovisual input may also be a valuable source of input for language learners. Since then, the field of SLA has witnessed a steady increase in the number of studies investigating vocabulary learning from audiovisual input. While most research into audiovisual input focused on the efficacy of captions, fewer studies focused on noncaptioned and nonsubtitled audiovisual input. Research has also moved from using short, educational clips to using full-length TV programs. Finally, in addition to studying the effectiveness of multimodal input for vocabulary learning, research has also started to study language learners’ processing of multimodal input (e.g., looking patterns of captions or pictures) by means of eye-tracking. Together, there seems to be robust evidence that language learners can indeed pick up unfamiliar words from multimodal input and that the provision of captions has the potential to increase the learning gains.

Research into the potential of multimodal input has been gaining traction, but the number of studies is still limited and mainly confined to vocabulary learning. Now that research into multimodal input is starting to broaden its focus to different aspects of learning as well as its research techniques, the present issue provides an up-to-date account of research in this area with a view to including innovative work and a range of approaches.

The special issue pursues new avenues in research into multimodal input by focusing on pronunciation, perception and segmentation skills, grammar, multiword units, and comprehension. In addition, it extends previous eye-tracking research by investigating the effects of underresearched pedagogic interventions on learners’ processing of target items, target structures, and text. The studies nicely complement each other in their research methodologies and participant profiles. The special issue comprises six empirical studies and one concluding commentary. Between them, the studies cover:

  • Different types of input (TV viewing with and without L1 or L2 subtitles, reading-while-listening, reading, listening);
  • Different types of captioning (unenhanced, enhanced, no captioning);
  • Different components of language learning (single words, formulaic sequences, comprehension, grammar, pronunciation);
  • Different mediating learner- and item-related factors (e.g., working memory, prior vocabulary knowledge, frequency of occurrence);
  • Different learning conditions (incidental learning, intentional learning, experimental and classroom-based) and time conditions (short video clips, full-length TV programs, extensive viewing);
  • Different research tools (eye-tracking, productive and receptive vocabulary tests, comprehension tests).

I should say that I’ve excluded the parts on grammar learning. Here’s an extract:

Research into grammar learning through multimodal input is very scarce. More recent studies involving captions and grammar in longer treatments have provided evidence of positive benefits for L2 grammar development in adults, especially when captions are textually enhanced. However, results have not been similarly positive for all target structures, suggesting the influence of other factors such as the structure-specific saliency of a grammar token.

This sudden surge of interest in multimodal input is obviously, in part anyway, a response to the growth of online teaching forced on us by the Covid-19 pandemic. To me, it looks like a very promising development, particularly as a possible answer to the question of how to encourage inductive “chunk” learning.