The Cognitive Underpinnings of Active Multimodal Learning

[approximately 1750 Words]

This is a somewhat atypical blog post, although it does follow an oft-repeated pattern: I was inspired by an assigned task in (yet another!) MOOC that I’m taking. The course in question is “e-Learning Ecologies: Innovative Approaches to Teaching and Learning for the Digital Age,” taught by William (Bill) Cope and Mary Kalantzis, of the University of Illinois Urbana-Champaign. The post is atypical in the sense that I’m not just musing or thinking through the issue posed. Instead, I’m taking a position, making an argument. So… what I’m going to say here amounts to an evidence-based reaffirmation of some of the propositions that Cope and Kalantzis set forth in the course and elsewhere, specifically their insistence on the importance and the real value of multimodal learning. In what follows, I’ll talk a bit about multimodality per se, then turn to a cognitive-science-informed understanding of learning, with arguments and evidence in favor of active multimodal learning. Finally, I’ll offer links to additional information and perspectives, research and websites. In short, I will argue that using multiple modes during active learning is more effective than limiting modalities during the learning process. NB: The following is a revised/adapted version of the post that I made to the Common Ground Scholar site that is associated with the MOOC. The original post may be seen here:


One of the first things that one notices on investigating the notions of multimodality or multimodal learning is that there is a plethora of groupings of “modalities.” For Bill Cope and Mary Kalantzis (e-Learning Ecologies MOOC, week 2, second video, affordance 3B), there are six (or seven) modalities facilitated by digital learning: textual, spoken, sound, visual, tactile/spatial and gestural. For others, though, the modalities that enter into learning can vary. For the folks behind the VARK packaging of pedagogy and “learning styles”, there are four modalities (visual, auditory, read/write and kinesthetic). For Kate O’Halloran, in this scholarly article on “multimodal discourse analysis”, the modes are essentially textual and visual. In this other discussion of the application of Michael Halliday’s Systemic Functional theory of language to multimodality, the range is somewhat broader, since it mentions language (presumably both spoken and written), gesture, proxemics, image, and layout. In a writing textbook that purports to guide students toward effectively expressing themselves multimodally, the authors lay out exactly five “modes of communication”: visual, aural, linguistic, spatial and gestural (Arola, Ball and Sheppard 2014, 4), even though the book itself focuses primarily on writing combined with visual elements (image, layout). Different lines of research in psychology or linguistics point toward yet other sets of modalities, like this lab focusing on “multimodal language” (which includes both gestures and sign language), or this paper, which summarizes research on “Cognition, multimodal interaction and new media,” including gestures, gazing, interpreting visual cues, forming mental images, writing, and so forth. Clearly, there are a number of different ways to define the sensory channels, dimensions of meaning or genres of expression that the term “multimodal” refers to. For present purposes, a “mode” amounts to any socially recognized channel through which meaning can be expressed or interpreted. Multimodality, then, is any combination of two or more distinct channels of communication.

Learning, from a Cognitive Science Perspective

What we are beginning to learn, as we explore the functioning of the brain using increasingly precise tools and techniques like functional magnetic resonance imaging (fMRI), is that the dynamic human brain is extremely complex. Many distinct neural circuits are implicated in a complex and coordinated way to perform what seem like simple or unitary functions. Reading, for example, seems like it is just one kind of cognitive activity. However, it involves a multitude of interconnecting functions that use different neural circuits: those that transmit stimuli to the brain, those that transform such input into perception of marks on the page, those that recognize the patterns of marks as meaningful, and those that assign particular meanings to particular sets of marks (words, sentences, paragraphs, punctuation, blank spaces). Still other circuits then juggle a multiplicity of meanings to form a coherent internalized version of the intended or presumed message. And still other neural pathways and brain structures function in complex ways to connect that interpretation to other textual or paratextual messages, to previous learning, and/or to current understandings and perceptions of the world. Reading, in and of itself, is a highly complex and multimodal activity. So are activities like conversing with a fellow student, viewing a video, listening to a song, drawing, laughing, navigating a crowded street, or simply walking. Our brain constantly manages, and harmonizes, a multitude of neural modalities: sensory inputs, channels of perception, interpretation, emotion, volition and action. And we are able to manage all of these complex neuro-somatic activities, which we do continuously, because we do them largely without thinking about them. The brain is working hard, constantly and automatically, just to sustain life, bodily functioning, cognition and consciousness.

When it comes to learning, what we mean by “multimodality” is a bit different from what I called neural modalities. In learning, the “modes” of meaning or modes of communication are the complex groupings of cognitive functions that our brain can do while on autopilot. Generally speaking, we do not have to consciously recall a large set of rules about writing before we mark symbols or words on a page or begin typing at a keyboard. We do not need to consciously recall all of the rules of long narrative genres as we navigate from sentence to sentence or paragraph to paragraph in Pride and Prejudice or Gravity’s Rainbow. Much of the intellectual work that we do as thinkers, writers or learners draws on our previous learning, our previous training and our previous habitual conditioning. In short, we depend on what have become ingrained patterns of behavior or knowledge.

On the other hand, new learning is the setting up and reinforcing of a new neural pattern. It is the creation of a novel set of interrelated brain processes that trace preliminary synaptic connections. For learning to be effective in the long term, those connections must be reinforced until they “stick.” That is, the new learning must initially take place largely outside of “autopilot mode” and must be consciously rehearsed. Indeed, as it turns out, one of the conditions that makes for effective long-term learning is a high level of cognitive effort or, to put it another way, learning in a way that requires struggle. The more conscious effort we put into the process of inscribing the new neural patterns of our learning, the more likely it is that such learning will remain retrievable to our conscious mind. This is one of the fundamental ideas in Make It Stick: The Science of Successful Learning (Brown, Roediger & McDaniel 2014). Now, one way to make new learning, and the subsequent rehearsal of what we have learned, far more challenging is to disrupt the “autopilot” pattern (easy-to-use, unconscious single communication modalities) when we approach new material, when we seek to comprehend or master it, and when we then note, recap, recall and explain what we have learned. By using multiple modalities during new learning, during rehearsal of that learning and during production of artifacts based on new learning, one does two things. First, one makes the processes involved in learning (and in reinforcing and recalling learning) more cognitively challenging, because doing so forces the brain to juggle or switch among modalities. Second, one adds robustness to the retrieval of new learning, because the new pattern is inscribed in the brain in multiple interconnected but slightly different ways. Rehearsing what one has learned via one of the modalities helps support and reinforce its recall via other modalities. That is, using multiple modalities can mutually reinforce and strengthen multiple paths for retrieving what was learned. To be clear, there is significant evidence in cognitive scientists’ emerging views of learning that supports the use of multiple modalities in learning.

(Side Note about “Learning Styles”)

Much has been made in the past of so-called “learning styles.” Cognitive science research does not support this framework for learning because there is no evidence to suggest that it is effective to customize learning activities or modalities to fit a particular student’s ostensible “learning style”. While it is true that some students may have a preference for hearing information and others for reading it, research strongly suggests that, absent particular disabilities or neurocognitive anomalies, students can learn equally well using a variety of modalities. Rather than focusing on a single learning modality in the case of any particular student, it is better to encourage use of a variety of modalities, preferably in multimodal ways, to increase cognitive load and to necessitate greater care and attention in attending to the learning tasks. What is more, learning is more effective when learning and recall are meaningful. For that reason, I would suggest, first, that it is appropriate to allow students to choose their own — meaningful — modalities for learning, for memory reinforcement and for communication of learning and, second, that multimodal learning practices are effective in ways that mono-modal learning-style-focused learning practices are not (see Brown, Roediger & McDaniel 2014, particularly chapter 6, “Get Beyond Learning Styles”).

Multimodal Learning (theory, sources, research, further reading)

All of this is intended primarily to offer support for Cope and Kalantzis’s model, which focuses on the virtues of multimodal learning, while adding some nuance to the notion by drawing on my own (admittedly sketchy) understanding of cognitive science perspectives on learning. I also note that many other learning theorists and education visionaries have their own takes on multimodal learning. Here, I’ll add some links to a few additional resources:

Click the following link for an explanation of “multimodal literacy” on a WordPress site for the “Multimodal Literacy Learning Community” created by Victor Lim Fei, Deputy Director, Technologies for Learning, Educational Technology Division, Ministry of Education, Singapore:

A 2007 paper on computer-mediated multimodality in the classroom:

An entry on Gunther Kress, another proponent of taking multimodalities into account in pedagogy and learning theories:

A downloadable dissertation by Kevin R. Cassell, “A Phenomenology of Mimetic Learning and Multimodal Cognition” (2014):

A recent workshop position paper by Anne Marie Piper about the pedagogical affordances of multimodal tabletop displays: interfaces workshop_ampiper.pdf

And, finally, of course, the wiki page on “multimodality”:

My apologies for the length of this update. I guess my cognitive enthusiasm was more powerful than my sense of proportion and restraint! -Robert


Arola, Kristin L., Ball, Cheryl E., & Sheppard, Jennifer. (2014). Writer/Designer: A Guide to Making Multimodal Projects. Boston, MA: Bedford/St. Martin’s.

Brown, Peter C., Roediger, Henry L., & McDaniel, Mark A. (2014). Make It Stick: The Science of Successful Learning. Cambridge, MA: Belknap Press.
