Prototyping language learning in the metaverse

Using Virbela to imagine the future of language learning

May 21, 2022

Talented researchers have spent their lives studying how humans learn languages. There’s an entire field of study dedicated to second language acquisition (SLA).

Yet successful language learning doesn’t seem to be happening for many students in America’s K-12 schools. The sad truth is that after years of formal second language instruction, students often leave with a limited ability to apply the language in context.

If you’ve spent any time in a middle school language classroom recently, you may have seen some of the reasons why this can be especially difficult in that context.

The problem with SLA in middle school

One issue we’ve seen in middle school language classrooms is the degree of psychological safety most students feel. Middle school students are often nervous about speaking up in class. This is a problem because language learning is an inherently risky business — you don’t know if you know how to say something until you’ve tried. Ideally, classroom cultures embrace failure as part of the learning process but this kind of culture is difficult to create and requires continual support to maintain.

The second issue we’ve seen in middle school language classrooms is the lack of motivation relating to short-term goals on the journey to language mastery. Most students feel some level of motivation to master a language, but in terms of doing the assignment this week or preparing for the quiz on Friday, many students feel less motivated.

This lack of short-term motivation is illustrated by an episode in an ESL class one of us taught as a substitute teacher in Boston Public Schools recently. As class started, a student begged us for permission to continue playing a game on his smartphone because he was about to reach a new level. We can’t help contrasting his motivation to engage with this level of his game with his motivation to engage in his classroom work. In the game he was in charge, moving towards attainable goals at his own pace; he was receiving feedback on his performance and recognition for achievements as he moved towards mastery. In his classroom task — a worksheet on a news article his teacher had pre-loaded for him on Google Classroom — he was not asked to show leadership or make creative decisions; the worksheet was the same one all his peers were doing, so it was unlikely to be the right difficulty level for him. He would probably receive some feedback, but it would likely be divorced from any sort of comment about his progress towards mastery. It would probably be difficult for him to discern the incremental progress he was making as he attended class and participated in activities or filled out worksheets from week to week.

What if we could create an educational experience that put students in the driver’s seat? What if we could give students authentic opportunities to demonstrate their learning in a safe space?

The idea: SpyTalk, a language learning game for the 21st century

As part of Ed Tech Advanced Design Studio, taught by Louisa Rosenheck at HGSE, we were asked to prototype an educational technology designed to address a specific educational problem. Focusing on the problem described above, we iteratively developed SpyTalk, an immersive language game designed to invite middle school students to take risks and engage in verbal activities.

The premise of SpyTalk is simple: If you were suddenly dropped into a foreign country with an important mission to accomplish, you would probably learn to say whatever was needed in that foreign language in order to fulfill your mission. You’ve seen this happen in movies, and perhaps you’ve experienced it yourself, such as when you memorized how to order a meal in Mexico or learned a phrase to express gratitude to your host in Japan.

SpyTalk is designed to accompany traditional classroom instruction or tutoring. We imagine teachers using it as a once-weekly learning performance in which the student can demonstrate what they’ve learned during the week. It could also be used in private tutoring or other learning settings.

What could this look like?

Let’s say you’re a 14-year-old student in Spain learning English. Instead of a quiz on Friday, your teacher asks you to find a private space in the classroom, put on headphones, and open a game on your laptop. As the game opens, it explains that you are a secret agent assigned to hack into the computer of an evil supervillain. Unfortunately, you can only access his computer from his personal office — so you’ll need to impersonate someone he knows and travel to the tiny island where he lives. You’re also told that the other inhabitants of the island know him pretty well, so you should talk to them to get the information you’ll need to guess his laptop’s security questions.

Next thing you know, you’ve left your classroom behind, dressed up your digital avatar, and you’re walking around a virtual world, trying not to stir up any suspicion while gathering clues in conversations with the locals. You’ve only got a limited amount of time before your mission is over, so you need to jump into those conversations and then sneak into that office ASAP. And by the way, make sure you don’t get caught by his security team; they’re a rough lot, and rumor has it that he has a secret prison on the island.

Prototyping with Virbela

We wanted to test whether this experience would be as conducive to language learning as we imagined, but we didn’t have time to build a rich virtual world within the constraints of our semester. Fortunately, we were able to use Virbela to prototype the experience we wanted to provide. Virbela is an engaging virtual world designed for a host of different purposes, and we had previously used it with HGSE professor Chris Dede. Using Virbela allowed us to focus on designing the language learning experience instead of worrying about the tech side of the equation.

Designing level one

We consulted with an ESL teacher in order to design our first level. She suggested we focus the content around simple past tense.

Working collaboratively, we wrote scripts for five characters with whom the students would interact. In each interaction, students would be asked questions that would encourage them to use past tense in a natural way to further their mission.

Providing chat support

In addition to the verbal activities, we wanted each student to have support from someone on his or her team, someone who could provide hints or support for the learning journey (like the support person talking inside of the earpiece in spy films).

How smart should our characters be?

Our team discussed whether the characters in the game should act like NPCs (non-player characters) or human actors capable of improvisation. For our first playtest, we decided to allow improvisation from some characters but to confine other characters to the branching scripts or dialogue models of NPCs in video games. Our hope was to understand the pros and cons of improvisation for students by allowing them to interact with both kinds of virtual characters.
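To make the scripted side concrete, here is a minimal sketch of what a branching script could look like, assuming simple keyword matching on the student’s reply. It is not the script used in the playtest: the node names, keywords, and follow-up lines are hypothetical, and only the opening question is borrowed from Bill Guinn’s greeting in the playtest.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueNode:
    """One NPC line plus the student replies it knows how to handle."""
    line: str
    # Maps a keyword expected in the student's reply to the next node id.
    branches: dict[str, str] = field(default_factory=dict)


# Hypothetical script: node ids, keywords, and follow-up lines are made up.
SCRIPT = {
    "greet": DialogueNode(
        "Tell me about yourself, why are you here on the island?",
        {"visit": "past_trip", "work": "past_job"},
    ),
    # Follow-ups nudge the student toward simple past tense.
    "past_trip": DialogueNode("How nice! Where did you travel last year?"),
    "past_job": DialogueNode("Interesting. Where did you work before?"),
}

FALLBACK = "Sorry, I did not catch that. Could you say it another way?"


def next_line(node_id: str, student_reply: str) -> tuple[str, str]:
    """Return (next node id, NPC line); stay on the same node if nothing matches."""
    text = student_reply.lower()
    for keyword, target in SCRIPT[node_id].branches.items():
        if keyword in text:
            return target, SCRIPT[target].line
    return node_id, FALLBACK
```

A reply such as "I came to visit a friend" moves the conversation to the `past_trip` node, while an unexpected reply leaves the character stuck on its fallback line. That second case is where an improvising human actor behaves differently from a scripted character.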

The playtest

We recruited another friend as a fifth actor since our group consisted of only four members. With all characters represented, we began planning our playtest.

Our first playtester was a 16-year-old student in Tokyo.

After the student joined, she was taken through some onboarding slides by her teacher to understand the spy game’s scenario.

Sample of onboarding slides. Similar slides were shown partially in English and partially in Japanese.

Then she customized her avatar and got ready to explore.

Once onsite, her support team member, Alfred, showed her how to move around the world and then told her he would disappear but always be available via chat.

Our student was now alone. She had her spy panel — an HTML webpage that we built — pulled up on her phone so that she could see which clues she was looking for while navigating around the world. Her task was to find answers to all five clues and then sneak into John Sherman’s office to hack into his local network.

The student wasn’t alone for long before Bill Guinn approached and introduced himself: “Hello there. I thought I already met everyone on this island, but you are a new face! My name is Bill Guinn, I am visiting my son who lives on this island…Tell me about yourself, why are you here on the island?” Bill was a pleasant older gentleman who helped her ease into the game with some small talk.

Next, the student was approached by Alli Appleton, the personal secretary of John Sherman. Alli revealed that she had known John since 2nd grade, at which point the student could ask for more details. (The student knew from her panel that the name of John Sherman’s class pet in 2nd grade was one of the hints needed to break into his system.)

A bit more stress is added to the scenario when John Sherman’s security guards approach. Their questioning raises the tension of the game and its narrative and pushes the user to be agile with language. The user has the autonomy to stay and work things out with security, get help from Alfred (spy support), or escape the situation. The security guards, like the other characters, have information about John Sherman and his passcodes, and it is up to the user to glean this information using their super spy skills (asking questions and listening closely!).

After gathering all the clues, including some non-conversational clues (e.g., the number of chairs in the operations room), the student had to find a way to break into John Sherman’s office. Initially, she wasn’t sure where the office was, but its location was dropped as a hint in other conversations.

As time ran out, the student ran over to John’s office and entered all the clues into her panel. The game was over!
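The checking step behind a panel like this can be sketched in a few lines. This is an illustration only, not the HTML panel from the playtest: the clue ids and the answer values below are hypothetical placeholders, and the sketch covers just the two clues named in this article rather than all five.

```python
def normalize(answer: str) -> str:
    """Ignore case and extra spaces so small typing slips are not penalized."""
    return " ".join(answer.strip().lower().split())


# Hypothetical answer key; the real answers are not given in this article.
ANSWER_KEY = {
    "class_pet_2nd_grade": "biscuit",
    "chairs_in_operations_room": "12",
}


def check_panel(submitted: dict[str, str]) -> dict[str, bool]:
    """Return, for each clue, whether the submitted answer matches the key."""
    return {
        clue: normalize(submitted.get(clue, "")) == normalize(expected)
        for clue, expected in ANSWER_KEY.items()
    }


def mission_complete(submitted: dict[str, str]) -> bool:
    """The hack succeeds only if every clue is answered correctly."""
    return all(check_panel(submitted).values())
```

Returning a per-clue result instead of a single pass or fail would let the support character target hints at the clues a student is still missing.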

Feedback from playtest

We were thrilled with the level of positivity our first playtester expressed: “It was the best English learning experience that I have had this year.” Her teacher interviewed her and translated her comments for us: “I haven’t used English this much in a long time! Usually, I hold back from speaking English at school because I don’t want to stand out.” The game, she said, helped her feel OK using imperfect English. The characters would wait for her to have time to think and form sentences — which is normally quite nerve-wracking in real life. This helped her open up and speak English without even thinking about it: “I didn’t even realize that I was using English so much.”

She also shared that the setting felt engaging and immersive and added, “The spy mission of the game made it extremely engaging. I loved how I was supposed to find answers to these clues. It pushed me to talk to the characters.”

The student’s teacher, who observed her playing the game, commented:

Free conversation is often difficult for ELLs, especially when they are not so extroverted. It is also difficult to feel fully engaged in free conversation because [these conversations] can seem to lack purpose other than developing their English skills. This game provided the sense of purposefulness, which led to my student being engaged and motivated.

In this and other small experiments, participants expressed interest in playing again — which brings us to the major problem with our current approach.

Moving towards scalability: SpyTalk in learning communities

In its current design, SpyTalk is fantastically engaging, but it requires 3–5 adults to deliver. These adult actors must learn a script and background details about their character and then spend 20–30 minutes with each student participant.

The next phase in our project seeks to answer the question — how could we design this game to be scaled to a larger audience?

One concept would be to replace human actors with NPCs (non-player characters). This approach would make the game extremely scalable. In our experiments, if a student knew the vocabulary well, the student had no trouble interacting with characters played by the NPC-like actors (characters using scripted dialogue models). However, if a student struggled with vocabulary or tried to ask a question different from what we had assumed they would ask, we were in trouble. You’ve certainly experienced the same phenomenon when Siri politely tells you, “I can’t help you with that right now.” In these moments, when students were confused or veered from the script, actor improvisations were extremely useful and made the game authentic and real. Perhaps most importantly from an educational perspective, improvisation allowed the game to be customized to a level of difficulty the student could handle with characters providing more or fewer hints as needed.

There’s a temptation to wave one’s hands and say, “Well, AI can’t do this kind of thing yet, but it’s only a matter of time.” But after consulting with experts on NLP, our understanding is that it will be a very long time before even the most advanced AI is able to hold a free and engaging conversation and understand students’ unexpected questions and comments.

There’s always help at Hogwarts

Initially, we had thought of the need for actors in our proposed game as a serious downside, but what if we reframed it as an opportunity?

In the Harry Potter series, Harry and his friends are upset at the rote, lifeless pedagogy mandated by Prof. Umbridge, their Defense Against the Dark Arts teacher during their fifth year of study. Hermione Granger hatches a genius plan — they can teach each other and learn by doing in a secret place on campus. They form a learning club called the DA. While the DA may seem like a revolutionary approach to learning, in reality, this type of mixed-age group learning-by-doing activity is extremely common. Think football, tennis, track and field, theater, chess, martial arts, debate, and most other pursuits where practice matters. Students of various skill levels come together to engage in playing the real game. This is something Jal Mehta and Sarah Fine have written about in their book on deeper learning.

Towards a learning community

What if we built our game in such a way that it gave teachers the tools they need to cultivate a learning community in their schools where older students served as actors to help younger students?

Imagine if the actor’s side of this game was something like an Xbox sports game (think Madden or FIFA) where the actor could easily toggle between characters. When the actor is not using a character, the character performs according to its assigned dialogue model as an NPC. But whenever the game or actor senses that a student needs a more customized approach, the actor steps in and takes over the character. With advances in machine learning now being used by companies like Descript, it would be possible for an actor to type in sentences and have them spoken out in the voice of whichever voice actor initially recorded a character’s NPC dialogue.
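The handoff between NPC and actor can be sketched as follows. This is a rough illustration under stated assumptions, not a design we have built: the `Character` class, the `respond` function, the keyword matching, and the holding line are all hypothetical. Voice synthesis is left out entirely; the sketch only decides who speaks.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Character:
    name: str
    # Hypothetical dialogue model: keyword in the student's line -> scripted reply.
    scripted_replies: dict[str, str] = field(default_factory=dict)
    actor: Optional[str] = None  # human actor currently in control, if any

    def scripted_reply(self, student_line: str) -> Optional[str]:
        """Return the first scripted reply whose keyword appears, else None."""
        text = student_line.lower()
        for keyword, reply in self.scripted_replies.items():
            if keyword in text:
                return reply
        return None


def respond(character: Character, student_line: str,
            actor_line: Optional[str] = None) -> tuple[str, str]:
    """Return (source, line), where source is 'actor', 'npc', or 'needs_actor'."""
    # An actor who has taken over the character always speaks first.
    if character.actor is not None and actor_line is not None:
        return "actor", actor_line
    reply = character.scripted_reply(student_line)
    if reply is not None:
        return "npc", reply
    # No scripted match: stall politely and flag the character for takeover.
    return "needs_actor", "Hmm, give me a moment."
```

For example, a character built as `Character("Alli Appleton", {"school": "I have known John since 2nd grade."})` answers a question about school from its script, while an off-script question returns `needs_actor`, the signal an actor would use to step in.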

So older students, parents, volunteers, or people who want to make a buck in the metaverse could potentially come together to create engaging language simulation experiences for new learners. A company that is already doing something similar today is Mursion, albeit in a more technical and professional setting. Mursion’s platform allows actors to use keyboard shortcuts to toggle between multiple characters so that teachers can practice group teaching or business people can practice negotiating with two clients at once. Our proposal would take inspiration from what Mursion is doing but use this type of performance simulation technology in the language learning context.

By adding in automated characters and story-based scaffolding for actors, we hope to create a game that gives birth to a community of language learners — actors and learners — engaging in puzzle-based spy games and achieving proficiency through situated practice.

If anyone is interested in more information about what we learned from this experience, please contact us — we’d love to talk more.

Article compiled by Allison Williams and Fenton Hughes

Project team: Allison Williams, Ethan Westfall, Fenton Hughes, Przemek Stolarski

Extended team: Gleb Lantsman, Sara Inoue