Farewell to blue books? Language Center shows benefits of digital assessments

The ability to collect, store and analyze digital data from final assessments offers more than convenience. It can enable educators to measure what students are learning and to make needed improvements. 

Stanford students take their end-of-year Oral Proficiency Interview for language courses on May 31, 2022.


By Jonathan Rabinovitz

Having just finished their end-of-year assessments, students in Stanford’s language laboratory pressed a few keys on their Chromebooks, and voilà: Their answers were whisked through the ether to a server, where instructors could instantly begin to assess them.

Next door, just outside Bishop Auditorium, on that same day last May, students in another course dropped their blue books into numbered cardboard boxes as they left their exam. The blue books were then toted back to the department office, where section leaders had to go to pick them up.

“I’ll never understand it,” said Elizabeth Bernhardt, the John Roberts Hale Director of the Language Center and professor of German studies in the School of Humanities and Sciences. “There’s all this time spent keeping track of the blue books, keeping them secure, and then you give them back without any detailed record of what was in them.”

Bernhardt has been building an automated paperless testing regimen for the Stanford language programs for 27 years. Working with university staff experts in learning technologies, she used funds from her endowed chair as language center director to create a system, which she now hopes to share with others on campus.

Bernhardt’s approach offers much in the way of convenience, but the biggest advantage is the data that she and her colleagues can collect and analyze. By consistently evaluating a uniform data set year after year, they can measure how much students are learning in their language courses relative to previous classes, identify areas where courses could be improved, pinpoint students’ strengths and weaknesses, and then make changes. 

“This innovation shows the potential for Stanford to develop tech in support of teaching and learning across the disciplines,” said Matthew Rascoff, vice provost for digital education. “While the entrepreneurial ed tech sector produces amazing products, it’s driven by commercial priorities. There is value in developing solutions in tandem with the needs of educators, and then scaling from within higher education.”

Bernhardt’s approach has yielded strong results. “Data indicate that our programs are significantly ahead of the pace for language learning projected by the Foreign Service Institute,” she said. The FSI, a unit of the U.S. State Department, estimates that 300 to 400 hours of instruction are needed to bring new students in the cognate languages (French, German and Spanish, among others) up to the standard known as “Intermediate Mid.” The Stanford courses do it in 150 hours over an academic year, which is half the low end of that estimate and less than half the high end. Stanford courses in such noncognate languages as Chinese, Russian and Arabic follow the same accelerated pace to reach a comparable standard.

One shared standard

To be sure, such success can be attributed only in part to the technology that allows the tests to be tabulated and analyzed. Adopting such a system required broader changes in the language programs that Bernhardt inherited when she arrived at Stanford in fall 1995. She had a mandate to overhaul how languages were being taught and to make sure that all Stanford students would graduate with some fluency in a foreign language.

But there were a variety of approaches in place, and various ideas about standards. It was not simply a matter of putting a new software system in place.

“Before Elizabeth arrived, languages were all taught in whatever way each literature department saw fit, and literature departments are not always apprised as to what is going on in second language acquisition — their focus is literature, culture, film,” said Alice Miano, lecturer and coordinator of the Spanish language program. “Elizabeth ushered in a common understanding of what second language acquisition is all about and brought that understanding across languages so that I can talk to my colleagues in the Chinese language program or in any language program at Stanford.” 

Language instructors at Stanford all share a common standard for assessing language proficiency. 

Under Bernhardt’s direction, the university’s language-instruction programs embraced the gold standard of language instruction, set by ACTFL (formerly known as the American Council on the Teaching of Foreign Languages), which emphasizes using a language in daily life over learning about its rules. To achieve this shared standard across all languages, Bernhardt had to bring instructors from the various literature departments under one umbrella. She then needed to encourage them all to undergo the rigorous certification process to conduct ACTFL assessments: the Oral Proficiency Interview (OPI) and the Writing Proficiency Test (WPT). Today, at least two of every three benefits-eligible language instructors at Stanford are certified testers on the oral and written assessments. “It is rare in the United States for institutions to even have a handful of instructors with such training,” Bernhardt said.

No other university has a system in which hundreds of students from all the language programs have their digital assessment responses rigorously evaluated online by an in-house team using ACTFL’s criteria.

“It definitely opened new roads to me which continue to transform my teaching,” said Lyris Wiedeman, a senior lecturer and director of the Portuguese language program. “Elizabeth brought a strong theoretical framework to Stanford when she started the language center, and the framework is kept alive in the multiple training opportunities that she gives her instructors, including a methodology class that she leads for all [graduate teaching assistants]. She continues to take steps to develop a strong community among the faculty. She never stops thinking about how the programs could be made better.”

Nina Yuhsun Lin, a lecturer in Chinese language, added: “I would not be where I was today with my teaching and the ACTFL if Elizabeth had not given us these opportunities. I am truly very grateful.”

The technology used to administer the tests, called ‘Blubook,’ is suitable for any subject but has features specific to language assessment.

Final language exams require headphones and microphones

On that day last May at the end of the spring quarter, when a cohort of students were taking their final assessments in the language lab, the room sounded like a 21st-century Tower of Babel, filled with the hum of several dozen voices speaking different languages, including Spanish, German, Chinese and Arabic. Students were equipped with headphones and microphones, and they all were engaged in the same Simulated Oral Proficiency Interview (SOPI), in which they heard recorded questions in whatever language they had been studying.

“Describe your dormitory room,” was a prompt delivered in multiple tongues. When the students answered aloud, their responses were recorded onto the Chromebooks that had been awaiting them at their desks. When the oral part of the test was done, they all moved on to a Writing Proficiency Test (WPT) with questions demanding a written response. Again, they had the same questions regardless of the language. There were no blue books, as they could type out their answers into the computers. Bernhardt points out that students are able to write more using a computer than with pen or pencil, making for a better assessment of what they have learned.

Takeshi Sengiku, assistant director of the digital learning lab, was in the control room monitoring the students taking the assessments, ready to troubleshoot any problems that might arise. A backup protocol was in place to store answers on the Chromebooks’ hard drives in the event there was a problem uploading them to the server. But it did not need to be used. No data was lost across 491 assessments, he said.
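
For readers curious how a fallback of this kind might work, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of the Language Center's actual software: the function name, the `upload` callable, the `student_id` field and the JSON file layout are all hypothetical.

```python
import json
from pathlib import Path


def submit_with_fallback(answer: dict, upload, backup_dir: Path) -> str:
    """Try to send an answer to the server; keep a local copy if that fails.

    `upload` is a hypothetical callable that raises OSError when the
    server cannot be reached. Returns "uploaded" or "saved_locally".
    """
    try:
        upload(answer)
        return "uploaded"
    except OSError:
        # Assumed fallback: write the answer to local storage so it can
        # be retrieved or re-sent once the connection is restored.
        backup_dir.mkdir(parents=True, exist_ok=True)
        path = backup_dir / f"{answer['student_id']}.json"
        path.write_text(json.dumps(answer), encoding="utf-8")
        return "saved_locally"
```

In a sketch like this, the local copy is written only when the upload fails, which matches the article's account that the backup was available but never needed.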

This year, for the first time, students did not have to come at the same time as all their classmates to take their assessments. Instead, students could choose from individual time slots because the exam questions were preloaded and could be easily retrieved at different times. Upon signing in to the exam, students were assigned a desk, and the tech support team arranged to have the appropriate assessment delivered to them. “Students like the flexibility,” Bernhardt said, noting that it also meant that a class session did not need to be devoted to testing, which leaves more precious time for class instruction.

Bernhardt is continuing to innovate. She and her colleagues are looking at new ways to analyze data from the WPT, the writing test. And they also are working to make the assessments available at other times of year besides the end of the spring quarter.

She is pleased, though, with how it’s working and wonders why other departments have not followed suit. She speculates that instructors have gotten used to having students write out answers in blue books and to grading them by hand. “Maybe it’s the mindset of ‘that’s the way we’ve always done it,’” she said. Of course, other departments may have unique challenges.

She mentions, for instance, that different subjects may require different symbols, formats and formulas in written responses, but she is optimistic that such an obstacle could be overcome. “If students can learn to type in Arabic or Russian, they can learn to enter scientific notation,” she said.

Published December 14, 2022
