In the last decade, speech recognition has moved from the margins to the mainstream, with rapid improvements in accuracy and growing interest in its classroom potential. At the same time, national reading scores show that many students continue to struggle with fluency and comprehension, reinforcing the need for new tools that can provide timely, individualized reading support.
As Automated Speech Recognition (ASR) technology becomes more advanced and accessible, its role in literacy instruction is expanding quickly, offering new opportunities to enhance how students practice and develop core reading skills.
The Evolution of Automated Speech Recognition
The history of ASR dates back to 1952, when Bell Laboratories designed a system that could recognize the digits 0 to 9 with an impressive 90 percent accuracy rate, but only when spoken by the device’s developer, K. H. Davis. The 1980s then saw several breakthroughs. Most importantly, a statistical method called the hidden Markov model revolutionized speech modeling by enabling the prediction of phonemes, the smallest units of sound that distinguish one word from another in a language, like the k in cat versus the b in bat. Recognizing phonemes is also an essential building block in learning to read, helping learners connect spoken sounds to written letters.
Over the following decades, ASR advanced steadily, delivering faster response times and more accurate recognition of natural speech patterns such as pauses, accents, and informal phrasing. In the 1990s, ASR began to find its way into educational applications with the launch of Carnegie Mellon University’s Project LISTEN, an effort to develop a computerized tutor that could listen to children read.
This article is based on two workshops from the Rethinking Reading: AI for Literacy Achievement workshop series, a set of webinars on education AI applications organized in collaboration with InnovateUS and the Burnes Center for Social Change at Northeastern University.
Interested in Learning More?
- Check out the full workshops here: Reading Out Loud, Growing Strong: AI Tools for Fluency Development and Smart Literacy Instruction: Speech Recognition to Teach Reading
- Explore more topics through other webinars featured in the Rethinking Reading series.
In the 2010s, ASR development made significant strides thanks to greater computing power, the rapid rise of smartphones, and the emergence of prominent ed tech companies such as Amira Learning and SoapBox Labs. During this period, Google’s algorithms reached 95 percent word accuracy in English, roughly on par with human performance. The addition of speaker recognition, which allows systems to differentiate between voices, was also a key milestone. Today, advanced AI is further boosting the accuracy, speed, and capabilities of ASR and expanding its practical applications across sectors. These improvements are making ASR an increasingly valuable tool in classrooms, where it can support literacy instruction and fluency development.
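For readers curious about what a figure like 95 percent word accuracy means in practice, here is a minimal sketch. It assumes the common convention that word accuracy is one minus the word error rate, where the word error rate counts the substitutions, insertions, and deletions needed to turn the system's transcript into the reference text, divided by the number of reference words. The function names and the sample sentence are illustrative only, and the article's source does not specify how the reported figure was measured.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from transcript
                curr[j - 1] + 1,     # extra word in transcript
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """One minus the word error rate (assumed convention, see lead-in)."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Hypothetical example: a 20-word passage with one misrecognized word.
spoken = ("the quick brown fox jumps over the lazy dog and then runs "
          "far away into the deep dark green forest")
recognized = spoken.replace("deep", "deer")
print(round(word_accuracy(spoken, recognized), 2))
```

Under this convention, one wrong word in a 20-word passage corresponds to 95 percent word accuracy. For a reading tutor, even that single error can matter if it falls on the word a student actually misread.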
AI Reading Assistants for Students
Nearly 60 percent of English and language arts teachers recently surveyed by The Learning Agency say they would consider using speech recognition tools to support literacy instruction, and another 25 percent are open to the idea. This growing interest underscores the potential of today’s AI reading assistants, which build on decades of ASR development to provide personalized, interactive support. These tools combine advanced speech recognition with research-based instructional methods to help students improve fluency, vocabulary, and comprehension through real-time feedback and engaging reading practice.
Amira Learning and LUCA.ai are two examples of tools that leverage ASR for reading support.
- Amira Learning is a reading assistant built on cognitive science and the Science of Reading framework, combining assessment, instruction, and tutoring. Its Assess, Instruct, Tutor (AIT) method starts with a 15–20-minute assessment that serves as a benchmark and progress monitor, followed by personalized instruction and real-time tutoring. Amira offers bilingual and Spanish literacy programs that use native-language support and cognates to strengthen English proficiency and fluency. Developed with researchers at Carnegie Mellon University and Columbia University, Amira has been shown in studies to improve student vocabulary almost as effectively as one-on-one tutoring.
- LUCA.ai is a digital reading tutor that combines immersive reading experiences with personalized AI-driven support, designed with input from researchers at the Yale Center for Dyslexia and Creativity, the Dyslexia Library, and Texas A&M University. Its tools draw on research emphasizing the importance of targeted interventions for learners with dyslexia, focusing on challenges like morphological awareness and letter-sound recognition. Features include StoryLab, which generates personalized stories based on each reader’s interests and level, and LUCAListens, which uses advanced speech recognition to deliver real-time feedback on difficult letter-sound pairs, supporting improved reading comprehension.
Additional tools include Scholastic’s Ready4Reading, a digital phonics system powered by SoapBox Labs; ClearFluency; Google’s Read Along; and Microsoft’s Reading Coach.
The Future for Voice Recognition Software in Classrooms
As ASR tools continue to develop, several critical research questions are emerging around how to make speech recognition tools for literacy more inclusive and accurate.
A central focus is improving performance across dialects, accents, and languages. This remains a significant challenge: over 7,000 spoken languages exist worldwide, and English alone has more than 160 dialects. A 2020 study found that 66 percent of users reported accent- or dialect-related issues when using voice technology. These challenges are even more pronounced for young children, whose voices have greater acoustic variability. Literacy tools for young learners also require phoneme-level precision to deliver useful feedback, whereas general-purpose commercial tools can usually work well for adults without that level of detail. To close these gaps, developers are working to optimize existing techniques, enhance usability, and ensure that tools meaningfully support literacy development across diverse linguistic backgrounds.
Inclusivity research is also expanding to cover children with learning and speech disabilities, who have faced significant barriers with ASR tools. Advances like voice profiling, which improves recognition of younger users’ speech patterns, mark real progress, but much of this work is still emerging. Ensuring ASR tools are effective for children with varied needs remains a key research priority as these tools become more embedded in classrooms.
These improvements must also address privacy and security. Voice recordings are biometric data, especially sensitive when it comes to children, and collecting the large datasets needed to train ASR tools raises ethical and legal challenges. For example, many parents are concerned about allowing voice collection at all or publishing recordings for open use. Balancing privacy protections with the heavy investment of time and funding needed to collect and store data is complex. Technical issues like background noise, regional jargon, and hardware limitations also affect performance, underscoring the ongoing need for human oversight and sustainable infrastructure.
Beyond these technical developments, innovators are also considering how ASR can best be applied pedagogically. The ideal form and timing of feedback, as well as the types of interfaces that most effectively engage children, are equally important areas of continued exploration. For example, platforms like Amira Learning use animated storytelling interfaces that pause when a student struggles to pronounce a word, provide feedback, and prompt the student to try again.
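To make the idea of word-level feedback concrete, the sketch below compares the passage a student was asked to read with a transcript of what a speech recognizer heard, and lists the passage words that were skipped or replaced. This is a simplified illustration under stated assumptions, not a description of how Amira Learning or any other product works: the function name `flag_misread_words` and the sample sentences are hypothetical, punctuation is assumed to be already removed, and real reading tutors typically work with phoneme-level detail and audio timing rather than plain text.

```python
from difflib import SequenceMatcher


def flag_misread_words(passage: str, transcript: str) -> list[str]:
    """Return passage words that were skipped or replaced in the transcript."""
    expected = passage.lower().split()
    heard = transcript.lower().split()

    # Align the two word sequences; autojunk is disabled so that frequent
    # words such as "the" are never ignored during alignment.
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)

    flagged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace": the student said something else for these words.
        # "delete": the student skipped these words entirely.
        if tag in ("replace", "delete"):
            flagged.extend(expected[i1:i2])
    return flagged


# Hypothetical example: the reader says "map" instead of "mat".
print(flag_misread_words("the cat sat on the mat", "the cat sat on the map"))
```

A tool could use a list like this to decide which words to revisit, while the open questions of when to interrupt and how to phrase the feedback remain design choices for educators and researchers.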
At The Learning Agency, we are exploring ASR through research reviews, expert interviews, teacher surveys, and other publications focused on children’s ASR data and modeling. Through these efforts, we aim to increase understanding of both the challenges and opportunities in the field. Given our experience managing education-based data science challenges, we are also supporting the development of an upcoming competition focused on ASR for children.
This year, the Tools Competition, a Renaissance Philanthropy program operated by The Learning Agency, also featured a new Dataset Prize, which supports the open-access release of innovative education datasets. The prize could fund the release of new ASR datasets, furthering research in the field.
Growing Number of Ways to Improve Reading Skills with ASR Technology
Speech recognition technology offers new ways to support literacy through instant feedback on students’ pronunciation, fluency, and comprehension. These tools help spot specific reading challenges early, making it easier for educators to provide focused support and monitor progress over time.
As the technology improves, ASR can continue to make personalized reading practice more accessible and help more students overcome common barriers. By building core skills through frequent practice and clear feedback, ASR has the potential to be a strong ally for teachers and parents looking to help readers improve their reading confidence and ability.

L Burleigh
Data Analyst

Cas Burns
Warwick Consulting
Articles by guest or contributing authors do not necessarily reflect the views of The Learning Agency, our clients, or our funders.