Now That ChatGPT’s Been Introduced, It’s Time To Fine Tune It

It’s been a year since ChatGPT arrived – landing in classrooms and causing leaders to reimagine education. Of course, there was AI before ChatGPT, but much like there was basketball before Michael Jordan, the game hasn’t been the same since.

Following its initial rollout period, researchers from Stanford, Worcester Polytechnic Institute, and the University of Toronto, among others, have been studying AI models in education. In three recently published studies, they outline what works when it comes to ChatGPT – and what does not.

Notably, the new studies suggest that ChatGPT needs to be trained on education-specific data and targeted to specific contexts if it is to truly improve learning outcomes.

To be sure, ChatGPT can be a boon for students and teachers alike, freeing them from mundane tasks and alleviating inefficiencies. It may herald breakthrough solutions in scalable learning and help mitigate inequities, but it could also exacerbate them. The new research helps weigh these possibilities, digging closely into the implications and execution of ChatGPT for learning technology.

New Kid On the Block

When the AI-powered large language model (LLM) ChatGPT hit the scene in late 2022, it elicited everything from fear to fervent excitement. Some educators continue to welcome its potential to help streamline and scale lessons, bear the burden of cumbersome administrative tasks, and even engage students with the allure of new tech.

Other educators approached the introduction of ChatGPT with more caution – expressing concerns that the technology could lead to cheating, compromise critical thinking skills, push out erroneous information or even replace the work that teachers do.

More recently, researchers have set out to study the implications of LLMs and AI in a variety of scenarios, subjects, and skill sets – with mixed results.

The first study, released last year by a team at Worcester Polytechnic Institute, is titled “Comparing Different Approaches to Generating Mathematics Explanations Using Large Language Models.” It looked at whether LLMs could quickly produce math problem explanations in order to expedite timelines for adding new math lessons to online learning platforms.

Specifically, the team explored the “possibility of large language models, specifically GPT-3, to write explanations for middle-school mathematics problems, with the goal of eventually using this process to rapidly generate explanations for the mathematics problems of new curricula as they emerge, shortening the time to integrate new curricula into online learning platforms.”

So what happened? The team took two approaches. The first attempted to “summarize the salient advice in tutoring chat logs between students and live tutors.” The second approach “attempted to generate explanations using few-shot learning from explanations written by teachers for similar mathematics problems.”
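The second approach amounts to assembling a prompt from teacher-written examples before asking for a new explanation. A minimal sketch of that idea in Python – the function name and the example problems are illustrative assumptions, not taken from the study:

```python
# Few-shot prompting sketch: pair worked (problem, explanation) examples,
# then ask the model to continue the pattern for a new problem.
# Content here is invented for illustration, not from the WPI study.

def build_few_shot_prompt(examples, new_problem):
    """Assemble a few-shot prompt from (problem, explanation) pairs."""
    parts = []
    for problem, explanation in examples:
        parts.append(f"Problem: {problem}\nExplanation: {explanation}\n")
    # The trailing bare "Explanation:" invites the model to complete it.
    parts.append(f"Problem: {new_problem}\nExplanation:")
    return "\n".join(parts)

examples = [
    ("What is 3/4 + 1/8?",
     "Rewrite 3/4 as 6/8 so both fractions share a denominator, "
     "then add the numerators: 6/8 + 1/8 = 7/8."),
]
prompt = build_few_shot_prompt(examples, "What is 2/3 + 1/6?")
print(prompt)
```

The assembled string would then be sent to the model; in a zero-shot variant, the examples list would simply be empty.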

Ultimately, teachers outperformed LLMs. The study’s authors concluded: “In the future more powerful large language models may be employed, and GPT-3 may still be effective as a tool to augment teachers’ process for writing explanations, rather than as a tool to replace them.”

Meanwhile, researchers at Stanford explored another aspect of LLMs – namely their ability to become good coaches or trainers for teachers. One of the challenges facing teachers is a lack of high-quality coaching – a fundamental component of teacher training and something that requires classroom observation and knowledgeable feedback.

But amid teacher shortages and resource challenges, a majority of teachers don’t have access to such coaching and training expertise. Education leaders are studying whether ChatGPT can help by becoming a sort of automated teacher coach – essentially providing cost-effective, scalable support for educators via generative AI.

The Stanford research team conducted “three teacher coaching tasks for generative AI: (A) scoring transcript segments based on classroom observation instruments, (B) identifying highlights and missed opportunities for good instructional strategies, and (C) providing actionable suggestions for eliciting more student reasoning.” They recruited expert math teachers to evaluate the zero-shot performance of ChatGPT on each of the three tasks for elementary math class transcripts.
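Zero-shot here means the model receives only an instruction and the transcript segment, with no worked examples. The three coaching tasks could be framed as prompts along these lines – the wording below is a hypothetical sketch, not the prompts the Stanford team actually used:

```python
# Illustrative zero-shot prompts for the three coaching tasks:
# (A) scoring, (B) highlights/missed opportunities, (C) suggestions.
# Task wording is assumed for illustration, not quoted from the study.

TASKS = {
    "score": "Score this transcript segment from 1 (low) to 7 (high) "
             "on the quality of classroom discourse, and justify the score.",
    "highlight": "Identify one highlight and one missed opportunity "
                 "for good instructional strategies in this segment.",
    "suggest": "Suggest one concrete, actionable change the teacher "
               "could make to elicit more student reasoning.",
}

def zero_shot_prompt(task, transcript):
    """Combine a task instruction with a transcript, with no examples."""
    return f"{TASKS[task]}\n\nTranscript:\n{transcript}"

segment = "Teacher: What is 12 divided by 4?\nStudent: Three."
print(zero_shot_prompt("suggest", segment))
```

The expert evaluators in the study were then asked to rate the model's responses to prompts of this kind.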

The “Is ChatGPT a Good Teacher Coach?” study found that while the potential is there and ChatGPT offered relevant suggestions, those suggestions were neither novel nor particularly insightful. In fact, the human experts reached the same conclusions earlier and more effectively, and 82% of the model’s suggestions pointed to strategies the teachers had already implemented.

In short, much like the previously discussed middle-school math study, the researchers found that there is significant work to be done to make ChatGPT work in the envisioned capacity.

Finally, last month, researchers at the University of Toronto and Microsoft published the results of their work investigating how exposure to LLM-based explanations affects learning. The study, “Math Education with Large Language Models: Peril or Promise?,” included 1,200 participants and sought to capture insights into how large language models might serve as scalable, personalized tutors for students.

In the experiment’s learning phase, participants received practice problems. (Participants – including undergraduate students – were recruited via Amazon’s Mechanical Turk.) The study’s design manipulated and assessed two key factors:

1. Whether participants were required to attempt a math problem before or after seeing the correct answer.
2. Whether participants were shown only the answer or were also exposed to an LLM-generated explanation of the answer.

All participants were later tested on new questions to assess how well they had learned the underlying concepts.
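Crossing those two factors yields a classic 2x2 between-subjects design with four conditions. A minimal sketch of such a design – the factor labels are paraphrased from the description above, and simple random assignment stands in for whatever balancing procedure the study actually used:

```python
# Sketch of a 2x2 between-subjects design: two factors, two levels each,
# crossed into four conditions; each participant gets one condition.
import itertools
import random

FACTORS = {
    "timing": ["attempt_first", "answer_first"],
    "feedback": ["answer_only", "answer_plus_llm_explanation"],
}

# Cartesian product of the factor levels -> the four experimental cells.
CONDITIONS = list(itertools.product(*FACTORS.values()))

def assign(participants, seed=0):
    """Randomly assign each participant to one of the four conditions."""
    rng = random.Random(seed)  # seeded for reproducibility
    return {p: rng.choice(CONDITIONS) for p in participants}

groups = assign(range(1200))
```

A real study would typically counterbalance so each cell gets an equal share of participants; `rng.choice` is the simplest stand-in.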

This study was more positive than the previous two. Overall, its authors concluded, “We found that LLM-based explanations positively impacted learning relative to seeing only correct answers.”

The authors remained cautious, and the benefits were greatest for those who attempted to solve problems on their own first. Still, the positive trend held even for participants who saw the LLM-generated answers before attempting the practice problems themselves.

Exposure to LLM explanations “increased the amount people felt they learned and decreased the perceived difficulty of the test problems,” the study’s authors found.

LLMs And The Power of Yet

So what does this all mean? The studies suggest two things. One, ChatGPT can be better than nothing, as was the case in the Toronto study. Two, for ChatGPT to do teacher-level work, it will need to be trained on education-specific datasets.

More broadly, the studies suggest that the excitement over LLMs is warranted, at least for now. Think of the motivational wall poster that many students encounter in school, advocating “The Power of Yet.” Such is the case with ChatGPT and LLMs.

Early research indicates that ChatGPT may not be capable of offering large scale, actionable insights for teachers – yet. Nor can it equitably and honestly fast track online learning programs that help address complex educational challenges – yet. And it may not be capable of providing out-of-the-box teacher training solutions – yet. But the potential is there with more fine tuning, more research, and more time.

In particular, the field needs more datasets to train ChatGPT so that it can perform better. Such datasets will also be key to figuring out issues of equity. Do LLMs perform better for some students than others? Do LLMs have issues of bias?

Just like a sports team working out the kinks and finding its rhythm with a key new player, the initial rollout of ChatGPT and generative AI may be rocky, but big wins are possible, even if they aren’t quite here… yet.

This article first appeared on Forbes.com

Ulrich Boser, CEO, The Learning Agency