
Improving Student Writing Through AI: The Impact of the PERSUADE Dataset

The Cutting Ed
  • June 23, 2025
Kennedy Smith

Like many good ideas, the Feedback Prize — a data science initiative aimed at improving how educators provide writing support — started over a cup of coffee. 

It was 2019, and Kumar Garg, who was then working for Eric Schmidt, suggested that someone should pull together a large corpus of student essays for AI development. Kumar’s insight was a simple and prescient one: there were far too few AI-ready datasets for key domains of K-12 education.

It’s been more than five years since that coffee in the lobby of a Washington hotel, and since then, the impact of the Feedback Prize has been sprawling. The resulting datasets have been downloaded thousands of times and cited in almost one hundred research studies. Companies, including Google, have used the dataset for algorithm development, and today the corpus of essays is one of the largest and most influential in the field.

Why was such an effort necessary? According to the National Assessment of Educational Progress (NAEP), less than a third of high school seniors demonstrate proficiency in writing. Among Black and Hispanic students, that number drops to under 15 percent.

The problem isn’t just a lack of writing instruction. It’s also a lack of timely, meaningful feedback. Writing improvement requires practice and revision, but teachers are often stretched too thin to provide the individualized feedback students need to grow.

At the time, assisted writing feedback tools (AWFTs), including platforms like ETS Criterion and Revision Assistant, offered some promise. These tools could automate parts of the writing feedback process by identifying strengths, weaknesses, and areas for improvement, and by providing students with some level of assessment to guide revision.

However, the effectiveness of these tools was limited by three major issues: a focus on surface-level writing features, high cost, and low accuracy. To improve them, researchers and developers needed access to robust, high-quality datasets that reflected real student writing and meaningful writing elements.

This work was ahead of its time, beginning long before ChatGPT was released. There were few headlines about how AI was going to revolutionize the world, and little talk about how large language models might improve education, or much of anything else.

Kumar saw the value of datasets early, and he remains a key advisor today. The work eventually grew into a collaboration between The Learning Agency Lab, Georgia State University, and Vanderbilt University to launch a groundbreaking effort to improve how educators provide writing support via artificial intelligence (AI).

The effort received support from philanthropic leaders like Eric Schmidt, the Gates Foundation, and the Chan Zuckerberg Initiative, and the dataset known as Persuasive Essays for Rating, Selecting, and Understanding Argumentative and Discourse Elements (PERSUADE) was a direct result of that early mission.

What Is The PERSUADE Dataset?

Drawn from a massive collection of approximately 600,000 student essays provided by eight different organizations and states, PERSUADE was carefully curated to support the development of machine learning models that could evaluate and provide feedback on argumentative writing. Its focus was not on surface-level features of writing, but on deeper rhetorical features, including the use and quality of claims, evidence, and data. It became one of two central datasets to emerge from the Feedback Prize initiative, alongside the English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) corpus.

Scott Crossley, now a professor in the Department of Psychology and Human Development and the Data Science Institute at Vanderbilt University, played a crucial role in developing, planning, and preparing the dataset for competition. His expertise continues to be instrumental in the development of educational datasets today.

The final PERSUADE corpus consists of over 25,000 argumentative essays written by students in grades 6 to 12. These essays represent a diverse range of student voices from across the United States, with detailed metadata including gender, race/ethnicity, and grade level. A portion of the corpus includes data on student eligibility for federal assistance programs such as free or reduced-price school lunch, Temporary Assistance for Needy Families (TANF), and the Supplemental Nutrition Assistance Program (SNAP); during the Feedback Prize initiative, eligibility for these programs served as a broad indicator of economic disadvantage. Additionally, the dataset includes information on students’ ELL status and disability status, helping to ensure a more inclusive and representative portrait of student writing.
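
For researchers getting started, a minimal sketch of loading and slicing the corpus with pandas might look like the following. The file name and column names (essay_id, grade_level, ell_status) are illustrative assumptions, not the confirmed schema of the public release:

```python
import pandas as pd

# Load a local export of the PERSUADE corpus.
# File name and column names below are illustrative assumptions.
essays = pd.read_csv("persuade_corpus.csv")

# Basic shape check: the corpus should contain 25,000+ essays.
print(f"{len(essays)} essays, {essays['essay_id'].nunique()} unique IDs")

# Use the demographic metadata to examine representation by grade level.
print(essays.groupby("grade_level").size())

# Subset essays by ELL status for fairness-focused analyses.
ell_essays = essays[essays["ell_status"] == "Yes"]
print(f"{len(ell_essays)} essays from English language learners")
```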

How Was PERSUADE Annotated?

The essays were annotated by a team of 24 experienced writing teachers who taught in diverse school communities. These educators labeled seven critical elements commonly found in argumentative writing:

  1. Lead. An introduction that begins with a statistic, a quotation, a description, or some other device to grab the reader’s attention and point toward the thesis.
  2. Position. An opinion or conclusion on the main question.
  3. Claim. A claim that supports the position.
  4. Counterclaim. A claim that refutes another claim or gives an opposing reason to the position.
  5. Rebuttal. A claim that refutes a counterclaim.
  6. Evidence. Ideas or examples that support claims, counterclaims, rebuttals, or the position.
  7. Concluding Statement. A concluding statement that restates the position and claims.

Each essay also received a holistic score for overall quality and element-level ratings for effectiveness.
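
In machine-readable form, each annotation amounts to a labeled span of essay text paired with an effectiveness rating. Here is a minimal sketch of that structure in Python; the field names are illustrative assumptions modeled on the description above, not the confirmed schema of the released files:

```python
from dataclasses import dataclass

# The seven discourse elements labeled by the annotators.
DISCOURSE_TYPES = {
    "Lead", "Position", "Claim", "Counterclaim",
    "Rebuttal", "Evidence", "Concluding Statement",
}

@dataclass
class DiscourseAnnotation:
    """One labeled span in an essay. Field names are illustrative."""
    essay_id: str
    start: int           # character offset where the span begins
    end: int             # character offset where the span ends
    discourse_type: str  # one of DISCOURSE_TYPES
    effectiveness: str   # element-level rating, e.g. "Adequate"

    def __post_init__(self) -> None:
        if self.discourse_type not in DISCOURSE_TYPES:
            raise ValueError(f"unknown discourse type: {self.discourse_type}")

# Example: a hypothetical Position annotation near the start of an essay.
ann = DiscourseAnnotation(
    essay_id="0001", start=120, end=215,
    discourse_type="Position", effectiveness="Effective",
)
print(ann)
```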

Researchers Using PERSUADE

Today, researchers are leveraging the PERSUADE dataset to explore a variety of questions at the intersection of education and AI.

In a 2024 study, Matthew Johnson and Mo Zhang at the Educational Testing Service (ETS) Research Institute examined AI fairness, explainability, and accuracy in essay scoring using ChatGPT-4o. Johnson and Zhang’s findings suggested that ChatGPT-4o scored essays poorly and showed potential bias, noting that the chatbot might infer racial or ethnic background when rating essays. While not conclusive, this research underscores the need for ongoing examination of fairness in AI-assisted writing evaluation.

In a 2025 study, Ding et al. used PERSUADE to evaluate an interactive training system called Feat-Writing, designed to improve students’ argumentative writing skills. The system guides students through exercises that build their understanding of the components of argumentative writing. The evaluation found that Feat-Writing had a positive impact on student writing.

Marsour et al. (2025) incorporated PERSUADE along with six other educational writing datasets to investigate how AI-generated texts can be altered to evade detection. They found that many existing detectors struggle to identify modified AI-generated texts, and they developed and tested a more robust model that remained effective even against attempts to bypass it.

Beyond individual studies, PERSUADE is also being used to develop other training resources. Kaggle Grandmasters Darek Kłeczek and Nicholas Broad integrated PERSUADE into large, publicly available datasets used to train models that detect AI-generated writing in educational contexts. To support wider access and innovation, PERSUADE is available for download through the Learning Exchange, a curated repository of high-quality, open-source educational datasets designed to promote research and development in education and natural language processing.

What’s Next For PERSUADE?

Now, years later, PERSUADE continues to shape the future of writing feedback tools, serving as a resource for researchers, developers, and educators alike. However, the published studies and accompanying datasets only represent the beginning of PERSUADE’s impact.

Looking forward, there is enormous potential to build on this foundation. Future versions of the PERSUADE dataset could include annotations for dialect, language complexity, and readability, helping AI writing tools recognize and adapt to linguistic variations that reflect different cultural or socioeconomic backgrounds. 

Additionally, annotating for cultural references and framing could help systems better understand how students express themselves based on their lived experiences, rather than penalizing them for non-standard language use. Ultimately, this work would allow researchers to understand the full context behind a student’s writing and to design tools that truly support learning.

Now that the Feedback Prize project is nearly five years old, its value has become unmistakable. From its origin in a hotel lobby to its role in helping build better and fairer essay scoring tools, PERSUADE has elevated how researchers and educators think about student writing.

The work on helping students become better writers through the use of AWFT is not finished, but thanks to PERSUADE, the path forward is clearer and the future of writing tools is brighter.

Kennedy Smith

Program Associate

Articles by guest or contributing authors do not necessarily reflect the views of The Learning Agency, our clients, or our funders.
