In scientific circles, the Erdős number is well known as a measure of academic innovation and collaboration. The number measures how closely a researcher is connected to the prolific Hungarian mathematician Paul Erdős through chains of co-authored papers.
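For readers who want to see the idea concretely, connection through co-authored papers can be treated as a shortest-path problem on a co-authorship graph. The sketch below is only an illustration: the function name `collaboration_distance` and the toy graph are hypothetical and do not reflect real publication records.

```python
from collections import deque


def collaboration_distance(coauthors, source, target):
    """Return the fewest co-authorship links between source and target.

    `coauthors` maps each author to the set of people they have published
    with. Returns None when no chain of co-authors connects the two.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    # Breadth-first search visits authors in order of increasing distance,
    # so the first time we reach the target we have the shortest chain.
    while queue:
        person, dist = queue.popleft()
        for other in coauthors.get(person, ()):
            if other == target:
                return dist + 1
            if other not in seen:
                seen.add(other)
                queue.append((other, dist + 1))
    return None


# Hypothetical toy graph (not real data): A and B wrote with Erdős,
# C wrote only with A, and D has no co-authors in this graph.
toy_graph = {
    "Erdős": {"A", "B"},
    "A": {"Erdős", "C"},
    "B": {"Erdős"},
    "C": {"A"},
    "D": set(),
}

distance_to_c = collaboration_distance(toy_graph, "Erdős", "C")
```

In this toy graph, C is two links from Erdős, while D is not connected at all, so the function returns `None` for D.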
Erdős published or co-authored some 1,500 papers and collaborated with around 500 co-authors during his lifetime. He’s celebrated as being among the most prolific mathematicians and recognized for his unrelenting dedication to research, described as showing up unannounced to work with peers and declaring, “My brain is open.”
In a recent milestone, however, Ryan Baker, Director of the Center for Learning Analytics at the University of Pennsylvania’s Graduate School of Education, has surpassed Erdős’s number of co-authors, reaching 517 co-authors and more than 30,000 citations. It is a remarkable feat, especially for a mid-career researcher in education.
Roadmap to a Milestone Achievement
While surpassing Erdős was never the goal, Baker—a data enthusiast to the core—has taken the time to trace his publication and research engagement trends throughout his career. Several factors have helped him attain such a high number of papers—from teaching his first Massive Open Online Course (MOOC), which caused a spike in citations, to shifting away from publishing shorter conference papers that are often less likely to be read.
In the spirit of Erdős’s full immersion in his research, Baker too has fully committed himself to the field of learning analytics and the examination of how data can be leveraged to enhance online learning, even lending his own personal time-tracking data to other scholars. Given his interest in how students engage with computer-based lessons, much of Baker’s research has examined the impact of MOOCs on learning. Notably, data from Big Data and Education, the MOOC he taught for over 10 years at Teachers College, Columbia University and the University of Pennsylvania, has been featured in studies like that of Fiona Hollands and Devayani Tirthali, which examined how education institutions leverage data collected through MOOCs.
Just as Erdős found it difficult to anchor himself to one place, traveling often to connect with other mathematicians, Baker too has found it inconceivable to glue himself to just one area of focus. Baker has written on everything from boredom to algorithmic bias. “One of the core things about me is that I don’t have one agenda,” Baker says. “I find myself unable to, and maybe that’s why I have so many co-authors and large numbers of papers.”
In the research questions he asks, tapping into the unknown is a passion of Baker’s. He shares that he’s also aimed to steer away from zeroing in on areas that might be perceived as easier to write on—which makes his achievement all the more astounding. “I think oftentimes what’s hard is where it’s most exciting,” he notes. “I feel like there are so many exciting things to look at. It’s all in the margins. The best thing to do is avoid the stuff that everybody is studying.”
In his work on algorithmic bias, for example, he’s been particularly interested in understanding understudied groups of learners and contexts. This work includes a 2022 paper co-authored with Aaron Hawn, which identified communities and environments that have received less attention in research, including Indigenous learners, countries outside of the US, and learning environments like MOOCs.
Much like Erdős, who is said to have worked in more than 25 countries, Baker hasn’t limited himself to national boundaries either—working as a research fellow in the United Kingdom, teaching in Brazil, and cultivating research partnerships outside of the US. Most recently, he’s been especially enthusiastic about collaborative work with colleagues in the Philippines and countries in Central Asia, where he is studying how large language models can be harnessed to research teachers’ attitudes as expressed in local languages.
The Advancement of a Catalytic Research Ecosystem
Among the greatest drivers of Baker’s significant achievement, however, is the rise of an infrastructure better able to facilitate greater co-production. Baker notes that early on in his career, there was a much smaller community of researchers who, out of necessity, both sought to build learning systems—in particular intelligent tutoring systems—and also leverage them to support learning research and data generation.
The emergence of large-scale learning platforms such as ASSISTments, which Baker has leveraged in his own studies, has been key in enabling education research. “Once you have a big platform with lots of learners, plus open research tools, the stage is set for new communities of specialized scholars and practitioners to emerge,” explains Baker.
But Baker notes that the key shift has been the pivot away from researchers having to first build their system to support data collection. “The big step was the emergence of research infrastructure tools that could be developed once and then re-used by lots of researchers,” Baker explains. In particular, Baker adds that the development of interaction log data sets and A/B testing were significant factors that have enabled the advancement of the fields of learning analytics and learning engineering.
By offering researchers public access to detailed records of user interactions with a digital system, interaction log data facilitates educational data mining for a broader community of external scholars. Baker himself actually served as the first technical director for the first large-scale public interaction data repository in education, the Pittsburgh Science of Learning Center DataShop.
As Baker explains, further developments in A/B testing—a method for comparing two versions of an educational tool or approach to assess impact on student outcomes—have also enabled researchers to conduct rapid iterative tests and quickly identify interventions that work.
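As a rough illustration of how such a comparison can be scored, the sketch below applies a standard two-proportion z-test to the share of students who pass under each version. It is one common way to analyze an A/B test, not a description of Baker’s own tooling. The function name and the counts are hypothetical.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Compare success rates of versions A and B.

    Returns (difference in rates, z statistic, two-sided p-value).
    Assumes both groups are non-empty and the pooled success rate is
    strictly between 0 and 1, so the standard error is non-zero.
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the null hypothesis that both versions perform alike.
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical counts: 120 of 400 students pass with version A,
# 150 of 400 pass with version B.
difference, z_stat, p_val = two_proportion_z_test(120, 400, 150, 400)
```

A small p-value suggests the gap between the two versions is unlikely to be chance alone. In practice, researchers would also weigh effect size and how students were assigned to each version.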
Another key part of the ecosystem that Baker has seen drive the growth of learning engineering is the emergence of new funding models aimed at seeding innovation and supporting learning analytics in service of societal impact. Increasingly, funders are stepping up to support early data efforts and education R&D infrastructure, including A/B testing tools and other libraries for educational data mining.
Baker attributes much of his research success to the evolution of this ecosystem. “All of these things made it possible for the first-generation of specialized scholars in EDM and learning engineering to get funding for our work and do our work at all,” says Baker, referencing other researchers like Beck, Barnes, Heffernan, and Kizilcec.
Pioneering Work and Future Directions
Beyond his co-authorship achievement, Baker has long received wide recognition for his impressive work in educational data mining and learning analytics.
More than a dozen of the papers Baker has co-authored have received awards. Over the course of his career, he has pioneered models that automatically detect student engagement in a multitude of learning environments; reached more than 100,000 students through his MOOCs; co-led an eleven-year longitudinal study tracking the outcomes of middle school math students; and helped more than 150 researchers conduct field observations with his invention of the Baker Rodrigo Ocumpaugh Monitoring Protocol (BROMP), among other accomplishments.
In addition to his research pursuits, Baker has dedicated himself to helping drive further growth in the learning engineering and analytics ecosystem, founding the first master’s program in learning analytics and serving as founding president of the International Educational Data Mining Society.
Now that he’s surpassed the famed Erdős in number of co-authors, what’s next for Baker?
When asked what he’s focusing on, Baker can’t help but conjure up a multitude of things he’s excited about—from the potential of Generative AI and his work on a ChatGPT teaching assistant being rolled out within Penn to continued explorations into algorithmic bias. “I feel like the world drives you into wanting to have one answer to that question, and that’s just not who I am,” he says. “I have 17. If I was working on one thing, I don’t think I could get myself up at four in the morning to start writing.”
At the heart of Baker’s drive to co-create and learn with other researchers is a genuine passion for the learning process and for the impact it can drive. This perhaps also reflects continuing growth in the innovation community and increasing calls for investments in education R&D targeted at large-scale impact goals.
“My goal has never actually been to have the largest number of co-authors or to top Erdős,” Baker says. “My goal has been to do research that makes a difference, that genuinely expands the scope of knowledge and genuinely leads to better outcomes for kids.”
And with that goal in mind, Baker’s drive to follow a variety of research pathways persists. In short, just as Erdős once might have said, Baker’s brain is open.