Learning Engineering Hub
Build your own learning engineering solutions
Build
These publicly available datasets, dataset collections, tools, libraries, frameworks, and funding opportunities may help you as you conduct research and build your own learning engineering technology solutions. Do you think that a resource is missing? Suggest it to us.
Dataset | Description |
---|---|
DataShop@CMU | A data repository and web application for learning science researchers that provides secure data storage plus analysis and visualization tools. |
DeepMind Mathematical Dataset (Analysing Mathematical Reasoning Abilities of Neural Models) | A collection of 20 mathematical evaluation datasets that have been widely used at top artificial intelligence conferences such as ACL, AAAI, and ICLR since 2010. |
E-TRIALS Datasets from ASSISTments | A collection of datasets related to grade school students’ interactions with the online math learning platform ASSISTments. |
Google Dataset Search | Google’s search engine for datasets. |
Hugging Face | A collection of datasets, models, and other resources for developing AI models and conducting research. |
ICPSR | Maintains a data archive of more than 250,000 files of research in the social and behavioral sciences. |
IPEDS | A collection of data and general information on U.S. colleges, universities, and technical and vocational institutions. |
Kaggle | Contains over 50,000 public datasets and 400,000 public notebooks for data analysis. |
LDbase | An open science resource for the educational and developmental science communities, providing a secure place to store and access data, as well as materials on data management and analysis. |
LearnSphere | A collection of tools, including data repositories, for learning research. |
NCES International Data Explorer | A platform for exploring student and adult performance on international assessments. |
OER Hub | A public digital library of open educational resources. |
Our World in Data | Free and open-source charts and datasets on the world’s largest problems. |
Papers with Code | A free and open resource with machine learning papers, code, datasets, methods, and evaluation tables. |
Roper Center | A repository of public opinion and survey data operated out of Cornell University. |
Open Game Data | An open-source collection of educational game datasets. |
Anaconda
Anaconda provides an open-source package library and package management system for Python and R
for scientific computing, including data science, machine learning, data processing, and
predictive analytics.
CMU PLUS
CMU PLUS tutor training lessons are freely available to all tutoring
organizations. They are published on the CMU PLUS site at tutors.plus.
CTAT
Carnegie Mellon University’s Cognitive Tutor Authoring Tools (CTAT) is a tool for educational researchers,
regardless of their coding expertise, to develop cognitive tutors that guide students through problems and offer
timely and relevant assistance.
DataShop
DataShop is a data repository and web application that provides secure
data storage, analysis, and visualization tools for learning science researchers. It aims to support scientific
discovery in education by offering extensive public and private datasets and tools for educational research.
Doccano
Doccano is an open-source text annotation tool that provides annotation features for text
classification, sequence labeling, and sequence-to-sequence tasks.
GitHub
GitHub allows developers to collaboratively store, manage, track, and
control changes to their code.
Google Colab
Google Colab allows developers to write and execute Python code in a
browser, which makes sharing and collaborating easier, and provides access to computing
resources such as GPUs and TPUs.
Hugging Face
In addition to hosting over 93,000 datasets, Hugging Face is
an open-source platform for machine learning collaboration
on large language models, datasets, and other applications.
LearnLab
Carnegie Mellon University’s LearnLab works to enhance the scientific understanding of effective learning in
educational contexts and develop a research infrastructure for field experimentation, data collection, and data
mining.
LearnSphere
LearnSphere is a community software infrastructure that supports
sharing, analysis, and collaboration across a wide variety of educational data. LearnSphere supports researchers
as they improve their understanding of human learning. It also helps course developers and instructors improve
teaching and learning through data-driven course redesign.
LKT: Logistic Knowledge Tracing
A tool for computing Logistic Knowledge Tracing
(LKT), a method to track learning in an educational software system.
LLMs-4-EDU Citation Group
An open-access library of education-focused LLMs and research resources developed by John Whitmer. Once signed up, members can download, review, and contribute citations for peer-reviewed and pre-print publications. We are particularly interested in new research studies with outcome / impact evaluations of LLMs in applied settings. Please add new references to the “a – uncategorized” group, and they will be organized into the appropriate category.
MoFaCTS
MoFaCTS is an educational platform that supports data-driven learning
through content modules, user management, and reporting features for admins and teachers.
ParlAI
ParlAI offers popular datasets, reference models, and integration of Amazon
Mechanical Turk to share, train, and evaluate dialogue models across tasks
such as open-domain chat, task-oriented dialogue, and visual question
answering.
Penn Center for Learning Analytics
The Penn Center for Learning Analytics offers a collection of open-source
tools and frameworks related to educational data.
PyTorch
PyTorch is an end-to-end machine learning framework that facilitates
experimentation and production through a C++ front end, distributed
training, and an ecosystem of resources.
Ryan Baker’s Educational Tools and Frameworks
This page provides a collection of
open-source educational tools and frameworks developed by Ryan Baker and colleagues.
Scikit-learn
Scikit-learn is an open-source machine learning library for Python. The
library includes tools for foundational ML practices such as model fitting,
data preprocessing, model selection, and model evaluation.
TensorFlow
TensorFlow is an end-to-end machine learning platform that provides tools
for preparing data, building and deploying models, and implementing MLOps.
Tigris
Tigris is a workflow authoring tool that is part of the community software
infrastructure being built for the LearnSphere project. The platform facilitates the creation and sharing of
custom analyses, as well as interactions with external repositories, such as DataShop, MOOCdb, DiscourseDB, and
DataStage.
Torus
Building on the work of Carnegie Mellon University’s Open Learning Initiative, Torus is an open platform
allowing users to author, deliver, improve, and research learning experiences.
The UDL Guidelines
The Universal Design for Learning (UDL) Guidelines are a tool within the UDL
framework, aimed at enhancing teaching and learning. This tool helps educators, curriculum developers, and
researchers provide accessible, engaging, and challenging learning opportunities for all learners.
Unizin’s Data Platform
The Unizin Data Platform (UDP) integrates, normalizes, and warehouses educational
data from diverse sources such as LMS, SIS, and LTI tools, enabling higher education institutions to effectively
utilize learning analytics and data for student success initiatives.
Wikidata for Education
Wikidata for Education is an initiative that aims to align open educational resources with local, national, and
international curriculum frameworks to support teachers and students in achieving their educational goals.
Wolfram
Wolfram offers free public resources in advanced computation,
including WolframAlpha, a platform that
introduces knowledge-based computing through algorithms
and AI in math, science, and technology.
The field of learning engineering is continuously evolving, with new opportunities for research, development, and innovation emerging regularly. Below is a collection of various funding opportunities, including grants, scholarships, and sponsorships.
Arnold Ventures
Arnold Ventures (AV) is a philanthropic entity focused on enhancing the well-being of Americans through
policy solutions rooted in evidence, designed to broaden opportunities and reduce inequalities. The
organization offers various requests for proposals (RFPs) that aim to strengthen the foundation of knowledge
on policies, programs, and interventions through funding rigorous research efforts.
Learning Landscapes Challenge
The Learning Landscapes Challenge aims to help innovators combine social, digital, and physical
frameworks to connect current educational methods with future learning strategies. This multi-stage
competition will offer participants financial support, professional guidance, and entry into a network
of partners sharing similar goals, assisting them in expanding their initiatives.
Community Funding Accelerator
The Community Funding Accelerator helps grantees identify grant opportunities, provides technical assistance,
and builds coalitions toward K-12 education and workforce innovations.
Tools Competition
The Learning Engineering Tools Competition is a multimillion-dollar competition for edtech innovation that
leverages digital technology, big data, and learning science to meet the urgent needs of learners worldwide.
WorkforceGPS ETA Grants
WorkforceGPS is sponsored by the Employment and Training Administration of the US Department of Labor and
aims to facilitate innovation in employment prospects, practices, and future skills development. It lists
federal grant opportunities and resources for workforce professionals, educators, and business leaders.
Expanding AI Innovation Through Capacity Building And Partnerships (ExpandAI)
ExpandAI is an NSF program that aims to advance research and development in AI and AI-powered innovation.
Y Combinator
Y Combinator offers two 3-month startup programs, office hours, and $500,000 of funding to companies that are accepted.
DOHE
The Go-Together Acceleration Programme supports edtech startups that prioritize making education fairer, more accessible, and more resourceful, helping them grow their innovations at a global scale.
LearnLaunch
LearnLaunch is an edtech startup accelerator that supports companies aiming to improve teaching and learning. The program offers milestone-based funding, partner support, and mentorship to help companies develop and scale.
Institute For The Future Of Education – TecPrize Challenge
The TecPrize Challenge is a global competition that funds up to 8 companies per year to accelerate innovative and transformative educational tools.
Lighthouse Labs
Lighthouse Labs invests in early-stage startups and provides them with equity-free funding, tailored mentorship, scaling education, and a cohort community.
Fast Grants For STEM Talent Identification & Development
Fast Grants for STEM Talent Identification & Development, run by the Digital Harbor Foundation, offers early seed funding to researchers with bold ideas around talent identification and development.
AI Grant
The AI Grant accelerator program offers funding opportunities for startups in the AI space.
Student Upward Mobility Initiative
The Student Upward Mobility Initiative funds research focused on developing, identifying, and validating educational skills and competencies that enhance economic mobility for PK-12 students.
Robin Hood AI Poverty Challenge
Robin Hood’s AI Poverty Challenge seeks innovative solutions that leverage artificial intelligence to enhance upward mobility and reduce poverty. The challenge invites participation from nonprofits, for-profits, and government entities, aiming to inspire broader use of AI’s potential to combat poverty.