Research

The Challenge

We know some teaching approaches lead to better student learning outcomes than others, but measuring what makes instruction more effective is challenging. Traditional approaches rely on small samples or overly simplistic metrics.

My Approach

I develop statistical classification and simulation methods for analyzing conversational data in educational settings. My work captures patterns of effective instruction from large-scale data: types of questions teachers ask, how they scaffold student thinking, and when students have breakthrough moments.

The Innovation

By combining measurement with validated simulation, I can test hypotheses about which instructional strategies support learning at scales that would be difficult to achieve with human participants alone. This enables evidence-based recommendations faster and at lower cost than traditional randomized trials.

Research Areas

Conversational Learning Analytics

Building statistical classifiers and temporal models to capture patterns of effective instruction from large-scale conversational data—question types, scaffolding moves, breakthrough moments, and engagement dynamics.

Validated Simulation for Education Research

Using simulation-based calibration to test hypotheses about teaching strategies at scales impractical with human participants alone, while distinguishing simulation fidelity from interaction effectiveness.

AI-Enhanced Learning Environments

Developing AI-powered tools that combine expert human mentorship with adaptive simulation for technical skills education and personalized learning at scale.

Featured Research Projects

Simulated Teaching and Learning at Scale
In Progress | AI in Education

Developing frameworks to evaluate AI-generated educational dialogues along two critical dimensions: simulation fidelity and interaction effectiveness.

AI, education, dialogue systems

AI-Enhanced Technical Interview Preparation
Active | AI Applications

Creating scalable, personalized technical interview practice for data science students by combining expert human interviews with AI simulation.

AI, technical interviews, data science

Research Infrastructure

MathMentorDB

Submitted to LREC 2026

MathMentorDB contains 200,332 mathematics tutoring conversations from the Discord Mathematics community, with validated question-type classifications and breakthrough-moment annotations. It was built from 5.5 million raw messages using conversation disentanglement methods.

5.5M messages | 200K+ conversations

Cross-Domain Validation

Methods validated across multiple educational contexts to ensure generalizability:

  • NCTE Transcripts: 1,660 elementary math classroom lessons
  • TalkMoves Dataset: 567 K-12 math transcripts
  • Discord Tutoring: Online peer-to-peer mathematics help

Statistical Methods

Text Classification

Using LLMs as feature extractors within statistical frameworks to classify question types, scaffolding moves, and breakthrough moments with validated inter-rater reliability.
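
As a rough illustration of what checking inter-rater reliability can look like, the sketch below computes Cohen's kappa between two sets of labels for the same items. Cohen's kappa is an assumption here: the text above does not name the agreement statistic used, and the question-type labels (`probing`, `factual`, `procedural`) and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical question-type labels from a human coder and a classifier.
human = ["probing", "factual", "probing", "procedural", "factual", "probing"]
model = ["probing", "factual", "factual", "procedural", "factual", "probing"]
kappa = cohens_kappa(human, model)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one question type dominates the data.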

Temporal Modeling

Bayesian hierarchical Hawkes processes capture when and how rapidly students and tutors interact, revealing engagement patterns invisible to content analysis alone.
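
To make the idea of self-exciting timing concrete, here is a minimal sketch of the conditional intensity of a univariate Hawkes process, lambda(t) = mu + sum over past events of alpha * beta * exp(-beta * (t - t_i)). The exponential kernel, the parameter values, and the timestamps are illustrative assumptions; the sketch does not attempt the Bayesian hierarchical fitting described above.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: baseline mu plus decaying boosts from past events."""
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t  # only events strictly before t contribute
    )
    return mu + excitation


# Hypothetical message timestamps (in minutes) within one conversation.
timestamps = [0.0, 0.5, 0.7, 6.0]
params = dict(mu=0.2, alpha=0.5, beta=1.0)

burst = hawkes_intensity(0.8, timestamps, **params)  # right after three messages
quiet = hawkes_intensity(5.0, timestamps, **params)  # after a long pause
print(f"burst = {burst:.3f}, quiet = {quiet:.3f}")
```

Each message temporarily raises the expected rate of further messages, so rapid exchanges show up as high intensity and lulls decay back toward the baseline `mu`.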

Validated Simulation

Simulation-based calibration distinguishes fidelity (realistic behavior) from effectiveness (genuine learning), treating LLMs as components in statistical models, not black boxes.
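
The rank check at the core of simulation-based calibration can be sketched on a toy problem. The example below assumes a conjugate normal model (prior theta ~ N(0, 1), observations y ~ N(theta, 1)) so the exact posterior is available in closed form; it is a generic illustration with hypothetical names and settings, not the LLM-based pipeline described above.

```python
import random
import statistics


def sbc_ranks(n_reps=1000, n_obs=10, n_draws=19, seed=0):
    """Rank of each prior draw among posterior draws from its simulated data."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(n_reps):
        theta_true = rng.gauss(0.0, 1.0)  # draw a parameter from the prior
        ys = [rng.gauss(theta_true, 1.0) for _ in range(n_obs)]  # simulate data
        # Exact posterior for this conjugate model: N(sum(y)/(n+1), 1/(n+1)).
        post_mean = sum(ys) / (n_obs + 1)
        post_sd = (1.0 / (n_obs + 1)) ** 0.5
        draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
        # Rank statistic: number of posterior draws below the true value.
        ranks.append(sum(d < theta_true for d in draws))
    return ranks


ranks = sbc_ranks()
mean_rank = statistics.mean(ranks)
print(f"mean rank = {mean_rank:.2f} (uniform ranks on 0..19 have mean 9.5)")
```

When the inference is calibrated, the ranks are uniform over 0..`n_draws`; a skewed or U-shaped rank histogram signals a mismatch between the simulator and the posterior.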

Research Grants

Publications