Collaborative Research Team Projects – Project 28
Managing Data Complexities in Electronic Health Records: Statistical Strategies for Enhanced Learning
Worldwide initiatives are making electronic health records (EHRs) available for research, adding a level of clinical detail that healthcare administrative data have lacked until now. However, researchers using EHR data face significant challenges: missingness, informative observation schemes, confounding, measurement error, misclassification, and high dimensionality. This project will expand the capabilities of EHR research and improve the evidence base on which healthcare decisions are made.
Region: National
Date: 2025-2028
Why Do We Need New Methodologies to Manage EHR Data?
Researchers are increasingly gaining access to data contained in EHRs. Using EHRs to inform healthcare decisions requires contending with the realities of data that were not collected for research purposes: missingness, informative observation schemes, confounding, measurement error, misclassification, and high dimensionality. Moreover, these issues often arise simultaneously and in their most difficult forms: for example, missingness not at random is often the most plausible missingness mechanism, and differential misclassification is likely to be present. Furthermore, while the sequential randomization assumption needed to make causal inferences is often more plausible in EHR data than in administrative claims data, typical approaches to reducing the dimension of the confounding set risk violating that assumption. Methodological innovations are thus necessary.
Problems Addressed and Anticipated Outcomes
The problems outlined in this project have been developed collaboratively and address issues encountered in EHR-based research led by our clinical collaborators:
- Informative presence bias
- Handling visiting not at random and missingness not at random
- Causal inference with time-varying longitudinal exposures
- Causal inference with high-dimensional confounders
The project team will work toward the realization of three outcomes:
- A suite of methodological approaches to causal inference from EHR data that is tailored to the data’s complexities and that will advance our ability to use EHR data
- A team of collaborators working at the cutting edge of innovation in their own disciplines who will apply our tools and methods to questions they are currently addressing
- A cohort of new (student) investigators equipped to handle the complexities involved in EHR data and capable of contributing to methodological development through collaborative work with colleagues from the health sciences
People Behind the Project
Project Team
Eleanor Pullenayegum | University of Toronto, The Hospital for Sick Children
Mireille Schnitzer | Université de Montréal
Grace Y. Yi | University of Western Ontario
Collaborators
Genevieve Lefebvre | Université du Québec à Montréal
Janie Coulombe | Université de Montréal, Hôpital Sainte-Justine
David Benkeser | Emory University
Cristina Longo | Université de Montréal, Hôpital Sainte-Justine
Sonia Grandi | University of Toronto, The Hospital for Sick Children
Brian Feldman | University of Toronto, The Hospital for Sick Children