Probably Approximately Correct Causal Discovery

David Page, Department of Biostatistics and Bioinformatics, Duke University School of Medicine

Friday, February 21, 2020 - 3:30pm

This talk begins by reviewing some recent applications of machine learning to electronic health records. It then presents and empirically evaluates some novel machine learning algorithms for associating drugs with adverse or fortuitous events they may cause. The general approach is to add to machine learning algorithms a patient-specific baseline risk, as in self-controlled case series. Nevertheless, while these algorithms perform well empirically, unlike much work in causal discovery or causal inference, they do not provide conditions under which provably correct causal discoveries are guaranteed. Much of the rest of machine learning outside of causal discovery is satisfied with probably approximately correct (PAC) models or discoveries, and with algorithms that yield high empirical accuracy rather than perfect predictions. Therefore, analogous to PAC-learning, this talk proposes a theoretical model of "PAC causal discovery (PACC-discovery)." Intuitively, PACC-discovery requires an inference or learning algorithm that can, with high probability, accurately distinguish between two competing causal models, or randomized Turing machines, that differ only by a causal association in question. As with PAC-learning, the inference or learning algorithm receives time and data polynomial in the complexity of the models and the desired accuracy.
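
For readers unfamiliar with the PAC framework, the standard PAC-learning guarantee can be written as below, followed by one possible reading of the analogous PACC-discovery requirement. The second statement is only an illustrative sketch based on the wording of the abstract, not the speaker's formal definition; the symbols M_0, M_1, A, S, b, and the exact form of the polynomial bound are assumptions introduced here.

```latex
% Standard PAC learning: for every target concept c in class C,
% every distribution D, and all eps, delta in (0,1), the learner,
% given m i.i.d. labeled examples, outputs a hypothesis h with
\[
  \Pr\bigl[\operatorname{err}_D(h) \le \varepsilon\bigr] \ge 1 - \delta,
  \qquad
  m \le \operatorname{poly}\!\left(\tfrac{1}{\varepsilon},
        \tfrac{1}{\delta}, \operatorname{size}(c)\right),
\]
% with running time polynomial in the same quantities.

% Hypothetical analogue (an assumption, not the talk's definition):
% M_0 and M_1 are two causal models differing only in the causal
% association in question, and S is a sample of m observations
% generated by M_b for an unknown b in {0,1}. The algorithm A should satisfy
\[
  \Pr_{S \sim M_b}\bigl[A(S) = b\bigr] \ge 1 - \delta,
  \qquad
  m \le \operatorname{poly}\!\left(\tfrac{1}{\delta},
        \operatorname{size}(M_0), \operatorname{size}(M_1)\right).
\]
```

In this reading, the "desired accuracy" mentioned in the abstract would enter the polynomial bound in the same way that 1/ε and 1/δ do in PAC-learning.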

Seminars generally take place in 116 Old Chemistry Building on Fridays from 3:30 - 4:30 pm. For additional information contact: karen.whitesell@duke.edu or phone 919-684-8029. Sorry, but we do not have reprints available. Please feel free to contact the authors by email for follow-up information, articles, etc. Reception following seminar in 203B Old Chemistry.

Old Chemistry 116
