Rigorous introduction to health data science using current applications in biomedical research, epidemiology, and health policy. Use modern statistical software to conduct reproducible data exploration, visualization, and analysis. Interpret and translate results for interdisciplinary researchers. Critically evaluate data-based claims, decisions, and policies. Includes exploratory data analysis, visualization, basics of probability and inference, predictive modeling and classification. This course focuses on the R computing language. No statistical or computing background is necessary.
Introduction to the mathematics and algorithms that are central to a variety of data science applications. Basic mathematical concepts underlying popular data science algorithms will be introduced, and students will write code implementing these algorithms. We will discuss the impact of these algorithms on society and their ethical implications. Algorithms examined include Google's PageRank, principal component analysis for visualizing high-dimensional data, hidden Markov models for speech recognition, and classifiers for detecting spam emails.
Introduction to basic principles of analyzing relational data. Consider deterministic and probabilistic specifications of networks and graphs, studying structural blockmodels, the Erdős-Rényi model, the exponential random graph model, the stochastic blockmodel, and generalizations to latent space models and to more complex relational data. Development of these models and practical understanding of how to fit them. There is no book; lectures will be supplemented with discussions of relevant papers. Prerequisite: Statistical Science 360.
Intro to data science and statistical thinking. Learn to explore, visualize, and analyze data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, and do so in a reproducible and shareable manner. Gain experience in data wrangling and munging, exploratory data analysis, predictive modeling, data visualization, and effective communication of results. Work on problems and case studies inspired by and based on real-world questions and data. The course will focus on the R statistical computing language.
Duke is known for the depth of its expertise in Bayesian statistics, a field that has only risen in popularity with the advent of greater computational and data resources. Augment your skills in Bayesian statistics in our popular course, with a focus on foundational theory and computational methods.
(No longer offered but listed for historical reasons.)
Reading and interpretation of statistical analysis from life and health sciences. Topics include basic concepts and tools of probability, estimation, inference, decision analysis, and modeling. Emphasizes the role of biostatistics in modern society. Taught in Beaufort at Duke Marine Lab. See department website for placement information. Not open to students who have taken Statistical Science 100 or above.
Reading and interpretation of statistical analysis from life and health sciences. Topics include basic concepts and tools of probability, estimation, inference, decision analysis, and modeling. Emphasizes the role of biostatistics in modern society. See department website for placement information. Not open to students who have taken Statistical Science 100 or above.
Introduction to statistics as a science of understanding and analyzing data. Themes include data collection, exploratory analysis, inference, and modeling. Focus on principles underlying quantitative research in social sciences, humanities, and public policy. Research projects teach the process of scientific discovery and synthesis and critical evaluation of research and statistical arguments. Readings give perspective on why in 1950, S.