Portfolio of Work

As part of the Completion Exercise for M.S. students, you may present and defend a Portfolio of Work that demonstrates mastery of statistical methods, application and computation.

THE PORTFOLIO PRESENTATIONS ARE SCHEDULED DURING THE LAST FRIDAY OF MARCH ANNUALLY FROM 2:00PM to 4:00PM. THE PRESENTATIONS WILL BE FOLLOWED BY A RECEPTION. ALL MSS GRADUATE STUDENTS (FIRST AND SECOND YEAR) ARE INVITED TO THE RECEPTION. 

Portfolio Contents

  • Poster: Each student will create a poster that includes material from two different projects (which may or may not be related) and present it to a committee of three faculty members from the Department;
  • A portfolio title, submitted to the MSD (stat-msd@duke.edu) prior to your presentation;
  • A written description of one of the projects on your poster, including a discussion of how the experience relates to your field and a summary of what was learned (submit to the MSD at stat-msd@duke.edu), along with copies of any non-proprietary documents or presentations you created during the internship period;
  • Any material you created as a research or teaching assistant;
  • Curriculum vitae (bring a current copy of your CV to your presentation and give it to your committee).

All students choosing a Portfolio of Work should follow the steps outlined in the Portfolio Presentation Process document.


Students will be evaluated by the faculty committee on the following: 

  • Achievement in core areas of statistical modeling, applied statistics and statistical computing;
  • Achievement in demonstrating the ability to address and solve real-world problems with relevant statistical and computational methods;
  • Achievement in communicating in oral and written form with a professional audience.

Note that a student completing the MSS program has to satisfy all three of the above criteria at the Satisfactory or Excellent level. Otherwise, the student will receive written feedback on those aspects marked Unsatisfactory, including comments on recommended remedial paths.

Selected posters from the Spring 2019 Portfolio Presentation

Download Example Poster 1 (pdf - 857.01 KB)
Download Example Poster 2 (pdf - 508.15 KB)

 

  • Using Gradient Boosting Machines to Build an Unconstrained Pure Premium Model
  • Text Classification for Conduct Surveillance and Price Prediction with Gradient Boosting Machines
  • Machine Learning in Pharmacodynamic Modeling of Anti-HIV Microbicide
  • Bayesian Hierarchical Approaches to Topic Modeling and Text Classification
  • Mixed Models to Investigate Sex Difference in Effects of Environmental Interaction on Cognitive Resilience
  • Cost Reduction Analysis with Pharmaceutical Insurance Claim Data and Prediction of Annual Influenza Vaccination Status
  • Text Classification of Active Directory Data with Long Short-term Memory Networks
  • Detecting Medical Insurance Fraud with Ensemble Clustering
  • Hierarchical Dirichlet Processes for Topic Modeling
  • Forecasting Models in Business Field - Applications in Real Estate and Ecommerce Short Text Classification and Financial Machine Learning
  • Models in Adult Income Prediction and Futures Hedging Strategy
  • Hyperparameter Tuning and Model Selection for Classification Problem
  • Applications of Machine Learning Methods for Classification
  • Highly Multiclass Text Classification in a Business Setting and Airbnb Listing Price Prediction
  • Application of Time-Varying Multivariate Models on Energy Consumption and Economic Data
  • Applied Signal Processing in Medical Device Development
  • Drivers of Course Rating and Models to Predict Ecommerce Sales
  • Multilabel Text Classification and Image Steganalysis
  • Multilevel Models Analysis and Optimization on Product Financial Data
  • Co-occurrence Analysis on MIMIC Dataset
  • Clustering-Based Movie Recommendation System
  • Traffic Index Prediction and Word Embedding
  • Auto-Encoding Graph-Valued Data with Applications to Brain Connectomes and Recommender Systems
  • Applied Forecasting Models in Government Revenue Data
  • Identifying Significant Variables through Random Forest and Ridge Regression
  • Lorenz Interpolation: A Method for Estimating Income Statistics from Tabular Income Data
  • Identifying Musical Similarities Across Geographical Regions
  • Integrating Record Linkage and Propensity Score Matching
  • Spatio-Temporal Analysis of Gun Violence Victims and its Relation with Unemployment Rate in the USA
  • Nonlinear Regression and Network Inference for Neural Spike Count Data
  • Bayesian Item Response Modeling for Assessing State Interventions
  • Interpretable, Fair and Accurate Machine Learning for Criminal Recidivism
  • Developing a Clinical Decision Support Tool for Talaromycosis: A Case Study in Model Selection with Missing Data
  • Density Estimation with Mixture of Spherelets
  • Modified Leave-One-Out Cross-Validation for Linear Model Selection
  • Hierarchical Mixed Model for Influenza Outbreak Detection
  • Bayesian Hierarchical Model Evaluating Heart Surgical Program
  • Email Classification with Machine Learning
  • Hierarchical Modeling for Ranking Pediatric Heart Surgery Mortality
  • A Machine Learning Case Study from an Insurance Data Set
  • A Note of Hierarchical Incremental Gradient Descent on Riemannian Manifold
  • Web Attack Detection using Deep Learning
  • Generating Cartoon Characters with Style Generative Adversarial Network
  • A Statistical Model to Assess Hospitals Net Income and Rankings
  • Study of Hierarchical Model Applications on Amphetamines
  • Multivariate Linear Regression with Sparsity Estimators
  • Quantification of Cross-Shopping in E-commerce
  • Bayesian Diagnosis Model on Fever in Moshi, Tanzania
  • Analysis and Implementation of K-Means++ with Parallel Initialization
  • Exploring Bayesian Time-Series Models with Financial Data
  • Effect of Democratic Campaign Spending on 2018 House Midterms
  • A Two-stage Labeling Framework for Effective Text Classification
  • Extensions of Predictive Models
  • Bayesian Applications in Time Series 
  • Applied Machine Learning: Classification and Regression Examples
  • Comparing the Performance of DID and LDV in Different Scenarios
  • An R-based Prediction Tool for Optimizing Forecast
  • Applications of Sampling and Clustering Methods
  • Phase Transitions in Linear Models and DID Causal Inference Analysis
  • Community Detection Thresholds in Heterogeneous Graphs
  • Using Biclustering Methods to Classify High Dimensional Data
  • The Application of TVAR Method on Financial Data
  • Approaches to Data Visualization and Prediction: Healthcare to Art
  • Application of Statistical Methods on Financial and Medical Data
  • Machine Learning Models in Health Care
  • Time Series Model in Inventory Optimization Management
  • Unsupervised Exploratory Analysis Tool for Biclustering
  • The Yelp Restaurant Recommendation System
  • Prediction of Default Risks with Statistical Models
  • Machine Learning Application in Video Game Outcome Prediction
  • Statistical Modeling and Insights in Financial Industry
  • Trends in Balloon Catheter Dilation of Paranasal Sinuses
  • Inferring Drug Innovation with Adverse Events 
  • Machine Learning Methods for Spatial and Financial Applications
  • Applied Bayesian Methods for Text Mining
  • Dynamic Factor Analysis in Internet Search Volume and Stock Volatility 
  • Comparing Model-based Ranking Methods to Evaluate Physicians and Hospitals
  • Prediction of Medication Non-adherence with Clinical Notes
  • Evaluating Performance of Hospitals and Physicians using a Binomial Generalized Linear Mixed Model 
  • Text Analysis and Other Exploration
  • Deep Learning for the Automatic Grading of Diabetic Retinopathy 
  • Modeling Economic and Political Dynamics in the Middle East
  • Python Implementation of Bayesian Hierarchical Clustering
  • Implementation and Applications of Bayesian Hierarchical Clustering
  • Multi-Scale Topological Data Analysis to Identify Brain Fiber Connectivity for Biological Systems Applications
  • Bayesian Approach on Correcting Model Performance given Biased Estimates of Feature Values
  • Predicting Patient Admissions in the Medicare Shared Savings Program
  • Comparison of Machine Learning Methods in the Estimation of Housing Prices
  • Evaluating the Performance of a Generalized Recommendation Engine for the Financial Services Industry
  • Predictive Analytics in Healthcare and Medical Data Exploration
  • Establishing a Realistic Prior Model for Complex Geometrical Objects
  • Graph-Coupled HMMs and Deep Neural Network for Modeling Infection and Medical Diagnosis
  • Empirical Study of Topic Modeling in Movie Recommendation
  • Statistical Modeling and Traffic Violation Analysis
  • News' Predictive Power on St. Louis Fed Financial Stress Index
  • Application of Neural Networks with Joint Embedding for Medical Document Classification
  • Analysis and Implementation of Classification Algorithms (K-means++, CONCOR)