## STA113: Probability and Statistics in Engineering

### Optional Term Project

#### Due: 2pm Tuesday, April 29, 2003

Students who would prefer to do a project instead of the final exam may do so. Projects:

1. Must include a statistical analysis of some data set (I can provide some or you can provide your own);
2. Must involve regression analysis or ANOVA (both topics covered in the last 1/3 of the course and in M&S chapters 11-14);
3. Must be between 5 and 10 pages long. Computers should be used, but the project should be a paper and not just computer output: include only the relevant plots or tables, and describe in your own words what light they shed on the scientific problem at hand;
4. Are due any time before the scheduled final exam (2pm Tuesday, April 29).

A typical project will begin with a description of a scientific question and a description of (and full citation for) some data set taken in the hope of answering that question. It will then describe the statistical methods used to help illuminate the evidence offered by the data, and offer some critical analysis of the statistical model used. Graphical methods such as scattergrams and residual plots are especially useful for that critical analysis. Did you have to transform one or more of the variables? Why? Did you have to include a quadratic term? How did you handle your variable selection problem? Are you satisfied that the assumptions of linearity, equality of variance, and approximate normality are satisfied? Why? Finally, it will present the conclusions that your analysis helps you to draw in the context of the original problem. You may find the material of M&S chapter 14 to be helpful.

One source of data sets is the book *A Handbook of Small Data Sets* by Hand et al. While the book's 510 data sets are described only in the book itself (you can borrow my copy in my office, and photocopy whatever data sets you like), the data sets (just numbers, no stories) are on-line; you can get to them by following the Data link from the Home or Syllabus pages, then taking the Hand et al. link from there. I have other books with data, too, but you'd have to type in the numbers yourself. You might also find something that interests you at one of the on-line data archives; CMU's StatLib and its Data & Story archive are good places to start.

Projects must demonstrate mastery of a range of statistical ideas; routine binomial analysis of survey data would not be appropriate. Group projects are welcome but would have to be substantially longer and deeper than individual projects, and must show each participant's specific contribution in detail.

Ask by e-mail (wolpert@stat.duke.edu or feng@stat.duke.edu) or in person if you have additional questions. You can find us before or after class, during our office hours, or at other times when we're not teaching or away. We're also happy to look over outlines or drafts and give you some feedback and suggestions UNTIL THE LAST WEEK OF CLASS. Sorry, but we will have little if any time during reading and exam weeks, so please start your projects early if you would like some feedback or help on them.