This course and the textbook give students the tools needed to design and select single- and multi-stage survey samples in the real world. Topics include calculating a sample size for a specified level of precision or within a fixed survey budget, identifying and creating strata, allocating the sample to strata given a set of constraints or requirements for detectable differences between group estimates, estimating variance components, and determining what sample sizes to use at each stage of a multi-stage sample.
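As a small illustration of the first topic, the sketch below computes the sample size needed for a simple random sample (without replacement) to reach a target coefficient of variation for an estimated mean, using the standard relation CV²(ȳ) = (1/n − 1/N)·S²/ȳ². It is written in Python only to keep it self-contained; the course itself works in R, and the function name `srs_sample_size` and its arguments are hypothetical, not part of PracTools or any other package.

```python
import math


def srs_sample_size(cv_target, cv_pop, pop_size=math.inf):
    """Sample size for a simple random sample without replacement.

    cv_target : desired coefficient of variation of the estimated mean
    cv_pop    : unit-level population CV, i.e. S / ybar (assumed known)
    pop_size  : population size N; infinity drops the finite
                population correction
    """
    if cv_target <= 0 or cv_pop <= 0:
        raise ValueError("cv_target and cv_pop must be positive")
    if pop_size <= 0:
        raise ValueError("pop_size must be positive")

    unit_relvar = cv_pop ** 2  # S^2 / ybar^2
    # Solve CV0^2 = (1/n - 1/N) * unit_relvar for n.
    n = unit_relvar / (cv_target ** 2 + unit_relvar / pop_size)
    # Round up so the achieved precision is at least the target.
    return math.ceil(n)


if __name__ == "__main__":
    # Illustrative inputs only: target CV of 5%, population CV of 2.
    print(srs_sample_size(0.05, 2.0))            # very large population
    print(srs_sample_size(0.05, 2.0, 10_000))    # N = 10,000
```

The finite population correction matters most when the required sample is a sizable fraction of N; for very large populations the two results converge.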
We will use specialized software for the calculations mentioned. This course emphasizes R, but some examples in SAS and Stata are also discussed. Sample size calculations can be done using the R PracTools package written by the instructors or with Microsoft Excel; SAS procedures and Microsoft Excel are used for the mathematical programming (Unit 4). Survey weights can be computed with the R survey package for many designs and estimators, a topic covered in Part II of the Practical Tools series.
R can be downloaded for free from http://cran.r-project.org/. Students may also find RStudio (https://www.rstudio.com/) a helpful interface for running program code. R packages for this class include, for example, PracTools (developed for the textbook), survey, and sampling. Three videos on the R survey package and five videos on PracTools are posted at http://jointprogram.umd.edu/all/our-faculty#cbp=https://jointprogram.umd.edu/content/richard-valliant. For those new to R, there are 48 MarinStatsLectures available at https://www.youtube.com/playlist?list=PLqzoL9-eJTNBDdKgJgJzaQcY6OXmsXAHU.
There will be small-scale homework problems each week so that students gain practice using all methods covered in the course. The emphasis will be on using the methods to solve practical problems; we review theory as needed for a clear understanding of the underlying assumptions. All students are encouraged to discuss their own survey design challenges and solutions during our weekly online meetings.
By the end of the course, students will understand:
Grading will be based on
Sampling theory (e.g., SURV440) and Applied sampling (e.g., SURV626).
Some experience with the R statistical computing software is helpful.