# SURV725: Item Nonresponse and Imputation

Area:
Data Analysis, Data Curation/Storage
Credit(s)/ECTS:
1/2
Core/Elective:
Elective

Instructor: Jörg Drechsler

Missing data are a common problem that can lead to biased results if the missingness is not taken into account at the analysis stage. Imputation is often suggested as a strategy for dealing with item nonresponse, allowing the analyst to use standard complete-data methods after the imputation. However, several misconceptions about the aims of imputation (isn't imputation making up data?) make some users skeptical about the approach. In this course we will illustrate why thinking about the missing data is important and clarify which goals a useful imputation method should try to achieve (and which it should not).

Course objectives:

By the end of the course, students will…

• understand why the default way of dealing with missing data, as implemented in most statistical software, is often problematic.
• realize that it is better not to account for the missingness at all than to apply simplistic imputation methods such as mean imputation or last observation carried forward.
• know what is meant by a missing data mechanism and understand the implications of the different mechanisms.
• be familiar with the principal ideas and concepts of multiple imputation.
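
To make the first two objectives concrete, here is a minimal sketch that is not taken from the course materials. It is written in Python for self-containment, whereas the course itself uses R. All variable names and the simulated data are hypothetical. It assumes the outcome `y` goes missing with a probability that depends only on a fully observed covariate `x` (a missing-at-random mechanism), and compares the complete-case mean and the spread after mean imputation with the values from the full data.

```python
import random
import statistics

random.seed(725)  # arbitrary seed, used only for reproducibility

# Hypothetical data: outcome y depends linearly on a fully observed covariate x.
n = 10_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * xi + random.gauss(0, 1) for xi in x]

# Missingness depends on x only (never on y itself):
# y is missing with probability 0.8 when x > 0 and 0.2 otherwise.
observed = [random.random() > (0.8 if xi > 0 else 0.2) for xi in x]
y_obs = [yi for yi, is_obs in zip(y, observed) if is_obs]

# Complete-case analysis: silently drop units with missing y.
cc_mean = statistics.fmean(y_obs)

# Mean imputation: replace every missing y by the observed mean.
y_imp = [yi if is_obs else cc_mean for yi, is_obs in zip(y, observed)]

full_mean = statistics.fmean(y)
full_sd = statistics.stdev(y)
imp_mean = statistics.fmean(y_imp)
imp_sd = statistics.stdev(y_imp)

print(f"full data:       mean={full_mean:.2f}, sd={full_sd:.2f}")
print(f"complete cases:  mean={cc_mean:.2f}")
print(f"mean imputation: mean={imp_mean:.2f}, sd={imp_sd:.2f}")
```

In this setup the complete-case mean is pulled toward the values of units with small `x`, and mean imputation leaves that bias unchanged while also shrinking the standard deviation, because every imputed value sits at a single point.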

Grading:

2 online quizzes (worth 20% total)
2 homework assignments (40% total)*
Participation in the weekly online meetings, engagement in discussions during the meetings and/or submission of questions via e-mail (10% of grade)
A final online exam (30% of grade)

The due dates for the assignments are indicated in the syllabus. There is a grace period for late assignments (but not for quizzes); late assignments will be penalized according to the following rules:

1 day late: 10% off

2 days late: 25% off

3 days late: 50% off

4+ days late: no credit

Prerequisites:

Students should be familiar with generalized linear models and basic probability theory. The statistical software R will be used for illustrations and for (some of) the homework assignments. Thus, basic knowledge of R is required to complete the assignments.

Readings:

Carpenter, J. and Kenward, M. (2012). Multiple Imputation and its Application. New York: John Wiley & Sons.

Groves, R.M., Fowler, F.J., Couper, M.P., Lepkowski, J.M., Singer, E., and Tourangeau, R. (2004). Survey Methodology. Wiley, Chapter 6.

Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data (2nd ed.), New York: John Wiley & Sons, Sections 3.1, 3.2, and 3.4.

Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data (2nd ed.), New York: John Wiley & Sons, Chapter 4.

Brick, J.M. and Kalton, G. (1996). Handling missing data in survey research. Statistical Methods in Medical Research, 5, 215-238. Sections 1 and 3.1.

Carpenter, J. and Kenward, M. (2012). Multiple Imputation and its Application. New York: John Wiley & Sons, Chapter 2.1 to Chapter 2.4.

Rubin, D.B. (1986). Basic ideas of multiple imputation for nonresponse. Survey Methodology, 12, 37-47.

Weekly online meetings & assignments:

• Week 1: Introduction & Missing Data Mechanisms (Quiz 1)
• Week 2: Default Strategies of (Not) Dealing with Missing Data and Their Implications (Assignment 1)
• Week 3: Common Misconceptions Regarding Imputation & Basic Imputation Methods (Assignment 2)
• Week 4: More Advanced Imputation Methods & Multiple Imputation (Quiz 2)
• Final exam

Recommendations:

If you want to dive even deeper into these topics, we recommend signing up for the follow-up course SURV726 Multiple Imputation - Why and How.

2019

2022