SURV725: Item Nonresponse and Imputation

Area: 
Data Analysis
Credit(s)/ECTS: 
1/2
Core/Elective: 
Elective

Missing data are a common problem that can lead to biased results if the missingness is not taken into account at the analysis stage. Imputation is often suggested as a strategy for dealing with item nonresponse, since it allows the analyst to use standard complete-data methods after the imputation. However, several misconceptions about the aims of imputation (isn't imputation making up data?) make some users skeptical about the approach. In this course we will illustrate why thinking about the missing data is important and clarify which goals a useful imputation method should try to achieve (and which it should not).

Course objectives: 

By the end of the course, students will…

  • understand why the default way of dealing with missing data as implemented in most statistical software is often problematic.
  • realize that leaving the missingness unaccounted for is better than applying simplistic imputation methods such as mean imputation or last observation carried forward.
  • know what is meant by a missing data mechanism and understand the implications of the different mechanisms.
  • be familiar with the principal ideas and concepts of multiple imputation.
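
As a rough illustration of why simplistic methods such as mean imputation can be problematic, the following minimal sketch compares the standard deviation of the observed values with the standard deviation after mean imputation. It is not part of the course materials: the simulated data, the 30% missingness rate, and all variable names are assumptions made for illustration, and Python is used here only for convenience (the course itself uses R).

```python
import random
import statistics

random.seed(42)

# Hypothetical complete data: 1,000 draws from a normal distribution.
complete = [random.gauss(50, 10) for _ in range(1000)]

# Delete roughly 30% of the values completely at random (an assumed mechanism).
observed = [x if random.random() > 0.3 else None for x in complete]

# Complete-case analysis: keep only the observed values.
complete_cases = [x for x in observed if x is not None]

# Mean imputation: replace every missing value with the observed mean.
observed_mean = statistics.mean(complete_cases)
mean_imputed = [observed_mean if x is None else x for x in observed]

sd_complete_cases = statistics.stdev(complete_cases)
sd_mean_imputed = statistics.stdev(mean_imputed)

print(f"SD, complete cases: {sd_complete_cases:.2f}")
print(f"SD, mean imputed:   {sd_mean_imputed:.2f}")
```

Mean imputation leaves the estimated mean unchanged, but every imputed value sits exactly at the mean, so the standard deviation of the filled-in data is smaller than that of the observed values, and analyses based on it would overstate precision.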

 

Grading: 

Grading will be based on:

  • 2 online quizzes (worth 20% total)
  • 2 homework assignments (40% total)
  • Participation in the weekly online meetings, engagement in discussions during the meetings and/or submission of questions via e-mail (10% of grade)
  • A final online exam (30% of grade)

Prerequisites: 

Students should be familiar with generalized linear models and basic probability theory. The statistical software R will be used for illustrations and for (some of) the homework assignments.

Course syllabus: 

Course Dates

2017

Fall Term (September – November)

2018

Fall Term (September – November)