SURV400: Fundamentals of Survey and Data Science

Area: Research Question
Credit(s)/ECTS: 3/6
Core/Elective: Core course

Apply through UMD

Instructor: Alexander Wenz

The fields of survey methodology and data science draw on theories and practices developed in several academic disciplines – mathematics, statistics, psychology, sociology, computer science, and economics. Becoming an accomplished professional in these fields requires mastery of the research literature as well as experience designing, conducting, and analyzing surveys and working with data from other sources, such as administrative records, social media, or transactions.

This course introduces the student to a set of principles of survey design and data science that are the basis of standard practices in these fields. The course exposes the student to research literatures that use both observational and experimental methods to test key hypotheses about the nature of human behavior and other factors that affect the quality of data. It will also present important statistical concepts and techniques in sample design, execution, and estimation, as well as models of behavior describing errors in responding to survey questions. Thus, both social science and statistical concepts will be presented.

The course uses the concept of total error as a framework to discuss coverage properties of sampling frames and organic data, alternative sample designs and their impacts on standard errors of statistics, different modes of data collection and generation, the role of interviewers and respondents in surveys, impacts of nonresponse and missing data on statistics, measurement errors in data, data processing, and data/research ethics.

The course is intended as an introduction to the fields of survey methodology and data science, taught at a graduate level. Lectures and course readings assume that students understand basic statistical concepts (at the level of an undergraduate course) and have exposure to elements of social science perspectives on human behavior. For those lacking such a background, supplementary readings are recommended.

Course objectives: 

By the end of the course, students will

  • be able to apply the key terminology used by survey methodologists and data scientists.
  • be able to assess the quality of data from different sources based on a data quality framework.
  • be able to select an appropriate data source to answer different types of research questions.
  • understand the influence of coverage, sampling, and nonresponse on data quality and know how to deal with deficiencies of the data.
  • have a clear understanding of the steps involved in data preparation, data processing, data analysis, and data visualization.
  • be able to comply with ethical standards in survey research and data science.

Grading:

Grading will be based on:

  • Participation (10% of the final grade): discussion during the weekly online meetings and contributions to the forum (deadline: Sunday, 1:00 PM EDT/7:00 PM CEST) that demonstrate understanding of the required readings and video lectures.
  • Quizzes (60% of the final grade): ten online quizzes, worth 100 points each, reviewing specific aspects of the material covered. The simple average of the points across all ten quizzes determines this component.
  • Final exam (30% of the final grade): an open-book online exam, worth 100 points.
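
As an illustration of how these weights combine, the following is a minimal sketch in Python. The function name `final_grade` is hypothetical, and the sketch assumes that participation is scored on the same 0 to 100 scale as the quizzes and the exam, which the grading scheme does not state explicitly.

```python
from statistics import mean

# Weights taken from the grading scheme above.
PARTICIPATION_WEIGHT = 0.10
QUIZ_WEIGHT = 0.60
EXAM_WEIGHT = 0.30


def final_grade(participation, quiz_scores, exam_score):
    """Return the weighted final grade on a 0-100 scale.

    Assumption (not stated in the syllabus): participation is scored
    on a 0-100 scale, like each quiz and the final exam.
    """
    if len(quiz_scores) != 10:
        raise ValueError("expected exactly ten quiz scores")
    for score in [participation, exam_score, *quiz_scores]:
        if not 0 <= score <= 100:
            raise ValueError("scores must lie between 0 and 100")

    # Simple (unweighted) average across all ten quizzes.
    quiz_average = mean(quiz_scores)

    return (PARTICIPATION_WEIGHT * participation
            + QUIZ_WEIGHT * quiz_average
            + EXAM_WEIGHT * exam_score)


# Hypothetical student: 90 for participation, 80 on every quiz, 70 on the exam.
example = final_grade(90, [80] * 10, 70)
```

For this hypothetical student the components contribute 9, 48, and 21 points, for a final grade of about 78 out of 100.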

Prerequisites:

Students are expected to be familiar with basic statistical concepts, such as mean, standard deviation, variance, and distributions (at the level of an undergraduate course), and have exposure to elements of social science perspectives on human behavior.

Readings:

Groves, R.M., Fowler, F.J. Jr., Couper, M.P., Lepkowski, J.M., Singer, E., & Tourangeau, R. (2009). Survey Methodology, 2nd Edition. New York: Wiley.

Peng, R.D. & Matsui, E. (2015). The Art of Data Science: A Guide for Anyone Who Works with Data. Leanpub.

Weekly online meetings & assignments:

  • Week 1: Introduction – How to do survey research and data science (Quiz 1)
  • Week 2: Quality of Data (Quiz 2)
  • Week 3: Coverage (Quiz 3)
  • Week 4: Modes of Survey Data Collection (Quiz 4)
  • Week 5: Data Generation from Other Sources (Quiz 5)
  • Week 6: Sampling I (Quiz 6)
  • Week 7: Sampling II (Quiz 7)
  • Week 8: Questionnaires and Interviewing (Quiz 8)
  • Week 9: Nonresponse (Quiz 9)
  • Week 10: Data Preparation, Data Processing, and Database Management (Quiz 10)
  • Week 11: Data Analysis and Data Visualization 
  • Week 12: Survey and Research Ethics
  • Final Exam 

Course Dates

  • 2019: Fall Semester (September – December)
  • 2020: Fall Semester (September – December)
  • 2022: Fall Semester (September – December)