SURV400: Fundamentals of Survey and Data Science

Area: 
Research Question
Credit(s)/ECTS: 
3/6
Core/Elective: 
Core course

The fields of survey methodology and data science draw on theories and practices developed in several academic disciplines – mathematics, statistics, psychology, sociology, computer science, and economics. Becoming an accomplished professional in these fields requires mastery of the research literatures as well as experience designing, conducting, and analyzing surveys and working with data from other sources, such as administrative records, social media, or transactions.
This course introduces the student to a set of principles of survey design and data science that are the basis of standard practices in these fields. The course exposes the student to research literatures that use both observational and experimental methods to test key hypotheses about the nature of human behavior and other factors that affect the quality of data. It also presents important statistical concepts and techniques in sample design, execution, and estimation, as well as models of behavior describing errors in responding to survey questions. The course thus covers both social science and statistical concepts.
The course uses the concept of total error as a framework to discuss coverage properties of sampling frames and organic data, alternative sample designs and their impacts on standard errors of statistics, different modes of data collection and generation, the role of interviewers and respondents in surveys, impacts of nonresponse and missing data on statistics, measurement errors in data, data processing, and data/research ethics.
The course is intended as an introduction to the fields of survey methodology and data science, taught at a graduate level. Lectures and course readings assume that students understand basic statistical concepts (at the level of an undergraduate course) and have exposure to elements of social science perspectives on human behavior. For those lacking such a background, supplementary readings are recommended.

Course objectives: 

By the end of the course, students will

  • be able to apply the key terminology used by survey methodologists and data scientists.
  • be able to assess the quality of data from different sources based on a data quality framework.
  • be able to select an appropriate data source to answer different types of research questions.
  • understand the influence of coverage, sampling, and nonresponse on data quality and know how to deal with deficiencies of the data.
  • have a clear understanding of the steps involved in data preparation, data processing, data analysis, and data visualization.
  • be able to comply with ethical standards in survey research and data science.

Grading: 

Grading will be based on participation (10%), which covers discussion during the online meetings and questions submitted via e-mail that demonstrate understanding of the readings and lectures; weekly online exercises (60%); and a final online exam (30%).
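As an illustration only, the weights stated above can be combined into a single course score as a weighted sum. The sketch below assumes that each component is scored on a 0-100 scale. The function name `final_score` and the example scores are hypothetical, and the instructors may aggregate or round grades differently.

```python
# Weights taken from the grading breakdown stated above.
WEIGHTS = {"participation": 0.10, "exercises": 0.60, "exam": 0.30}


def final_score(scores):
    """Return the weighted course score.

    Assumption: each component score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    # Weighted sum: each component contributes its score times its weight.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student, used only to show the arithmetic.
example = {"participation": 90, "exercises": 80, "exam": 70}
print(final_score(example))
```

Because the weekly exercises carry 60% of the weight, a 10-point change in the exercise score moves the overall score by 6 points, twice the effect of the same change on the final exam.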

Prerequisites: 

Students are expected to be familiar with basic statistical concepts, such as mean, standard deviation, variance, and distributions (at the level of an undergraduate course), and have exposure to elements of social science perspectives on human behavior.

Course syllabus: 

Course Dates

2016

Spring Term (March – May)

Summer Term (June – August)

2017

Spring Term (March – May)

2018

Fall Term (September – November)

2019

Spring Term (March – May)