SURV699J: Generalized Linear Models

Data Analysis

This course introduces statistical models and estimators beyond linear regression that are useful to social and economic scientists. Although very useful, the general linear model (linear regression) is not appropriate if the range of the dependent variable Y is restricted (e.g., binary, ordinal, count) and/or the variance of Y depends on the mean of Y. Generalized linear models extend the general linear model to address both of these shortcomings.

The course provides an overview of generalized linear models (GLMs), which accommodate non-normal response distributions by modeling a function of the mean of Y. GLMs relate the expected value E(Y) of the dependent variable to the predictor variables via a specific link function. This link function is chosen to match the data generating process of the dependent variable Y, which permits E(Y) to be non-linearly related to the predictor variables. Examples of GLMs include logistic regression, regression models for ordinal data, and regression models for count data. GLMs are generally estimated by maximum likelihood. The course therefore begins with an introduction to the principle of maximum likelihood estimation before turning to GLMs. A good understanding of the classical linear regression model is a prerequisite for the course.
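To make the link-function idea concrete, here is a minimal sketch (not course material) of a logistic regression with one predictor, fitted by maximum likelihood using Newton-Raphson steps. The logit link gives E(Y) = 1 / (1 + exp(-(b0 + b1*x))). The function name fit_logit and the small data set are made up for illustration only; the course solutions themselves use Stata and R.

```python
import math

def fit_logit(x, y, iterations=25, tol=1e-10):
    """Fit P(Y=1 | x) = 1 / (1 + exp(-(b0 + b1*x))) by maximum likelihood.

    Uses Newton-Raphson on the log-likelihood; illustrative only.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # score (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0    # entries of the information matrix X'WX
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))  # inverse logit link
            w = p * (1.0 - p)    # variance of Y depends on the mean
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (X'WX) d = score for the 2x2 case
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical toy data: binary outcome y observed at predictor values x
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logit(x, y)
print("intercept:", b0, "slope:", b1)
```

At the maximum likelihood estimate the score equations hold, so the fitted probabilities sum to the observed number of ones. The same estimation principle carries over to the other GLMs covered in the course, with a different link function and response distribution.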

The first two units are dedicated to an introduction to maximum likelihood estimation. The remaining units discuss generalized linear models (GLMs) for binary choice decisions (Logit, Probit), ordinal dependent variables, and count data (Poisson, Negative Binomial).

All units will be accompanied by quizzes to review and practice the topics covered. Any statistical software can be used to solve the quizzes. Solutions provided by the instructor will use the statistical packages Stata and R.

Course objectives: 

By the end of the course, students will…

  • Understand how to appropriately translate research questions into statistical models
  • Be able to apply statistical models appropriate for non-linear problems
  • Estimate regression parameters using the maximum likelihood principle
  • Perform hypothesis tests for regression models using the maximum likelihood principle
  • Be able to identify limitations of non-linear regression models
  • Be able to identify violations of the respective regression assumptions of the discussed GLMs



Grading will be based on:

  • 7 online quizzes (49% of grade total, 7% each)
  • Participation in online meetings and submission of questions demonstrating understanding of readings (10% of grade)
  • Final Exam (41% of grade)

Students must earn 70% or higher to pass the class.


A sound understanding of linear regression models (OLS) is required. Knowledge of linear algebra and calculus is useful.

Course Dates


Spring Term (March – May)