The course provides an introduction to record linkage: it addresses methods for combining data on given entities (people, households, firms, etc.) that are stored in different data sources. By showing the strengths of these methods and providing numerous practical examples, ranging from linked survey and administrative data to Big Data applications, the course demonstrates the various benefits of record linkage. Participants will also learn about potential pitfalls that record linkage projects may face.
The course schedule follows a prototypical record linkage process:
Numerous practical examples will give participants an opportunity to develop and discuss their own ideas for promising record linkage projects. By the end of the course, participants will be able to assess the feasibility of, plan, and manage record linkage projects, as well as to perform each step of an actual linkage process.
By the end of the course, students will…
Grading will be based on:
Assignment due dates are indicated in the syllabus. Extensions will be granted sparingly and only by prior arrangement with the instructors.
Students should have knowledge of basic statistical concepts and advanced knowledge of R or Stata. A basic understanding of regular expressions is useful but not strictly required.