Instructor: Diego Fregolent Mendes des Oliveira
Python has recently seen a surge in popularity, not only as a general-purpose programming language but also as a tool for data analysis. In this course, we will introduce the basics of programming in Python for the purposes of data analysis. We will explore the Longitudinal Employer-Household Dynamics (LEHD) datasets, specifically the LEHD Origin-Destination Employment Statistics (LODES) datasets, using Python to read in the data, explore it, compute statistical summaries, and create visualizations. By the end of the course, students should be comfortable using Python for data analysis and be able to apply their general knowledge of the language to other applications.
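To give a flavor of the read, explore, and summarize workflow described above, here is a minimal sketch that uses only the Python standard library. It assumes a small CSV shaped like a LODES workplace-area file, with a `w_geocode` column for the workplace census block and a `C000` column for the total job count. The rows are invented for illustration, and `summarize_jobs` is a hypothetical helper name, not part of the course materials.

```python
import csv
import io
import statistics

# Invented rows in the shape of a LODES workplace-area file (assumption):
# w_geocode = workplace census block, C000 = total number of jobs.
SAMPLE_CSV = """w_geocode,C000
060010001001000,12
060010001001001,30
060010001001002,3
060010001001003,15
"""


def summarize_jobs(csv_text):
    """Read CSV text and return basic summaries of the C000 column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    jobs = [int(row["C000"]) for row in rows]  # CSV values arrive as strings
    return {
        "blocks": len(jobs),
        "total_jobs": sum(jobs),
        "mean_jobs": statistics.mean(jobs),
        "median_jobs": statistics.median(jobs),
        "max_jobs": max(jobs),
    }


summary = summarize_jobs(SAMPLE_CSV)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With a real file, the `io.StringIO` wrapper would be replaced by an opened file handle. In practice a data analysis library would likely take the place of the hand-written summaries.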
In addition, as more and more data becomes available, relational database management systems (RDBMS) have become increasingly popular because they allow people to organize large amounts of data relatively easily. In many cases, knowledge of SQL is crucial to accessing this data. In this course, we will introduce the basics of programming in SQL using PostgreSQL. We will explore the same LODES datasets, using SQL to examine the data and compute statistical summaries. By the end of the course, students should be comfortable constructing basic database queries and linking multiple tables together using SQL.
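As a rough illustration of linking tables and computing summaries with SQL, the sketch below joins a table of job counts to a lookup table and totals jobs by county. It runs against an in-memory SQLite database through Python's built-in `sqlite3` module purely so that it is self-contained; the course itself uses PostgreSQL, where the same `SELECT` statement should work. The table names `wac` and `xwalk`, the `geocode` and `county_name` columns, and all values are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption, for portability).
conn = sqlite3.connect(":memory:")

# Hypothetical tables: job counts per block, and a block-to-county lookup.
conn.executescript("""
    CREATE TABLE wac (w_geocode TEXT PRIMARY KEY, c000 INTEGER);
    CREATE TABLE xwalk (geocode TEXT PRIMARY KEY, county_name TEXT);
    INSERT INTO wac VALUES ('A1', 12), ('A2', 30), ('B1', 3), ('B2', 15);
    INSERT INTO xwalk VALUES
        ('A1', 'County A'), ('A2', 'County A'),
        ('B1', 'County B'), ('B2', 'County B');
""")

# Link the two tables on the block identifier, then summarize by county.
query = """
    SELECT x.county_name, COUNT(*) AS blocks, SUM(w.c000) AS total_jobs
    FROM wac AS w
    JOIN xwalk AS x ON x.geocode = w.w_geocode
    GROUP BY x.county_name
    ORDER BY x.county_name;
"""
county_totals = conn.execute(query).fetchall()
conn.close()

for county_name, blocks, total_jobs in county_totals:
    print(county_name, blocks, total_jobs)
```

The join condition is what links the two tables: each row of job counts is matched to its county before the `GROUP BY` aggregates the totals.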
By the end of the course, students will
Grading will be based on:
No prerequisites
Readings:
LEHD Origin-Destination Employment Statistics (LODES) OnTheMap: Data Overview (LODES Version 7)
LEHD Origin-Destination Employment Statistics (LODES) Dataset Structure
Weekly online meetings & assignments: