SURV673: Introduction to Python and SQL

Area: Data Curation/Storage
Credit(s)/ECTS: 1/2
Core/Elective: Elective

Python has grown rapidly in popularity, not only as a general-purpose programming language but also as a tool for data analysis. In this course, we will introduce the basics of programming in Python for the purposes of data analysis. We will explore the Longitudinal Employer-Household Dynamics (LEHD) datasets, specifically the LEHD Origin-Destination Employment Statistics (LODES) datasets, using Python to read in the datasets, explore them, compute statistical summaries, and create visualizations. By the end of the course, students should be comfortable using Python for data analysis and be able to apply their general knowledge of the Python language to other applications.

In addition, as more and more data becomes available, relational database management systems (RDBMS) have become increasingly popular because they make it relatively easy to organize large amounts of data. In many cases, knowledge of SQL is crucial for accessing this data. In this course, we will introduce the basics of programming in SQL using PostgreSQL. We will explore the Longitudinal Employer-Household Dynamics (LEHD) datasets, specifically the LEHD Origin-Destination Employment Statistics (LODES) datasets, using SQL to explore the datasets and compute statistical summaries. By the end of the course, students should be comfortable constructing basic database queries and linking multiple tables together using SQL.
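As a rough illustration of the kind of query and table linking described above, the following is a minimal sketch and not actual course material. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a database server, and the table names, column names, and values (block_jobs, block_county, total_jobs) are invented for the example rather than taken from the LODES files.

```python
import sqlite3

# In-memory SQLite database, used here only as a self-contained
# stand-in for the PostgreSQL server used in the course.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables: job counts per census block, and a lookup
# table that maps each block to a county. All values are made up.
cur.execute("CREATE TABLE block_jobs (block_id TEXT, total_jobs INTEGER)")
cur.execute("CREATE TABLE block_county (block_id TEXT, county TEXT)")
cur.executemany(
    "INSERT INTO block_jobs VALUES (?, ?)",
    [("b1", 120), ("b2", 30), ("b3", 50)],
)
cur.executemany(
    "INSERT INTO block_county VALUES (?, ?)",
    [("b1", "County A"), ("b2", "County A"), ("b3", "County B")],
)

# Join the two tables on block_id, then summarize jobs by county.
query = """
    SELECT c.county,
           COUNT(*)          AS n_blocks,
           SUM(j.total_jobs) AS total_jobs,
           AVG(j.total_jobs) AS mean_jobs
    FROM block_jobs AS j
    JOIN block_county AS c
      ON j.block_id = c.block_id
    GROUP BY c.county
    ORDER BY c.county
"""
rows = cur.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

The SELECT, JOIN, and GROUP BY clauses used here are standard SQL, so the query text should carry over to PostgreSQL largely unchanged; only the connection code would differ.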

Course objectives: 

By the end of the course, students will

  • Understand the basic structure of Python and how object-oriented programming works
  • Be able to write basic Python code, including functions and loops
  • Know how to use Pandas and matplotlib packages in Python to analyze data and create visualizations
  • Be comfortable reading error messages and Python documentation to diagnose and debug code
  • Understand how relational databases work
  • Be able to construct a query to answer questions about the data
  • Understand how joins work and how to use them

Grading:

Grading will be based on:

  • 4 online quizzes (5% each)
  • Participation in discussion during the weekly online meetings and submission of questions via e-mail demonstrating understanding of required readings and video lectures (20% of grade)
  • 4 homework assignments (15% each)

Prerequisites:

No prerequisites

Course Dates

2018

Summer Term (June – August)

Fall Term (September – November)