Instructors: Hanna Brenzel, Piet Daas, Marco Puts
The course gives an overview of several current topics in official statistics: Big Data, georeferencing and microsimulation. Although the topics are largely independent of one another, they need to be considered together as current advanced topics in official statistics. The course consists of three sections: one on Big Data, composed of two lectures, and one each on using georeferenced data (i.e. data that is attached to a unique location) and on applying microsimulation techniques, with one lecture each. A considerable number of examples will be discussed. The Big Data section covers the statistical use of Big Data and the essential background knowledge needed for it, and takes a general look at the benefits and downsides of Big Data in official statistics. It also serves as the starting point for Big Data methodology development. In addition, the relation between Big Data analysis and the various Big Data IT environments is discussed.
The second part of the course focuses on the importance and use of georeferenced data. The goal of Eurostat, the European national statistical offices and the Federal Statistical Office is to provide the statistical information that is necessary for decision-making processes in a democratic society. In times of open data, georeferencing creates a new, expanded basis for evidence-based decisions with regional relevance. This expansion of the regional reference lies in the spatial depth and flexibility of the regional units analyzed. Information systems based on integrated statistical and geographic information can support the political process and forecasts, while at the same time bringing the different dimensions of sustainable development (ecological, economic, social) into a coherent picture. Using geospatial information in the production of statistics has numerous advantages and creates potential for analysis across statistics, which goes far beyond the small-scale cartographic presentation of single statistics. In addition, integrating geospatial information extends the analysis potential of official data.
A third focus of the course is microsimulation. Planning and further developing political decisions increasingly requires special simulations and calculations that go beyond the standard tables published by official statistics, in order to evaluate and estimate the consequences of political measures. The course provides a basic overview of the idea of microsimulation and its origin and development over time, and highlights the strong relationship between the development of microsimulation and access to individual-level data. Moreover, the different types of microsimulation will be presented, and insights into ongoing projects in Germany will be given.
By the end of the course, students will…
Grading will be based on:
Due dates for quizzes are indicated in the syllabus. Extensions will be granted sparingly and are at the instructor's discretion.
Daas, P.J.H., Puts, M.J.H. (2014) Big Data as a Source of Statistical Information. The Survey Statistician 69, 22-31.
Ginsberg et al. (2009) Detecting influenza epidemics using search engine query data. Nature 457, 1012-1014.
Lazer et al. (2014) The Parable of Google Flu: Traps in Big Data Analysis. Science 343(6176), 1203-1205.
Orcutt (2007) A new type of socio-economic system. International Journal of Microsimulation (Reprinted) 1(1), 3-9.
Weekly online meetings & assignments: