IPSDS webinar series

April 4, 2018

Last year, IPSDS started an online Brown Bag seminar in which one of our participants presented his work project. We enjoyed the format and would like to continue organizing such meetings. Our next presenter is an external guest, Maura Bardos, a mathematical statistician at the U.S. Energy Information Administration (EIA). She will talk about her current project, which identifies and integrates alternative data sources with the EIA survey of retail gasoline stations (see the abstract and bio below).

Title: Linking a Retail Gasoline Price Survey with Commercial Data

Maura Bardos (Energy Information Administration), Amerine Woodyard (Energy Information Administration), Jeramiah Yeksavich (Energy Information Administration)


As part of ongoing modernization efforts, the U.S. Energy Information Administration (EIA) is conducting research on utilizing third-party sources to supplement publicly available data. EIA is uniquely situated because a number of its surveys collect information that is also compiled and sold by commercial vendors. These vendors can provide data in near real time that, when linked with surveys, have the potential to reduce respondent burden and enhance data products. However, much is unknown about vendors' sector visibility, data definitions, processing, and data quality. This study examines an example of integrating survey and commercial records for a weekly business survey.

The Motor Gasoline Price Survey (EIA-878) is a weekly mandatory survey of about 800 retail gasoline stations across the country. The data collected are used to create point-in-time estimates of gasoline prices at the national, regional, and selected state and city levels by grade and formulation, resulting in 276 published price estimates. Data collection, processing, and dissemination are completed within the same day. In summer 2017, EIA obtained two commercial sources of gasoline price data for research purposes. EIA purchased price data from a commercial vendor for about 110,000 stations. We also created a tool to obtain gas prices via a crowdsourced website.

We use geospatial analysis to match survey data to commercial records at the station level and present descriptive statistics on linkage rates, availability of prices, and data quality for the commercial sources. Using the matched file, we compute estimates by city, state, and region over time and analyze the congruence between the EIA-878 and commercial sources. Based on this research, we assess the extent to which commercial sources could be incorporated into the EIA-878. We conclude with a discussion of implications for future efforts to integrate survey and commercial datasets.
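
For readers unfamiliar with station-level geospatial linkage, the following is a minimal illustrative sketch and not the authors' method. It assumes each station is represented only by a latitude/longitude pair, that a survey station links to its nearest commercial record if that record lies within a distance threshold, and that the linkage rate is the share of survey stations that found a match. The function names, the 100-meter threshold, and the data layout are all hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in meters
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def link_stations(survey, commercial, max_dist_m=100.0):
    """Link each survey station to its nearest commercial record.

    survey and commercial map a station id to a (lat, lon) tuple.
    The threshold max_dist_m is an assumed value for illustration only.
    Returns a dict: survey id -> commercial id, or None if nothing is
    within the threshold.
    """
    links = {}
    for sid, (slat, slon) in survey.items():
        best_id, best_d = None, float("inf")
        for cid, (clat, clon) in commercial.items():
            d = haversine_m(slat, slon, clat, clon)
            if d < best_d:
                best_id, best_d = cid, d
        links[sid] = best_id if best_d <= max_dist_m else None
    return links


def linkage_rate(links):
    """Share of survey stations that were linked to a commercial record."""
    if not links:
        return 0.0
    return sum(1 for v in links.values() if v is not None) / len(links)
```

A real linkage would likely also compare station names, brands, and addresses, and would use a spatial index rather than the brute-force loop shown here, which does not scale well to about 110,000 commercial records.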


Maura Bardos joined the Office of Survey Development and Statistical Integration at the U.S. Energy Information Administration (EIA) in 2016 as a Mathematical Statistician. In this role, she focuses on statistical methods for petroleum and natural gas projects. The work presented today is one component of a larger redesign of the motor gasoline price survey, which includes frame research, sample design, and statistical systems development. Prior to EIA, Maura held roles at Mathematica Policy Research, Abt SRBI, and the Institute for Social Research at the University of Michigan. She is a graduate of the Michigan Program in Survey Methodology.


June 1, 2018

Title: Augmenting surveys: Willingness to collect data using smartphone sensors

Bella Struminskaya (Utrecht University, the Netherlands)


Smartphones give researchers an opportunity to collect data through built-in sensors such as GPS and accelerometers to study physical movement, and to passively collect data about browsing history and online behavior, app usage, and call and text message logs.

Supplementing surveys in these ways can improve data quality by reducing survey errors due to recall and social desirability in self-reports, and can reduce respondent burden. Furthermore, using smartphone sensors for data collection can provide richer data about human behavior, especially in situations where no self-report is possible. However, respondents have to be willing to use their smartphone sensors to collect data. If willing respondents differ from nonwilling respondents, results based on passive measurement might be biased. In this talk I will review empirical evidence from recent studies on willingness to collect data passively using smartphones, provide insight into factors that influence willingness based on empirical studies in Germany and the Netherlands, and briefly introduce ongoing projects on smartphone sensor data collection carried out in a collaboration between Utrecht University and Statistics Netherlands (CBS). Given the novel and ongoing nature of this research, feedback from seminar participants is highly encouraged. We will discuss issues related to the current practice of collecting smartphone sensor measurements (e.g., wording of the requests, privacy considerations), how using sensors and apps for research purposes might relate to the everyday use of smartphone sensors, and avenues for future research.


Bella Struminskaya is an Assistant Professor of Methods and Statistics at Utrecht University, the Netherlands. She is part of the Data Collection Innovation Network (WIN), a collaboration between Utrecht University and Statistics Netherlands. Prior to joining Utrecht University, she worked as a senior researcher at GESIS - Leibniz Institute for the Social Sciences in Mannheim, Germany, where she advised researchers on the implementation of online and mixed-mode surveys and helped set up the GESIS Panel, a probability-based general population panel in Germany. She is a board member of the German Society for Online Research, where she co-organizes the GOR conference, which focuses on online research, big data, and politics & communication. Bella has published on data quality in online panels, mobile web surveys, and panel conditioning. Her current research focuses on nonresponse and measurement errors in panel surveys and on smartphone sensor data collection.