Methods: Databases
Fields: Economics

Project Lead: Charlie Manzanares (Economics)
eScience Liaisons: Andrew Whitaker, Daniel Halperin


Figure: RAM consumption during the fall incubator

Since 2005, the U.S. airline industry has experienced the most dramatic merger activity in its history, which has reduced the number of major carriers in the U.S. from eight to four. My project seeks to provide novel estimates of changes in consumer and producer welfare in the U.S. due to these mergers.

To do so, I estimate a dynamic model of route competition using the entire DB1B dataset, a 10% sample of all airline tickets in the U.S. from 1993 onward, provided by the U.S. Department of Transportation. This dataset is large, consisting of roughly 5 million observations per quarter.

Further, to estimate the parameters of the dynamic game, I use a simulation and estimation approach. This requires enlarging the DB1B dataset to include routes that carriers do not offer in the observed data but might have offered had these mergers been prevented.
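The idea of enlarging the data can be illustrated with a toy sketch. It is written in Python for brevity rather than in the project's R, and it is not the project's actual procedure: the record layout, the field names (`carrier`, `route`, `passengers`, `observed`) and the function `augment` are all hypothetical. The sketch assumes the augmentation amounts to adding a placeholder row for every carrier-route pair that does not appear in the observed data.

```python
from itertools import product


def augment(observed, carriers, routes):
    """Return one record per (carrier, route) pair.

    Pairs present in `observed` are kept and flagged observed=True.
    Pairs absent from `observed` get a placeholder row with zero passengers
    (a hypothetical convention chosen for this illustration).
    """
    seen = {(row["carrier"], row["route"]): row for row in observed}
    augmented = []
    for carrier, route in product(carriers, routes):
        row = seen.get((carrier, route))
        if row is None:
            # Route this carrier does not serve in the data but could have served.
            row = {"carrier": carrier, "route": route,
                   "passengers": 0, "observed": False}
        else:
            row = {**row, "observed": True}
        augmented.append(row)
    return augmented


# Made-up example data, not taken from DB1B.
observed = [
    {"carrier": "AA", "route": "SEA-JFK", "passengers": 120},
    {"carrier": "DL", "route": "SEA-ATL", "passengers": 95},
]
rows = augment(observed, ["AA", "DL"], ["SEA-JFK", "SEA-ATL"])
```

In this toy case two observed rows grow to four, which mirrors how filling in unobserved carrier-route combinations inflates the number of observations.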

This data augmentation step increases the number of observations to 11 million per quarter. With this dataset, running my simulation using the R programming language is computationally infeasible on my laptop. The eScience Fall 2014 incubator project consists of creating software that will allow my simulation to run in parallel on an Amazon EC2 instance, drastically speeding up the computations and allowing me to complete multiple iterations of my simulation.
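As a rough illustration of running a simulation in parallel on a single multi-core machine, the sketch below farms independent tasks out to worker processes. It is in Python rather than the project's R, and it rests on assumptions not stated above: that the work splits into independent per-quarter tasks, and that each task can be seeded separately. The functions `simulate_quarter` and `run_all` are hypothetical stand-ins, not the incubator's software.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate_quarter(task):
    """Hypothetical stand-in for one quarter's simulation.

    `task` is a (quarter, n_draws, seed) tuple; the function returns the
    quarter label and the mean of n_draws standard normal draws.
    """
    quarter, n_draws, seed = task
    rng = random.Random(seed)  # separate, reproducible stream per task
    total = sum(rng.gauss(0.0, 1.0) for _ in range(n_draws))
    return quarter, total / n_draws


def run_all(quarters, n_draws=1000, executor_cls=ProcessPoolExecutor,
            max_workers=None):
    """Run simulate_quarter for every quarter using a pool of workers.

    ProcessPoolExecutor spreads CPU-bound work across cores;
    ThreadPoolExecutor can be passed instead for quick debugging.
    """
    tasks = [(q, n_draws, seed) for seed, q in enumerate(quarters)]
    with executor_cls(max_workers=max_workers) as pool:
        # map() yields results in input order, so the dict keeps quarter order.
        return dict(pool.map(simulate_quarter, tasks))
```

When using the default process pool, the call to `run_all(["1993Q1", "1993Q2"])` should sit under an `if __name__ == "__main__":` guard in a script so that worker processes can import the module safely. Because each task carries its own seed, the parallel results match a serial run exactly.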

The tasks consist of 1) a data augmentation step (DA), 2) a value function simulation and estimation step (VFE), and 3) a counterfactual simulation step (CS).
