Working Papers
Exploit or Explore? An Empirical Study of Resource Allocation in Research Labs. 2023. Submitted.
[ Abstract | Latest Version + Appendices (December 2023) | October 2022 Version | October 2022 Version Appendices ]
Balancing exploitation and exploration in resource allocation under incomplete information is a classic problem in operations management theory. Yet little research has empirically studied how and how well decision-makers make the exploitation-exploration tradeoff in a complex real-world situation. This paper empirically studies how a group of large publicly funded research labs traded off the exploitation of safe projects to maximize short-term productivity versus the exploration of high-variance projects to acquire information and improve long-term productivity. Using granular data on the allocation of almost one million input bundles to more than 300,000 research projects from 2000 to 2015, we model the resource allocation process as a multi-armed bandit and estimate a dynamic structural model to reveal how these labs balanced exploitation and exploration. We find the labs' decision model strongly resembles a simple Upper Confidence Bound (UCB) index. Estimates of the model’s free parameters suggest that the labs explored extensively. Counterfactual simulations show that exploration substantially increased the labs’ productivity: had they not explored, their output quantity would have decreased by 51%, and their citations would have decreased by 57%. Further simulations demonstrate that the labs' decision model outperformed popular alternative allocation models, including the Gittins Index, Thompson Sampling, and Explore-Then-Commit. Additionally, processes that promoted information utilization during allocation contributed to better outcomes. Had the labs not collected and performed data analytics on the information revealed during exploration, they would have saved 3% of funding but lowered output quantity by 7% and citations by 9%.
Examining Selection Pressures in the Publication Process Through the Lens of Sniff Tests (with Christopher Snyder). 2023. Accepted, Review of Economics and Statistics.
[ Abstract ]
The increasing demand for empirical rigor has led to the growing use of auxiliary tests (balance, pre-trends, over-identification, placebo, etc.) to help assess the credibility of a paper's main results. We dub these "sniff tests" because rejection is bad news for the author and standards for passing are informal. We use these sniff tests (a sample of nearly 30,000, hand-collected from scores of economics journals) as a lens to examine selection pressures in the publication process. We derive bounds under plausible nonparametric assumptions on the latent proportion of significant sniff tests removed by the publication process (whether by p-hacking or relegation to the file drawer) and the proportion whose significance was due to true misspecification, not bad luck. For the subsample of balance tests in randomized controlled trials, we find that the publication process removed at least 30% of significant p-values. For the subsample of other tests, we find that at least 40% of significant p-values indicated true misspecification. We use textual analysis to assess whether authors over-attribute significant sniff tests to bad luck.
Upgraded Software and Embedded Improvements: A Puzzle of User Heterogeneity (with Raviv Murciano-Goroff and Shane Greenstein). 2023. Major Revision, Management Science.
[ Abstract ]
The rise in cyberattacks over the past two decades spurred interest in policies improving cybersecurity. One focus of that research is how to create incentives for software vendors to release updates fixing vulnerabilities in their software. An important consideration that has received far less attention in this literature is whether software users install available software updates promptly, and which factors increase or decrease their responsiveness. In this paper, we empirically investigate the propensity of firms to install software updates on the servers running their websites. We compiled a dataset tracking the server software used by over 150,000 medium and large firms in the United States to host their websites between 2000 and 2018. Treating the discovery of security vulnerabilities in the server software as quasi-natural experiments, we examine if and when firms update their server software after the vendors of that software disclose vulnerabilities. We uncover widespread usage of software with severe security vulnerabilities, with nearly 76% of the firms analyzed forgoing installing software updates that fixed severe security vulnerabilities found in their software for at least six months after the release of such updates. Using hazard model analysis that accounts for firms having different organizational routines for updating, we document that usage of cloud-based platforms for hosting websites can decrease the time to installing updates, that technical complexity on sites slows updating, and that the disclosure of severe vulnerability fixes in software updates does not jolt firms into installing them. Finally, we discuss how firms' relative inattentiveness to software update releases should be incorporated into the design of cybersecurity policies.
Work in Progress
Demand Fluctuations and Supply Coordination in Semiconductor Manufacturing (with Audrey Tiew).
[ Abstract ]
We study how supply capacity coordination can reduce social inefficiency from demand uncertainty and market power in the context of the semiconductor manufacturing industry. Market power generates misalignment between firm profit-maximizing capacity investments and welfare-maximizing capacity investments. To quantify the extent of this inefficiency and explore how various forms of supply coordination can mitigate it, we estimate a static structural model of semiconductor demand and a dynamic model of supply-side investment in technology and capacity. The data we have assembled to perform this exercise are, to our knowledge, the most comprehensive data on the industry in academic research. We obtain (i) detailed proprietary buyer-level product demand data, covering around 20% of world orders, from 2004 to 2015, and (ii) proprietary world-wide, plant-level data on technology and capacity investment in semiconductor manufacturing plants from 1995 to 2015. In counterfactual scenarios, we compare the relative efficacy of various forms of supply coordination (e.g., social planner, monopoly manufacturer, coordination on technology and capacity investment but competition in product market) in reducing inefficiency.
Publications
Hidden Software and Veiled Value Creation: Illustrations from Server Software Usage (with Raviv Murciano-Goroff and Shane Greenstein). 2021. Research Policy 50 (9): 104333.
[ Abstract | Publisher’s Version ]
How do you measure the value of a commodity that transacts at a price of zero from an economic standpoint? This study examines the potential for and extent of omission and misattribution in standard approaches to economic accounting with regard to open source software, an unpriced commodity in the digital economy. The study is the first to follow usage and upgrading of unpriced software over a long period of time. It finds evidence that software updates mislead analyses of sources of firm productivity and identifies several mechanisms that give rise to mismeasurement. To illustrate these mechanisms, this study closely examines one asset that plays a critical role in digital economic activity, web server software. We analyze the largest dataset ever compiled on web server use in the United States and link it to disaggregated information on over 200,000 medium to large organizations in the United States between 2001 and 2018. In our sample, we find that the omission of economic value created by web server software is substantial and that this omission indicates there is over $4.5 billion of mismeasurement of server software across organizations in the United States. This mismeasurement varies by organization age, geography, industry and size. We also find that dynamic behavior, such as improvements of server technology and entry of new products, further exacerbates economic mismeasurement.
The Impact of the General Data Protection Regulation on Internet Interconnection (with Bradley Huffaker, kc claffy, and Shane Greenstein). 2021. Telecommunications Policy 45 (2): 102083.
[ Abstract | Publisher’s Version ]
The Internet comprises thousands of independently operated networks, interconnected using bilaterally negotiated data exchange agreements. The European Union (EU)'s General Data Protection Regulation (GDPR) imposes strict restrictions on handling of personal data of European Economic Area (EEA) residents. A close examination of the text of the law suggests significant cost to application firms. Available empirical evidence confirms reduction in data usage in the EEA relative to other markets. We investigate whether this decline in derived demand for data exchange impacts EEA networks' decisions to interconnect relative to those of non-EEA OECD networks. Our data consist of a large sample of interconnection agreements between networks globally in 2015–2019. All estimates indicate zero effects on the number of observed agreements, the inferred agreement types, and the number of observed IP-address-level interconnection points per agreement. We also find economically small effects of the GDPR on the entry and the observed number of customers of networks. We conclude there are no visible short-run effects of the GDPR on these measures at the internet layer.
Do Low‐Price Guarantees Guarantee Low Prices? Evidence from Competition between Amazon and Big‐Box Stores. 2017. Journal of Industrial Economics 65 (4): 719-738.
[ Abstract | Publisher’s Version ]
It has long been understood in theory that price-match guarantees can be anticompetitive, but to date, scant empirical evidence is available outside of some narrow markets. This paper broadens the scope of empirical analysis, studying a wide range of products sold on a national online market. Using an algorithm that extracts data from charts, I obtain a novel source of data from online price trackers. I examine prices of goods sold on Amazon before and after two big-box stores (Target and Best Buy) announced a guarantee to match Amazon's prices. Employing both difference-in-differences and regression-discontinuity approaches, I robustly estimate a positive causal effect of six percentage points. The effect was heterogeneous, with larger price increases for initially lower-priced items. My results support anticompetitive theories that predict price increases for Amazon, a firm that did not adopt the guarantee, and are consistent with plausible mechanisms for the heterogeneous impact.