Working Papers
Exploit or Explore? An Empirical Study of Resource Allocation in Research Labs. 2023. Reject and resubmit, Management Science.
[ Abstract | Latest Version + Appendices (December 2023) | October 2022 Version | October 2022 Version Appendices ]
Balancing exploitation and exploration in resource allocation under incomplete information is a classic problem in operations management theory. Yet little research has empirically studied how, and how well, decision-makers make the exploitation-exploration tradeoff in complex real-world settings. This paper empirically studies how a group of large publicly funded research labs traded off exploiting safe projects to maximize short-term productivity against exploring high-variance projects to acquire information and improve long-term productivity. Using granular data on the allocation of almost one million input bundles to more than 300,000 research projects from 2000 to 2015, we model the resource allocation process as a multi-armed bandit and estimate a dynamic structural model to reveal how these labs balanced exploitation and exploration. We find that the labs' decision model strongly resembles a simple Upper Confidence Bound (UCB) index. Estimates of the model's free parameters suggest that the labs explored extensively. Counterfactual simulations show that exploration substantially increased the labs' productivity: had they not explored, their output quantity would have decreased by 51% and their citations by 57%. Further simulations demonstrate that the labs' decision model outperformed popular alternative allocation models, including the Gittins Index, Thompson Sampling, and Explore-Then-Commit. Additionally, processes that promoted information utilization during allocation contributed to better outcomes. Had the labs not collected and analyzed the information revealed during exploration, they would have saved 3% of funding but lowered output quantity by 7% and citations by 9%.
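For reference, a canonical UCB index, written here in its textbook UCB1 form rather than as the paper's estimated specification, scores each project i in period t as
\[ \mathrm{UCB}_{i,t} \;=\; \hat{\mu}_{i,t} + c\,\sqrt{\frac{\ln t}{n_{i,t}}}, \]
where \(\hat{\mu}_{i,t}\) is the project's estimated mean payoff, \(n_{i,t}\) is the number of times it has been funded so far, and \(c\) is an exploration weight. Allocating to the highest-index projects gives under-sampled, high-uncertainty projects an exploration bonus that shrinks as evidence accumulates.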
Navigating Software Vulnerabilities: Eighteen Years of Evidence from Medium and Large U.S. Organizations (with Raviv Murciano-Goroff and Shane Greenstein). 2024.
[ Abstract | NBER Working Paper | HBS Working Knowledge | NBER Digest ]
How prevalent are severe software vulnerabilities, how fast do software users respond to the availability of secure versions, and what determines the variation in installation behavior across organizations? Using the largest dataset ever assembled on user updates, tracking server software updates by over 150,000 medium and large U.S. organizations between 2000 and 2018, this study finds widespread use of server software with known vulnerabilities: 57% of organizations used software with severe security vulnerabilities even when secure versions were available. The study estimates several reduced-form models to examine which organization characteristics correlate with higher vulnerability prevalence and which update characteristics causally explain faster responses to the release of secure versions. The disclosure of severe vulnerability fixes in software updates does not jolt all organizations into installing them. Factors related to the cost of updating, such as whether the software is hosted on a cloud-based platform and whether the update is an incremental change or a major overhaul, play an important role, but much of the variation cannot be easily explained by observables. These findings underscore the urgent need to incorporate organizations' relative (in)attentiveness in acting on software update releases into the design of cybersecurity policies.
Work in Progress
Demand Fluctuations and Supply Coordination in Semiconductor Manufacturing (with Audrey Tiew).
[ Abstract ]
We study how supply capacity coordination can reduce the social inefficiency arising from demand uncertainty and market power in the semiconductor manufacturing industry. Market power creates a misalignment between firms' profit-maximizing capacity investments and welfare-maximizing capacity investments. To quantify the extent of this inefficiency and to explore how various forms of supply coordination can mitigate it, we estimate a static structural model of semiconductor demand and a dynamic model of supply-side investment in technology and capacity. The data we have assembled for this exercise are, to our knowledge, the most comprehensive data on the industry in academic research. We obtain: (i) detailed proprietary buyer-level product demand data, covering around 20% of world orders from 2004 to 2015, and (ii) proprietary worldwide, plant-level data on technology and capacity investment in semiconductor manufacturing plants from 1995 to 2015. In counterfactual scenarios, we compare the relative efficacy of various forms of supply coordination (e.g., a social planner, a monopoly manufacturer, coordination on technology and capacity investment but competition in the product market) in reducing this inefficiency.
The Effect of U.S.-China Decoupling on Investment in High-Tech Manufacturing (with Audrey Tiew).
[ Abstract ]
High-tech manufacturing is often characterized by rapid technology turnover, frequent and substantial fixed investment costs, and significant economies of scale. Recent national policies emphasizing domestic self-reliance in the U.S. and China highlight potential interactions between these industry characteristics and national security considerations. In this paper, we study the effect of U.S.-China decoupling on investment in contract manufacturing capacity for semiconductor chips. Our unique dataset combines a comprehensive sample of worldwide plant-level capacity investments with a representative sample of global contract manufacturing orders for semiconductors at a quarterly frequency from 2004 to 2015. We use these data to estimate: (i) a static model of manufacturing contracts, and (ii) a dynamic structural model of manufacturers' capacity investment decisions. Using counterfactuals, we explore a large global semiconductor manufacturer's potential responses to geographically specific national policies affecting investment incentives.
Publications
Examining Selection Pressures in the Publication Process Through the Lens of Sniff Tests (with Christopher Snyder). 2023. Forthcoming, Review of Economics and Statistics.
[ Abstract | Publisher’s Version ]
The increasing demand for empirical rigor has led to the growing use of auxiliary tests (balance, pre-trends, over-identification, placebo, etc.) to help assess the credibility of a paper's main results. We dub these "sniff tests" because rejection is bad news for the author and standards for passing are informal. We use these sniff tests, a sample of nearly 30,000 hand-collected from scores of economics journals, as a lens to examine selection pressures in the publication process. Under plausible nonparametric assumptions, we derive bounds on the latent proportion of significant sniff tests removed by the publication process (whether by p-hacking or relegation to the file drawer) and on the proportion whose significance was due to true misspecification rather than bad luck. For the subsample of balance tests in randomized controlled trials, we find that the publication process removed at least 30% of significant p-values. For the subsample of other tests, we find that at least 40% of significant p-values indicated true misspecification. We use textual analysis to assess whether authors over-attribute significant sniff tests to bad luck.
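As a stylized illustration of where such a bound can come from (a back-of-the-envelope sketch, not the paper's actual nonparametric derivation): in a valid randomized controlled trial, balance tests should reject at roughly the nominal rate \(\alpha\) (e.g., 5%). If the share of significant balance tests appearing in print is \(s < \alpha\), and the publication process removes significant results while keeping insignificant ones, then the removed fraction \(R\) of significant tests satisfies roughly
\[ R \;\ge\; \frac{\alpha - s}{\alpha}, \]
so a hypothetical published share of 3.5% against a latent 5% would imply removal of at least 30% of significant tests.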
Hidden Software and Veiled Value Creation: Illustrations from Server Software Usage (with Raviv Murciano-Goroff and Shane Greenstein). 2021. Research Policy 50 (9): 104333.
[ Abstract | Publisher’s Version ]
How do you measure the economic value of a commodity that transacts at a price of zero? This study examines the potential for, and extent of, omission and misattribution in standard approaches to economic accounting with regard to open source software, an unpriced commodity in the digital economy. The study is the first to follow usage and upgrading of unpriced software over a long period of time. It finds evidence that software updates mislead analyses of the sources of firm productivity, and it identifies several mechanisms that lead to mismeasurement. To illustrate these mechanisms, this study closely examines one asset that plays a critical role in digital economic activity: web server software. We analyze the largest dataset ever compiled on web server use in the United States and link it to disaggregated information on over 200,000 medium to large organizations in the United States between 2001 and 2018. In our sample, we find that the omission of economic value created by web server software is substantial, amounting to over $4.5 billion of mismeasured server software value across organizations in the United States. This mismeasurement varies by organization age, geography, industry, and size. We also find that dynamic behavior, such as improvements in server technology and the entry of new products, further exacerbates economic mismeasurement.
The Impact of the General Data Protection Regulation on Internet Interconnection (with Bradley Huffaker, kc claffy, and Shane Greenstein). 2021. Telecommunications Policy 45 (2): 102083.
[ Abstract | Publisher’s Version | VoxEU Column ]
The Internet comprises thousands of independently operated networks, interconnected through bilaterally negotiated data exchange agreements. The European Union (EU)'s General Data Protection Regulation (GDPR) imposes strict restrictions on the handling of personal data of European Economic Area (EEA) residents. A close examination of the text of the law suggests significant costs to application firms, and available empirical evidence confirms a reduction in data usage in the EEA relative to other markets. We investigate whether this decline in the derived demand for data exchange affected EEA networks' decisions to interconnect relative to those of non-EEA OECD networks. Our data consist of a large sample of interconnection agreements between networks globally in 2015–2019. All estimates point to zero effects on the number of observed agreements, the inferred agreement types, and the number of observed IP-address-level interconnection points per agreement. We also find economically small effects of the GDPR on network entry and on the observed number of network customers. We conclude that the GDPR had no visible short-run effect on these measures at the internet layer.
Do Low‐Price Guarantees Guarantee Low Prices? Evidence from Competition between Amazon and Big‐Box Stores. 2017. Journal of Industrial Economics 65 (4): 719-738.
[ Abstract | Publisher’s Version ]
It has long been understood in theory that price-match guarantees can be anticompetitive, but to date, scant empirical evidence has been available outside of a few narrow markets. This paper broadens the scope of empirical analysis by studying a wide range of products sold on a national online market. Using an algorithm that extracts data from charts, I obtain a novel source of data from online price trackers. I examine prices of goods sold on Amazon before and after two big-box stores (Target and Best Buy) announced a guarantee to match Amazon's prices. Employing both difference-in-differences and regression-discontinuity approaches, I robustly estimate a positive causal effect on Amazon's prices of six percentage points. The effect was heterogeneous, with larger price increases for initially lower-priced items. My results support anticompetitive theories that predict price increases for Amazon, a firm that did not adopt the guarantee, and are consistent with plausible mechanisms for the heterogeneous impact.