Andy Jones CS PhD student @ Princeton

Vertical integration in academic research

In corporate management and economics, vertical integration refers to a structure in which a company controls multiple stages of a product’s supply chain. Amazon is a quintessential example of successful vertical integration: over time, the business has come to control everything from manufacturing products, to building the website to sell them, to employing a fleet of drivers who deliver the products to customers.

I argue that vertical integration is a central – yet overlooked – aspect of academic research as well. While the incentives in academic research are quite different from those in a corporate environment, it’s useful to draw a parallel between the two in terms of their economic principles.

For the sake of this post, let’s think of the “product” of academic research as a publication. (Importantly, however, treating publications as units of production is not the most effective or healthy way to view research in general.)

In an academic setting, a key skill is to manage the entire research supply chain. There are many steps in the production pipeline: winning funding, collecting the data, writing the code to analyze the data, writing the paper, and even doing public relations, “advertising”, and speaking tours following the release of the paper. It is beneficial for an academic researcher to vertically integrate their skillset across these steps.

In a corporate environment, if a company doesn’t have the money, bandwidth, or expertise to control a certain step in the supply chain, it can outsource that step to another company or entity. For example, if Amazon doesn’t own factories for producing phone chargers, it can buy them in bulk from a company that specializes in phone charger production and resell them.

Academic researchers, on the other hand, don’t have the option to shift the burden of specific production steps to another party. Academic researchers don’t have the ability or money to hire a marketing company to advertise their publications; instead, they need to publicize their work through public speaking and Twitter. Academic researchers (typically) can’t hire professional software engineers who will implement their ideas in production-quality code; instead, they have to write and maintain their own repositories (with highly variable success). I intentionally focus on the academic sector of research because research teams within corporations often have a more decentralized approach to research, as evidenced by the strong marketing (maybe too strong?) and software engineering teams at places like OpenAI, DeepMind, and Facebook.

As an academic, failing to vertically integrate oneself across these steps in the research production pipeline – or focusing too intently on just one or two steps – can be detrimental. For example, one commonly ignored step in the pipeline (especially for academics) is public relations. It’s tempting to view academia as a market where all participants have perfect information and full visibility into everyone else’s work, and where the best ideas always come to the forefront. However, this is not the case. Good ideas can lie dormant in a paper for years or decades without due recognition if they aren’t advertised. “Advertising” in this case refers to giving talks at conferences, posting tweets about the work, or talking to colleagues at other universities about it. Of course, it’s very possible to take the advertising step too far by over-selling one’s work, but I would argue that academics err on the side of under-selling on average.

As another example, consider the software engineering aspect of academic research. The code associated with research projects tends to be less usable and less well documented than professional-level code written at tech companies. There are legitimate reasons for this: academics are already spread too thin and the incentives for writing crystal-clear research code aren’t so clear. However, having usable code associated with a research paper will make people more likely to try the code themselves and may ultimately result in more recognition and citations. Ignoring this step in the pipeline could result in a brilliant research idea being underappreciated because other researchers are unable to use the code.

While this perspective on academic research is perhaps disappointing or disheartening to some academics, I think it can be enlightening to view our work through the lens of more traditional economics. Researchers must be aware of every step in the pipeline, lest they miss one and do themselves a disservice. Also, more transparency about the steps in the research production pipeline that are often deemphasized or hidden will hopefully level the playing field and allow everyone to be a full participant in the academic economy.