Gabriel Nicholas, Michael Weinberg

In the Wild West internet of the 1990s and 2000s, only the scrappiest, most innovative companies could survive. Today, some of those that survived and thrived have grown into platforms used by billions, difficult to avoid and hard to leave. Regulators, policymakers, and the public at large worry that these large platforms may now have the ability to freeze out competitors and stifle innovation.

Data portability is often suggested as a tool to counteract the power of large platforms. In its simplest form, data portability is a user’s ability to download her data from a platform in a format that allows her to use it somewhere else. At least in theory, this lets users bring their data to new services outside the control of the original platform and helps competitors jump-start new products. A robust data portability system might allow regulators to contain the power of large platforms without having to take the drastic step of breaking them up.

This theory is especially attractive in the context of services that rely on network effects, such as social networks. Users have years of conversations, shared photos, and connections with others on existing platforms. Being forced to leave that information behind would create a significant disincentive to jump to a competing platform, no matter how much better it is. Data portability allows users to bring their history somewhere new, even if they leave the original platform or delete their data from it.

The Key Question

However compelling the theory, few have investigated whether competitors can actually use ported data to create or grow competing platforms. This gap is particularly troubling because we found no competitive products built on ported data, even though many large platforms have let users export their own data for years. For example, Facebook has allowed users to download their data since 2010, well before current competition concerns emerged and long enough ago for a competitor built on ported Facebook data to appear. Still, no such competitor has emerged.

If data portability can fuel competition, and data portability has been available for almost a decade, where are the competitors built on data portability? And what does this absence mean for regulators considering using data portability as a competitive measure?

To understand the role that data portability can play in creating new, innovative services, we put real data in the hands of real competitors to see what they could do. We are starting with Facebook because we believe that the data related to social networks present some of the biggest challenges to a portability-based approach. We also focused our investigation on one-off data exports, as opposed to continuous integration via API, because of concerns related to the sustained availability of continuous integration. We hope to be able to examine other types of platforms in the data portability context in the future.

For this project, we exported and anonymized user data from Facebook’s Download Your Information tool and brought it to individuals in the New York City tech community. We asked a range of people, from junior engineers to serial C-level executives, how they would use this data to build new products to compete with Facebook. We asked them about the data’s strengths and weaknesses, and how it might be improved to make it more useful for potential competitors. We asked them why they were not already using this data to build their services, and what kinds of changes might allow them to do so.

This exploration is important because data portability is such an attractive tool in the regulatory toolbox. If data portability really can allow new services to grow and coexist with today’s large platforms, regulators, the public, and the platforms themselves could potentially avoid the dramatic process of breaking the platforms into smaller entities. However, if data portability is not a viable path for competition and innovation, debating the details of data portability schemes could serve as a distraction from other, more effective means of addressing concerns with large platforms.

What We Learned About the Data

In our discussions, interviewees struggled to come up with new, competitive products they could build from, or meaningfully grow with, ported Facebook data. This suggests that regulators should not assume that competitors will be able to use ported data to build innovative products and services. An over-reliance on data portability may distract from more effective tools for addressing concerns with large platforms. We came to this conclusion based on some key limitations our interviewees ran into when considering how they could use ported Facebook data to create new products:

You cannot replicate Facebook with exported Facebook data. Facebook allows a user to export all of the data she explicitly shared with Facebook. That includes photos she uploaded, events she attended, and comments she made in group discussions. However, Facebook does not allow users to export the context in which that data was shared. For example, while a user can download the posts she made in a group discussion, she cannot access the data required to reconstruct the full conversation or even the identity of other participants. She also cannot access the inferences Facebook has drawn from her data to build and improve its own service. As a result, trying to use exported user data to reproduce Facebook would be like trying to use furniture to reproduce the office building it came from.

Facebook data is best suited for building Facebook. The data Facebook collects is useful to a service like Facebook. That means it is best suited to build another social network that monetizes insights from user data, and ill suited to building a radically different service. The products that could come out of ported Facebook data will probably bear a striking resemblance to Facebook or one of its features, and are less likely to address a new need or be truly innovative.

Even if it were possible to build a competitor similar to Facebook, it may not be desirable. Ported data is simultaneously insufficient to replicate Facebook and too tailored to Facebook to be useful for much else. Even if neither of these observations were true, there may be reason for concern about the kind of innovation Facebook data might encourage. To the extent that exported data might be useful for building a new platform, that platform is most likely to be based on invasive, highly targeted advertising. Regulators and consumers are increasingly scrutinizing this type of business model. It seems unlikely that a new surveillance-based advertising network would be welcomed by those expressing increasing concern about Facebook itself.

What This Suggests for Policymakers

None of this means that data portability should be abandoned as a regulatory tool in every case. It does, however, suggest that looking to data portability as the primary way to address competition concerns related to large social networking platforms would be a mistake, and it raises concerns about its use in the context of large platforms more generally.

When considering implementing data portability regulations, policymakers should weigh these important factors:

Privacy and competition concerns are in tension when it comes to social network data. Social networks connect large numbers of users. When one of those users decides to export her data, a platform must define the frontiers of where her data ends and another user’s begins. That decision can be heavily influenced by the priorities articulated by regulators. A data portability program designed to maximize competition would allow users to export data that includes entire comment threads (not merely the user’s contribution), the identities of their friends, and data uploaded by others that relates to the exporting user (for example, a photo of the exporting user’s face, taken by someone else). This would make it easier for the exporting user to replicate her experience and reconstruct her social network on a new platform.

Conversely, a data portability program designed to maximize user privacy would strictly limit the types of third-party data that she could export. Her friends did not necessarily consent to the data export, so she could not export their names, their photos, or even their sides of a private conversation. Such a regime would be much more respectful of the privacy of non-exporting users. It would also make the data much less useful for competitors.

While there are ways to balance these competing design priorities, in the context of social networks they appear to be fundamentally in tension. Policymakers need to understand which priority they are elevating, and the consequences of that decision.

Data portability can be useful in select contexts. There may be domains entirely disconnected from social networking, such as music streaming or fitness tracking, where a well-designed portability regime could encourage competition. Data portability can also facilitate the concept of data ownership—a value that may have importance independent of competitive concerns.

Data portability may be a distraction in the competition debate. Data portability has been the subject of intense focus by both tech companies and policymakers. However, it may be that the type of data portability that is the focus of those discussions—and of this paper—is simply a poor mechanism to increase competition online. If that is the case, time spent debating specific aspects of a given data portability regime may be better spent considering different types of approaches to competition concerns.

