At the turn of the 21st century, academics, civil society organizations, and governments hailed the promise of the internet to eliminate centralized control over speech. A few short decades later, however, this tech utopianism has disappeared. Dominant social media platforms have become de facto gatekeepers of global information and communication flows, giving private entities the ability to moderate the speech of billions of people. As social media companies gained this power, panic grew about the virality and volume of objectionable content online. The internet was painted as a sort of “Wild West” of toxic speech, resulting in monetary, reputational, and regulatory pressure on platforms to remove broad categories of content – including hate speech. Governments around the world also began exploring ways to intervene in platforms’ moderation practices, including by requiring platforms to remove specified categories of content.
Despite these developments, however, there has to date been no cross-platform, cross-temporal, systematic analysis of how platforms treat hate speech. What categories of content do platforms’ hate speech policies cover? How have these policies changed since their inception? Do they align with international human rights law, given that many of the major social media companies have publicly committed to respect human rights under the United Nations (UN) Guiding Principles on Business and Human Rights? This report seeks to answer those questions. To do so, we collect original data on the hate speech policies of eight major platforms since their founding: Facebook, Instagram, Reddit, Snapchat, TikTok, Tumblr, Twitter, and YouTube. We then analyze how the policies have changed in scope over time, both within each individual platform and across all eight, and the extent to which they accord with Articles 19 and 20(2) of the International Covenant on Civil and Political Rights (ICCPR).
The results demonstrate a substantial increase in the scope of the platforms’ hate speech policies over time, both in the content and in the protected characteristics covered. Platforms have moved from prohibiting the promotion of hatred or racist speech in the mid-aughts and early 2010s to prohibiting, over the past several years, a long list of potential forms of hate speech, including harmful stereotypes, conspiracy theories, and curses targeting protected groups. In addition, the average number of protected characteristics listed in platform policies has more than doubled since 2010, with platforms protecting identities as wide-ranging as caste, pregnancy, veteran status, and victims of a major event. These developments do not align with the international human rights standards that most of the analyzed platforms, with the exception of Tumblr and Reddit, have committed to respect under the UN Guiding Principles on Business and Human Rights. In particular, the scope creep of platforms’ hate speech policies goes far beyond the mandatory prohibition on advocacy of hatred in Article 20(2) of the ICCPR. Moreover, the often vague nature of many of the platforms’ policies runs afoul of the requirement that restrictions on freedom of expression and access to information comply with the strict standard of “legality” in ICCPR Article 19(3).
While current restrictions on researcher access to platform data make it impossible to causally identify the impact of this scope creep, we document many cases in which hate speech policies, when erroneously enforced, led to the inadvertent suppression of minority speech. Most platforms argue that hate speech can silence minority voices, but the non-exhaustive list of examples included in this report raises questions about the extent to which hate speech policies achieve their objectives. Moreover, the findings of this report challenge the prevailing narrative that platforms have been indifferent to hate speech and that social media constitutes an “unregulated Wild West” where hatred is allowed to spread freely. In fact, viewed from a human rights perspective, there are strong reasons to believe that platforms tend to err on the side of restriction, rather than expression, when formulating hate speech policies.
Addressing hate speech is not an easy task for platforms: they have diverse, global user bases with varying norms surrounding hate speech, face varied domestic laws addressing the topic, and must rely on artificial intelligence to moderate the unprecedented volume of speech they host daily. However, the status quo approach to hate speech at all eight of the analyzed platforms goes far beyond globally accepted norms surrounding legitimate restrictions on freedom of expression, despite most of the platforms publicly committing to uphold these standards. We therefore present two potential complementary alternatives to this status quo – tying hate speech policies to international human rights law (IHRL) and/or decentralizing content moderation – and discuss their benefits and drawbacks. Ultimately, however, we believe both paths forward are preferable to the status quo and recommend that platforms adopt one or both. Under the former option, platforms would prohibit hate speech consistent with Articles 19 and 20(2) of the ICCPR. Decentralizing content moderation would involve allowing third parties to develop their own filters for content, which users could choose between based on their own values and tolerance levels. A combination of the two might look like platforms allowing third parties to develop their own content moderation and curation systems while requiring that all of these still abide by IHRL standards when it comes to hate speech.