Hate is a scourge of the Internet, inflicting harm on people who are targeted and – when left unchallenged – raising fundamental questions about social justice and equality.

From the Christchurch Massacre in New Zealand to the Capitol Riots in the US, the past few years have demonstrated the need for concerted action to halt the spread of toxic online content and address its root causes. Regulation is a powerful force for galvanizing such efforts, establishing a framework and the right incentives for platforms to take appropriate action.

Last week, Ofcom published its guidance on the updated Audiovisual Media Services Directive (AVMSD), new EU legislation to tackle harmful online content, including hate speech. The AVMSD is aimed at Video-Sharing Platforms (VSPs): online services which allow users to upload and share videos publicly, such as YouTube, Snapchat and Parler, as well as Facebook and Twitter. It requires platforms to take “appropriate measures” to protect their users from “incitement to violence or hatred”, bringing them in line with existing requirements for traditional terrestrial services and video-on-demand platforms. We were commissioned by Ofcom to write a report to inform its guidance, exploring a range of issues in online hate.

Most platforms use content moderation to keep their users safe. However, the coverage and detail of moderation vary across platforms, from the limited moderation boasted by platforms such as Gab and Bitchute, through to the more comprehensive efforts of the large platforms. The AVMSD establishes minimal requirements for platforms to remove illegal online hate, and creates space for them to tackle hate that is legal but violates their Terms of Service. As such, it is likely to mostly affect the smaller and less well-moderated platforms rather than the bigger platforms which already address this content.

Despite the ubiquity of content moderation systems, numerous concerns have been raised about their performance, fairness, robustness, explainability and scalability. Addressing these concerns will require critical reflection on how content moderation is designed and implemented – including consideration of whether other approaches, such as enabling counter speech or improving media literacy, could be more effective in keeping users safe. As a starting point, given its likely continued use across the industry, here we explain four key steps for creating a moderation system to tackle online hate.

1. Characterise online hate

First, platforms need to provide a clear account of online hate, establishing where the line falls between hate and non-hate. A typology with different subtypes may need to be constructed (for instance, Facebook has three ‘tiers’ of hate). Alongside a definition, platforms should give clear examples, rationales and principles. Defining online hate is difficult work, and platforms should engage with civil society organisations, experts and victims of hate to make sure that their characterisation of online hate is fit for purpose.

2. Identify online hate

Second, platforms need to develop strategies to identify hate. Three planks form the basis of most content moderation processes for identifying online hate: (a) user reports, (b) Artificial Intelligence (AI) and (c) human review. How they are combined will vary across platforms, depending on their expertise, infrastructure and budget. There are inherent limitations in using only humans to moderate content: it is time-consuming and expensive, can be inconsistent, and has the potential to inflict harm on the moderators. At the same time, AI can struggle with subtle, context-specific and changing forms of hate, and may lack coverage of important targets, such as victims of East Asian prejudice. AI is not a silver bullet and should supplement rather than supplant the use of human moderators.
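
To make the combination of the three planks concrete, here is a minimal sketch in Python of one way user reports and an AI score could be used to prioritise content for human review, with humans keeping the final decision. Everything in it is an assumption made for illustration: the `Item` fields, the `triage` function, the queue names and the thresholds are hypothetical and do not describe any platform's actual system.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One piece of content awaiting triage (all fields are hypothetical)."""
    content_id: str
    ai_score: float    # assumed classifier estimate that the content is hateful, 0 to 1
    user_reports: int  # number of user reports received so far


def triage(item: Item, high_score: float = 0.9, low_score: float = 0.5,
           many_reports: int = 3) -> str:
    """Decide which human review queue, if any, an item should enter.

    The AI score and user reports only set the priority; a human moderator
    still makes the final call. The thresholds are illustrative, not recommended.
    """
    if item.ai_score >= high_score or item.user_reports >= many_reports:
        return "priority_review"
    if item.ai_score >= low_score or item.user_reports > 0:
        return "standard_review"
    return "no_action"


# Example: a low AI score does not stop a heavily reported video from being reviewed.
print(triage(Item("video-1", ai_score=0.2, user_reports=5)))
```

In this sketch, user reports can escalate content that the AI scores as benign, which is one way of compensating for gaps in what the AI covers.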

3. Handle online hate

Third, content identified as hateful needs to be handled with an intervention. Public discourse often focuses on the effects of bans but, in practice, platforms use many other interventions. We identified 14 moderation strategies available to VSPs, each of which imposes different levels of friction on users’ activity. These range from hosting constraints, such as banning or suspending users and their content, through to engagement constraints, such as limiting the number of times that users can like or comment on content. Imposing any sort of friction risks impinging on users’ freedom of expression and privacy, and it is important that the degree of friction is always proportionate to the harm that is likely to be inflicted.
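
The principle of proportionate friction can be sketched as a simple mapping from assessed harm to intervention. This is an illustration under stated assumptions, not a summary of the 14 strategies in our report: the `harm` score, the thresholds and the ordering of the ladder are hypothetical, and only interventions named above are used.

```python
from enum import IntEnum


class Intervention(IntEnum):
    """Illustrative ladder of interventions, ordered by assumed level of friction."""
    NO_ACTION = 0
    LIMIT_ENGAGEMENT = 1  # engagement constraint, e.g. capping likes or comments
    SUSPEND_USER = 2      # temporary hosting constraint
    BAN_USER = 3          # permanent hosting constraint


def choose_intervention(harm: float) -> Intervention:
    """Pick the least severe intervention that matches the assessed harm.

    `harm` is a hypothetical score between 0 and 1; the cut-offs are arbitrary
    and would need to be set and justified by each platform.
    """
    if not 0.0 <= harm <= 1.0:
        raise ValueError("harm must be between 0 and 1")
    if harm < 0.25:
        return Intervention.NO_ACTION
    if harm < 0.5:
        return Intervention.LIMIT_ENGAGEMENT
    if harm < 0.8:
        return Intervention.SUSPEND_USER
    return Intervention.BAN_USER


# Example: moderately harmful content attracts friction short of a ban.
print(choose_intervention(0.4).name)
```

Keeping the ladder explicit makes it easier to check that heavier interventions are reserved for more serious harm.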

4. Enable users to appeal decisions

Fourth, online platforms should create a robust and accessible review procedure so that users are able to challenge moderation decisions. This is vital given that all content moderation systems will inevitably make some mistakes, applying unfair and unwarranted levels of friction to content (or, conversely, not doing anything to stop genuinely harmful content). Transparency in moderation is key for building trust with users and ensuring that outcomes are fair and proportionate.

Regulation such as the AVMSD (as well as the forthcoming Online Harms Bill in the UK) marks an important step towards fostering safer, more accessible and more inclusive online spaces. However, we must recognise that, whilst necessary, improving content moderation is far from sufficient. Online hate will not be ‘solved’ by any one initiative alone but, instead, requires sustained and deep engagement from a range of stakeholders, particularly those who have been targeted by online hate.

Bertie Vidgen, Emily Burden and Professor Helen Margetts of The Alan Turing Institute recently published a report on online hate in the context of the requirements of the revised Audiovisual Media Services Directive for video-sharing platforms.

This post originally appeared on the LSE Media Policy Project blog and is reproduced with permission and thanks.