What are profiling tools and what personal information processing do they involve?

In detail

What do you mean by profiling tools?

When we refer to profiling tools in this guidance, we mean trust and safety tools that use profiling to analyse aspects of a person’s characteristics, behaviour, interests or preferences.

Article 4(4) of the UK GDPR defines profiling as:  

“any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements”.

You are carrying out profiling if you collect and analyse personal information about users of your service to predict their characteristics or behaviour. 

How are profiling tools used in trust and safety systems?

The use of profiling tools for online safety is relatively new. However, there is growing interest in how profiling can be used to support trust and safety operations. Some applications of profiling tools in trust and safety systems are well-established and in deployment, while others are undergoing development or testing.  

Profiling tools can be used to support various trust and safety applications including detection of:

  • grooming behaviour;
  • terrorism and violent extremism;
  • bullying or harassment; 
  • fraud and scams;
  • spamming; and
  • fake or inauthentic accounts. 

Profiling tools may also be used to assign a rating or score to a user. Examples include ‘risk scores’ or ‘reputation scores’ that indicate how likely it is that an account is in violation of your terms of service or the likelihood of an account being a bot. 

You might consider using profiling tools in your trust and safety systems to:

  • comply with your duties under the OSA; 
  • enforce your terms of service or community guidelines; or
  • apply other trust and safety measures or processes. 

You might use profiling tools in different ways. For example, your tools might:

  • retain and continuously update the output of your analysis; or
  • produce a one-off ‘snapshot’ for use at a specific point in time.

The first use case involves an ongoing evaluation of someone. You may update the profiling periodically or continually in response to new information. For example, you might create a ‘risk score’ indicating the likelihood that someone is a bot, and periodically update it as new user activity information becomes available for analysis.

The second use case is an evaluation at a particular point in time, even if you delete the information afterwards. For example, making a decision about someone ‘in the moment’ (eg when a user logs in to their account or sends a message to another user).
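To illustrate the difference between the two use cases, here is a minimal Python sketch. It is purely illustrative: the names (RiskProfile, update_risk_score, snapshot_check), the blending weight and the threshold are all hypothetical, not a prescribed method.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class RiskProfile:
    """A retained profiling output (hypothetical structure)."""
    user_id: str
    bot_risk_score: float   # eg 0.0 (likely human) to 1.0 (likely bot)
    last_updated: datetime

def update_risk_score(profile: RiskProfile, new_signal: float,
                      weight: float = 0.2) -> RiskProfile:
    """First use case: blend a new activity signal into a retained,
    continuously updated score."""
    profile.bot_risk_score = (1 - weight) * profile.bot_risk_score + weight * new_signal
    profile.last_updated = datetime.now(timezone.utc)
    return profile

def snapshot_check(signal: float, threshold: float = 0.8) -> bool:
    """Second use case: a one-off evaluation 'in the moment'; only the
    decision is used and the score need not be retained."""
    return signal >= threshold
```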

Whether you retain and update the outputs of your profiling tools or generate them only for temporary use, you are still carrying out profiling if your tools analyse users’ personal information to evaluate their characteristics, attitudes or behaviours.

Profiling tools have the potential to be highly privacy-intrusive, both because of the wide range of personal information they can use and generate and because of the decisions about users that they can make. Users may not understand how profiling works and how it affects them. They may not expect the scale of the processing undertaken by your profiling tools, and may not understand how their information is used to profile them.

Therefore, under data protection law, it is particularly important that you consider the impacts of profiling on your users’ privacy and whether the processing undertaken by your profiling tools is necessary and proportionate to achieve the aim you have in mind. See the section on How do we assess and mitigate the data protection risks involved in our use of profiling tools? for more information.

Given the intrusive nature of profiling tools, it is also important that you consider the transparency you provide to users about your use of them. You should consider whether it is appropriate to provide further transparency information about the processing involved in your tools, beyond the minimum requirements set out in article 13 of the UK GDPR. This depends on your specific situation and context of deployment, and is about taking a proportionate approach to informing people about what you’re doing. See the section on How should we tell people what we’re doing? for more information.

Our research and analysis has found that services might consider profiling tools in cases where analysis of content alone is not sufficient to detect harms, or where user-generated content is not available for analysis. For example, analysis of user activity data may provide insights into users engaging in certain types of behaviour.

What personal information processing do profiling tools involve?

The workflow of a profiling tool can be broken down into four key stages, each of which involves processing personal information. These stages are:

  • input, which involves collecting or gathering information that is used as inputs into a profiling tool’s analysis;
  • analysis, which involves the tool analysing the input data, typically using AI or automated technologies;
  • output, where the tool produces an assessment of a particular user’s behaviour or characteristics; and
  • application of the output to a specific user, which can include taking moderation action on them.

The following sections discuss these stages in more detail.
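To make the workflow concrete, here is a minimal, purely illustrative Python sketch of the four stages. Every function, field and threshold is hypothetical; a real tool would be far more sophisticated.

```python
def collect_inputs(user_id: str) -> dict:
    """Input stage: gather the personal information the tool analyses."""
    # A real system would query activity data, metadata, etc.
    return {"friend_requests_last_hour": 40, "account_age_days": 1}

def analyse(inputs: dict) -> float:
    """Analysis stage: evaluate the inputs (here, one trivial rule)."""
    return 0.9 if inputs["friend_requests_last_hour"] > 30 else 0.1

def store_output(user_id: str, score: float) -> None:
    """Output stage: the score evaluates an identified user, so it is
    itself likely to be personal information."""
    print(f"{user_id}: bot_risk={score}")

def apply_output(user_id: str, score: float) -> None:
    """Application stage: act on the output, eg flag for moderation."""
    if score > 0.8:
        print(f"{user_id}: flagged for review")

user = "user-123"
score = analyse(collect_inputs(user))
store_output(user, score)
apply_output(user, score)
```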

Input stage

The action of selecting and gathering personal information for use as inputs into profiling tools is classed as processing personal information. 

The categories of input data that a profiling tool uses depend on the tool and its purpose, but are usually selected based on the extent to which the data is an indicator for certain behaviours or characteristics. For example, the number of friend requests a user sends in a certain time period can be an indicator of whether they are likely to be a bot.  

In trust and safety systems, some of the inputs that profiling tools use include:

  • user-generated content and associated metadata. For example, text, images, videos, links and audio content your users post, along with metadata, such as the time of the post and how other users engage with it; 
  • data your users provide. This includes the personal information provided directly by a user when they register with your service, such as their email address, location and gender; and
  • data you observe. For example, user activity data like how they use your service, what groups they are members of, which other users they are friends with, acceptance or rejection of requests, and how and where they access your service.

It is important to note that in some cases input data also includes the personal information of users other than those whose behaviour or characteristics are being assessed. For example, if a tool analyses information about a user’s friends or connections on a service. Your tools might also process personal information of people who are not users of your service (eg if this is contained in the content your tools analyse).

You must comply with data protection law when you process personal information about anyone. This includes where the profiling activities you carry out about a specific person involve personal information of other people. 

Analysis stage

Profiling tools typically involve technologies, such as AI and automation, that can find correlations in datasets and make deductions about users. For example:

  • Machine learning. Machine learning models work by ‘learning’ patterns in a dataset. The trained model can then be applied to new data to make predictions and classifications about people. 
  • Rules-based systems. Rules-based systems use algorithms with pre-defined rules or conditions, set by human programmers. The rules are usually based on expert knowledge of trends and patterns in certain harms and behaviours. For example, an algorithm that flags users who frequently message strangers and are then blocked or ignored by them, which could indicate spamming behaviour (see the sketch after this list). 
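As an illustration of such a rule, here is a minimal Python sketch of the spam example above. The thresholds and parameter names are hypothetical, not recommended values.

```python
def flag_possible_spammer(messages_to_strangers: int,
                          blocked_or_ignored: int) -> bool:
    """Pre-defined, human-set rule: flag a user who frequently messages
    strangers and is then blocked or ignored by most of them."""
    if messages_to_strangers < 20:
        return False  # too little activity to evaluate reliably
    rejection_rate = blocked_or_ignored / messages_to_strangers
    return rejection_rate > 0.5

print(flag_possible_spammer(messages_to_strangers=50, blocked_or_ignored=35))  # True
```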

Output stage

Profiling tools also create outputs, which are usually inferences, classifications or predictions about the likelihood of a user exhibiting certain characteristics or behaviours. 

These outputs are likely to be personal information. This is because the outputs represent an evaluation of an identified user of your service. This information is usually recorded on a service’s internal systems or the user’s account, or both.

Application of the output

The personal information produced by a profiling tool is usually applied to a user, for example:

  • assigning a score to them; or
  • making a decision about whether to take moderation action, and if so, what type of moderation action.  

Decisions about moderation action can take place alongside other trust and safety tools like content moderation. See the section on How do profiling tools inform moderation decisions on online services? for more information.

Personal information is processed at this stage. For example:

  • using the output of a profiling tool’s analysis to apply a particular measure to a user (such as enforcing a user ban); and
  • recording information about the tool’s output or any actions taken on a user. For example, recording:
    • the risk score that a tool generates;
    • information about a user receiving a ‘strike’ or a ban; or
    • a decision not to take moderation action against a user.
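As an illustration only, a record of this kind might look like the following Python sketch; all field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

@dataclass
class ModerationRecord:
    """Hypothetical record of a tool's output and any action taken.
    Each record is personal information about an identified user."""
    user_id: str
    risk_score: float        # the score the tool generated
    action: Optional[str]    # eg "strike", "ban", or None
    reason: str
    recorded_at: datetime

record = ModerationRecord(
    user_id="user-123",
    risk_score=0.92,
    action=None,             # a decision *not* to act is also recorded
    reason="score above threshold but cleared on human review",
    recorded_at=datetime.now(timezone.utc),
)
```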

You must be able to demonstrate that the personal information processing involved in the operation of your profiling tools complies with the data protection principles.

Given the nature and volume of personal information you are likely to have available as a service provider, it is particularly important for you to show how you comply with the purpose limitation and data minimisation principles. 

In addition, given the ‘behind the scenes’ nature of the processing, your users may not be aware of the extent to which you are processing their information for this tooling. This means it is also particularly important for you to show how you comply with the transparency principle. See the sections on How do we define our purposes for profiling?, How do we ensure data minimisation in our profiling tools? and How should we tell people what we're doing? for more information. 

How do profiling tools inform moderation decisions on online services? 

You might use profiling tools to decide whether to take moderation action on a user, or to inform what type of action to take.   

When we discuss ‘moderation action’ in this guidance, we mean the action you take against a user or their content as part of your trust and safety processes. 

In some cases, the output of your profiling tool may be the sole factor you use in deciding to take moderation action against users. In other cases, you may feed the outputs into, or combine them with, other online safety tools or processes before you make a decision. For example, you may combine the output of a profiling tool with the output of a content moderation tool to decide if a user has violated your terms of service. 
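For illustration, combining outputs might be as simple as the following Python sketch. The weights and threshold are hypothetical, and real systems may combine signals in much more complex ways.

```python
def combined_violation_score(profiling_score: float,
                             content_score: float) -> float:
    """Weighted combination of a profiling tool's output and a content
    moderation tool's output for the same user (illustrative weights)."""
    return 0.4 * profiling_score + 0.6 * content_score

def should_escalate(profiling_score: float, content_score: float) -> bool:
    return combined_violation_score(profiling_score, content_score) > 0.7

print(should_escalate(0.9, 0.8))  # True: escalate for a moderation decision
```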

Profiling tools can result in, or contribute to, different moderation actions, including:

  • service bans, where you ban a user from accessing your service temporarily or permanently. This can involve a ‘strike’ system that records a user’s violations of your community guidelines or content policies, and bans the user after a certain number of strikes (see the sketch after this list);
  • feature blocking, where you block a user's access to certain features temporarily or permanently. For example, preventing them from posting content or interacting with other people's content while still allowing them to access other features of the service;
  • account restrictions, where you prevent a user from viewing or interacting with certain users or accounts;
  • content removal, where you remove content from your service or prevent its publication;
  • visibility reduction, where you take actions like preventing content from being recommended or making it appear less prominently in users’ feeds; and
  • warnings or prompts, where you provide reminders about what behaviours or content are prohibited on your service or give users links to resources about certain issues or topics. 
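The ‘strike’ system mentioned in the first item can be sketched as follows; the limit of three strikes and the in-memory store are hypothetical simplifications.

```python
STRIKE_LIMIT = 3          # hypothetical threshold
strikes: dict[str, int] = {}

def record_strike(user_id: str) -> str:
    """Record a violation and ban the user once the limit is reached."""
    strikes[user_id] = strikes.get(user_id, 0) + 1
    return "service_ban" if strikes[user_id] >= STRIKE_LIMIT else "warning"

for _ in range(3):
    outcome = record_strike("user-123")
print(outcome)  # "service_ban" on the third strike
```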

Decisions made about users on the basis of profiling tools have the potential to affect them significantly. If you are making solely automated decisions about users based on profiling, and these decisions have a legal or similarly significant effect on them, you must comply with additional requirements of data protection law. See the section on Does article 22 apply to our use of profiling tools? for more information.

Do profiling tools process special category information? 

Profiling can involve processing special category information about people. This means personal information about a person’s:

  • race;
  • ethnic origin;
  • political opinions;
  • religious or philosophical beliefs;
  • trade union membership;
  • genetic data;
  • biometric data (where this is used for identification purposes);
  • health data;
  • sex life; or
  • sexual orientation.

The UK GDPR is clear that special category information includes not only personal information that specifies relevant details, but also personal information ‘revealing’ or ‘concerning’ these details.

Profiling tools may involve processing special category information both directly and by inference. This could happen if:

  • your tools use special category information as inputs to support their evaluation about a user’s characteristics or behaviour. This includes if they involve user-generated content, and that content contains special category information about users. For example, posts on an online forum where people express their sexuality or political views; or
  • your tools infer characteristics about users that fall within the special categories of information. 

You must determine whether you are likely to use any special category information (including inferences) to influence or support your activities in any way. If so, you are processing special category information. 

Special category information needs more protection because it is sensitive. If you process it, you must identify a condition for processing under article 9 in addition to a lawful basis under article 6, although these do not have to be linked. This applies whether you’ve planned to process it or because you process it incidentally (eg it’s contained in any content that your profiling tools analyse). See the section on How do we use profiling tools lawfully? for more information about conditions for processing.

In some cases, you may be able to infer details about someone that fall into the special categories of information, even though you do not intend to make those inferences. For example, a human moderator reviewing the outcome of a tool’s behaviour analysis may look at a user’s activity on the service and be able to infer that they belong to a particular religious group. This may be the case even if the tool or the moderator is not seeking to make an inference about the user’s religion. If you do not intend to make that inference or treat the user differently as a result of it, you are not likely to be processing special category information.  

You are processing special category information if:

  • your processing intends to make an inference linked to one of the special categories of information; or
  • you intend to treat users differently on the basis of an inference linked to one of the special categories. 

This is the case whether or not the inference is correct.

Is criminal offence information a relevant consideration?

The UK GDPR gives extra protection to personal information about criminal convictions and offences or related security measures.

This includes personal information ‘relating to’ criminal convictions and offences. For example, it can cover suspicion or allegations of criminal activity. In this guidance, we refer to this information collectively as ‘criminal offence information’, although this is not a term used in the UK GDPR.

If you are using profiling tools, you must assess whether you are processing criminal offence information.

If your profiling tools involve processing criminal offence information, you must identify a condition for processing under schedule 1 of the Data Protection Act 2018 (as required by article 10 of the UK GDPR), as well as a lawful basis under article 6. See the section on How do we use profiling tools lawfully? for more information about conditions for processing.