This section clarifies several data protection issues to address misconceptions highlighted in the consultation responses. Some relate specifically to generative AI, while others concern AI more broadly (including ‘narrow’ AI) or data protection in general. We have set out existing ICO positions to offer clarity.
1) The “incidental” or “agnostic” processing of personal data still constitutes processing of personal data. Many generative AI developers claimed they did not intend to process personal data and that their processing of that data was purely incidental. Our view is clear: data protection law applies to processing of personal data (which includes special category data), regardless of whether this is ‘incidental’ or unintentional.
2) Common practice does not equate to meeting people’s reasonable expectations. Organisations should not assume that a certain way of processing will be within people’s reasonable expectations, just because it is seen as “common practice”. This applies particularly when it comes to the novel use of personal data to train generative AI in an invisible way or years after someone provided it for a different purpose (when their expectations were, by default, different).
3) “Personally identifiable information” (PII) is different to the legal definition of “personal data”. Many organisations focus their generative AI compliance efforts around PII. However, to ensure compliance in the UK they should be considering processing of any “personal data” (which is a broader and legally defined concept in the UK). Organisations must not base their compliance efforts on a fundamental misunderstanding of these concepts, or miscommunicate their processing operations as a result.
4) Organisations should not assume that they can rely on the outcome of case law about search engine data protection compliance when considering generative AI compliance. A few respondents sought to rely on these case outcomes, arguing that as the initial collection of data was substantively the same (ie crawling the web), the decisions should also apply to the generative AI context. However, while we can see there are similarities in terms of data collection, there are key differences which mean that the logic of these decisions may not be applicable. For example, while a search engine intends to index, rank and prioritise information and make this available to the public, generative AI goes beyond this. It synthesises information and creates something new in its outputs. Traditional search engine operators also enable people to exercise their rights, in particular the right to erasure through ‘delisting’. The current practices of generative AI developers make it difficult for people to do the same.
5) Generative AI models can themselves have data protection implications. Some developers argued that their models do not “store” personal data. Our 2020 guidance on AI and data protection stated that AI models can contain personal data.18 In the generative AI consultation chapters, we explained that generative AI models may embed the data they have been trained on in a form that may allow its retrieval or disclosure by transmission. In particular, this may have implications for open-access models. Further research is needed to understand when and how this risk may materialise. We intend to explore this issue in more detail in future, taking account of ongoing developments in the technological and academic spaces.
6) The ICO cannot determine or provide guidance on compliance with legal requirements which are outside the scope of our remit (ie data protection and information law). Some respondents perceived that the lawfulness principle19 meant that we could provide views or guidance on lawfulness under regimes other than data protection. Some also thought that data protection could be a useful lever to address issues within other regulatory remits. To be clear, data processing that is unlawful because it breaches other legal requirements (such as intellectual property law) will also be unlawful under data protection law. However, that does not mean we will or can be the arbiter of what is lawful under legislation outside of our remit.
7) There is no ‘AI exemption’ to data protection law. Some of the respondents argued that data protection law should not complicate generative AI development. While we support responsible AI development and deployment, it is important for organisations to be aware that there are no carve-outs or sweeping exemptions for generative AI. If an organisation is processing personal data, then data protection law will be applicable. We encourage organisations that are uncertain about compliance to adopt a “data protection by design approach”, considering compliance issues before they start processing.
18 We said that model inversion and membership inference attacks show that AI models can inadvertently contain personal data: How should we assess security and data minimisation in AI?