How do we avoid an accidental breach when personal information is ‘hidden’ in documents?
Due to the Data (Use and Access) Act coming into law on 19 June 2025, this guidance is under review and may be subject to change. The Plans for new and updated guidance page will tell you about which guidance will be updated and when this will happen.
You are responsible for complying with your obligations under the UK GDPR and Data Protection Act 2018 (DPA 2018) and, where relevant, other information rights legislation, including the Freedom of Information Act 2000 (FOIA). Whilst we make every effort to make sure this guidance is accurate at the time of publication (31 July 2025), we make no guarantees or representations that it will remain up-to-date or ensure compliance. Where appropriate, seek further guidance or advice before disclosing information in the specific circumstances. If you would like to suggest improvements to this guidance, please leave us feedback.
In detail
- What is ‘hidden’ personal information?
- How do we avoid an accidental breach when personal information is hidden in documents?
- What do we need to consider when using software tools to help us check for hidden personal information?
- How do software tools designed to check documents help us avoid accidental breaches?
- How does converting documents to simpler formats help us identify hidden personal information?
- What ineffective techniques might people use to keep personal information secure?
- What are the risks of using ineffective techniques to keep personal information secure?
- How do we reduce the risk of people using ineffective techniques to keep personal information secure?
- What is metadata?
- Why is metadata a risk?
- How do we reduce the risk of metadata?
- What are embedded files and objects?
- Why are embedded files and objects a risk?
- How do we reduce the risk of embedded files and objects?
- What is ‘markup’?
- Why is markup a risk?
- How do we reduce the risk of markup?
What is ‘hidden’ personal information?
In this guidance, when we say personal information is ‘hidden’ we mean that it is not immediately obvious from looking at the document. You might hide information deliberately (eg by changing the formatting of a document) for various legitimate business purposes, such as restricting access, controlling printed versions or simply altering how the information is presented.
Personal information can also be hidden in documents because you are simply not aware it exists (ie no one deliberately hides it from view, but it is just not obvious). For example, personal information may be embedded automatically within a document (eg metadata). You might also add personal information to a document that the person disclosing it does not notice (eg comments or tracked changes).
How do we avoid an accidental breach when personal information is hidden in documents?
You must:
- have appropriate data protection policies and procedures to help staff disclose documents securely and respond to accidental breaches effectively;
- keep personal information secure using appropriate methods (eg passwords and secure redaction techniques) (see How do we avoid an accidental breach when redacting information?); and
- comply with relevant obligations under information access and data protection legislation, if you need to remove personal information from a document or consider an appropriate format in which to disclose it.
You should:
- give staff appropriate data protection training about disclosing documents securely and how to report breaches, including induction and regular refresher training;
- check documents appropriately before disclosing them, considering the circumstances, including the risk of harm if the information was accidentally disclosed;
- know how to remove personal information that you cannot disclose and how to redact it effectively (see How do we avoid an accidental breach when redacting information?);
- avoid using ineffective techniques to keep information secure. For example, don’t:
  - change the font colour to be the same as the background;
  - cover information with objects; or
  - format text to make it invisible.
Steps you could take include the following:
- Raise awareness in your organisation about the risks of accidentally disclosing documents containing hidden personal information.
- Use software functions, where available, to help you search for text that is the same colour as the background (eg the ‘Find and Replace’ text function in Microsoft Word and Excel).
- Use software tools (eg Document Inspector in Microsoft software), where available, that are designed to help you find (and remove where possible and appropriate) various types of personal information. For example, Document Inspector can help you find:
  - information formatted as invisible;
  - metadata (eg document properties, email routing information and EXIF (Exchangeable image file format) information in image files);
  - embedded files and objects; and
  - ‘markup’ (eg comments, ink annotations and tracked changes).
- If you want to retain a picture of an embedded object (eg a chart) but remove underlying information, consider whether it is an option to copy and paste the picture of the embedded object only into a new document (eg the ‘Paste Special’ feature in Microsoft software).
- Convert complex files to simpler formats (eg txt or csv files), where appropriate, to reveal all the displayable information in the document (see How does converting documents to simpler formats help us identify hidden personal information?).
- Check the file size is not larger than you would expect for the volume of information you intend to disclose.
- Use a retention schedule to help you identify when to remove or delete personal information permanently.
If you are disclosing a pdf document, there is information to help you do this securely in the further reading box. Redacting pdfs is also covered briefly in this guidance (see How do we avoid an accidental breach when redacting information?).
Further reading
- ICO guidance
- External guidance
How do software tools designed to check documents help us avoid accidental breaches?
To help you identify hidden personal information within documents, you could use software tools (eg Document Inspector in Microsoft Excel). However, you should still check documents yourself, even if you have used a software tool to help you.
Although such tools are helpful, keep in mind that they may not be able to identify all types of personal information that might be hidden in your document. For example, Document Inspector cannot detect text that is hidden by changing its colour, or Microsoft Excel charts located in hidden columns.
If you want to remove any personal information identified, save a copy of the document first. The tool may be able to remove some personal information automatically; otherwise, you need to remove it yourself. For example, Document Inspector cannot remove information that might stop your document from working properly. You could run the tool again before disclosing documents to help you identify any remaining issues.
How does converting documents to simpler formats help us identify hidden personal information?
To help you identify hidden personal information within documents, you could convert complex files to a simpler format. This may make it easier to identify hidden personal information because you can reveal all the displayable information in a document by converting to an appropriate format (eg csv or txt). Please note that converting to a pdf does not display all the information in the document.
If you decide to convert information, keep in mind that some formatting, information and features may be lost. For example, if you convert a Microsoft Excel spreadsheet to csv format, this removes the formatting, formulas, macros and images. You can also only convert one Microsoft Excel worksheet at a time.
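As an illustration of converting to a simpler format, the following is a minimal Python sketch. It assumes the document is a Microsoft Word .docx file, which is a ZIP archive whose body text is stored in word/document.xml. The function name docx_to_text is hypothetical and used here for illustration only. The sketch is an aid to checking, not a substitute for checking the document yourself.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(path):
    """Return the body text of a .docx file, one line per paragraph.

    Formatting is ignored, so text with the 'hidden' font effect or text
    coloured to match the background is included in the output.
    Limitations of this sketch: headers, footers, comments and embedded
    files are not read, and text inside text boxes may appear twice.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    for paragraph in root.iter(W + "p"):
        # w:t holds ordinary text; w:delText holds text deleted under
        # tracked changes, which is still stored in the file.
        parts = [
            node.text or ""
            for node in paragraph.iter()
            if node.tag in (W + "t", W + "delText")
        ]
        lines.append("".join(parts))
    return "\n".join(lines)
```

You could print the result or save it as a txt file and read it through before deciding on an appropriate format for disclosure.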
Once you have checked the converted document, you should consider an appropriate format for disclosure. This may not necessarily be the same as the converted format, depending on the circumstances. Factors to consider, where relevant, include:
- Under section 11 of FOIA and regulation 6 of EIR, public authorities must provide information in the requester’s preferred format, when requested, and where it is reasonably practicable.
- If requested information is held as a dataset as defined by section 11(5) FOIA, public authorities must make sure they disclose it in a format that is capable of re-use where this is reasonably practicable. A reusable format means that the dataset is machine-readable and based on open standards (eg csv or odf (open document format)).
- If a requester makes a SAR electronically, you must provide a copy of their personal information in a commonly used electronic format. You may choose the format, unless the requester makes a reasonable request for you to provide it in another commonly used format. However, you should not expect a requester to download software to access their personal information.
- You may also have accessibility obligations under other legislation (eg the Equality Act 2010, the Disability Discrimination Act 1995 and the Public Sector Bodies (Websites and Mobile Applications) (No.2) Accessibility Regulations 2018).
What ineffective techniques might people use to keep personal information secure?
People may try to hide personal information in documents to keep it secure using ineffective techniques, such as:
- changing the colour of the text or background (eg white text on a white background);
- covering information with objects (eg a white rectangle shape);
- moving information to the fringes of a document; or
- formatting information to be invisible or using a hidden text font effect (eg Microsoft Word).
What are the risks of using ineffective techniques to keep personal information secure?
Ineffective techniques may prevent information from being immediately obvious or appearing on a printed version of a document. However, the information is still contained in the electronic document. This creates a risk that people may disclose the document without realising the information is there. Recipients of a document can reveal the information easily (eg by changing the colour of the text or background).
When text is hidden using ineffective techniques, it may also be more difficult to identify using software tools. For example, while Document Inspector can detect text formatted as invisible or hidden, it cannot detect information that is hidden by changing the colour of text or covered by objects (see How do software tools designed to check documents help us avoid accidental breaches?).
How do we reduce the risk of people using ineffective techniques to keep personal information secure?
If you need to restrict access to information within a document, you must control access using appropriate methods (eg passwords and secure redaction techniques) (see How do we avoid an accidental breach when redacting information?). Hiding personal information using ineffective techniques (eg changing the colour of text or background) is not an appropriate way to keep it secure.
You should check documents appropriately for hidden personal information before disclosing them. You should know how to check for and remove personal information hidden by ineffective techniques. There are different ways to do this. For example, you could use the ‘Find and Replace’ text function in Microsoft Word and Excel to search for text of a specific colour (eg white text on a white background).
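A check for text of a certain colour can also be scripted. The following is a minimal Python sketch, assuming a Microsoft Word .docx file in which formatting is applied directly to the text (the w:vanish and w:color run properties in word/document.xml). The function name find_suspect_runs and the default background colour are assumptions made for illustration. The sketch does not detect formatting inherited from styles, theme colours, shaded backgrounds or information covered by objects, so it may miss hidden text.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def find_suspect_runs(path, background="FFFFFF"):
    """Yield (reason, text) for runs of text that use the hidden font
    effect or are coloured the same as the assumed page background.

    'background' is a hex RGB value; white is assumed by default.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    for run in root.iter(W + "r"):
        text = "".join(node.text or "" for node in run.iter(W + "t"))
        props = run.find(W + "rPr")
        if props is None or not text.strip():
            continue  # no direct formatting, or nothing to report

        # w:vanish marks the 'hidden' font effect unless switched off
        vanish = props.find(W + "vanish")
        if vanish is not None and vanish.get(W + "val") not in ("0", "false", "off"):
            yield ("hidden text effect", text)

        # w:color holds the font colour as a hex RGB value (or 'auto')
        colour = props.find(W + "color")
        if colour is not None and (colour.get(W + "val") or "").upper() == background:
            yield ("same colour as background", text)
```

For example, `for reason, text in find_suspect_runs("draft.docx"): print(reason, repr(text))` would list each flagged piece of text for a hypothetical file named draft.docx.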
Software tools designed to help you check documents for personal information may not be able to detect personal information hidden using certain types of ineffective techniques. However, to help you check for hidden information, you could convert the file to a simpler format (see How does converting documents to simpler formats help us identify hidden personal information?).
If you only want to control printing, you could consider whether you are able to set a print area. You can do this if you are using Microsoft Excel, for example.
What is metadata?
Metadata is information embedded within a document. It is often described as ‘data about data’. Metadata may be embedded automatically into a document when you create, edit and save it. You may also be able to add custom properties. Examples of metadata include:
- document properties (eg author, subject and title);
- information about the sender and recipient of an email and message delivery; and
- information about image files called EXIF data (Exchangeable image file format) (eg Global positioning system (GPS) coordinates identifying where a photograph was taken, or specific details about the device used).
Why is metadata a risk?
Metadata is useful because it helps to manage information more efficiently. Depending on the circumstances, it may be helpful and appropriate to disclose metadata publicly. However, documents may contain metadata that is not appropriate to disclose. You may disclose metadata accidentally if you do not realise it is automatically embedded or know how to find and remove it. It is easy for recipients to see it.
How do we reduce the risk of metadata?
You should check documents appropriately for hidden personal information before disclosing them. You should make sure you know how to find metadata in a document and how to remove it, if appropriate.
To help you check for and remove metadata, you could use a software tool (see How do software tools designed to check documents help us avoid accidental breaches?) or convert the file to a simpler format (see How does converting documents to simpler formats help us identify hidden personal information?).
You could remove metadata from an email by saving it as a plain text (.txt) file. Alternatively, you could print the email and scan it to a pdf to send it electronically.
If metadata is collected in an image file, you could use software to remove it (eg Windows File Explorer, photo-editing software or specialist redaction software that extracts the image only from the file). You could also consider turning off metadata collection for image files, if this is an option on your device.
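To see what document properties a file carries before you disclose it, you could list them with a short script. The following is a minimal Python sketch, assuming a Microsoft Office file in docx, xlsx or pptx format, where core properties such as the author are stored in docProps/core.xml inside the ZIP archive. The function name read_core_properties is hypothetical. The sketch only lists properties and does not remove them, and it does not read extended or custom properties stored elsewhere in the file.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(path):
    """Return {property name: value} from docProps/core.xml.

    Returns an empty dict if the file has no core properties part.
    """
    with zipfile.ZipFile(path) as archive:
        if "docProps/core.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("docProps/core.xml"))

    properties = {}
    for child in root:
        name = child.tag.split("}")[-1]  # drop the XML namespace prefix
        if child.text and child.text.strip():
            properties[name] = child.text.strip()
    return properties
```

Names such as creator and lastModifiedBy in the result may identify people, so you could review them before deciding whether it is appropriate to disclose them.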
Further reading – external guidance
View or change the properties for an Office file - Microsoft Support
What are embedded files and objects?
You can embed files or objects (eg a chart) in documents when you want to insert content from other software. For example, you can insert a Microsoft Word document in an Excel spreadsheet. The embedded file or object becomes part of the file you insert it into (rather than being linked to another information source that is updated if the information changes). If you disclose a file containing an embedded file or object, the recipient can view it without having access to the original information source.
Why are embedded files and objects a risk?
While it is often useful to add files or objects from other software into your document, it may not be obvious that this information is part of the document, and you may disclose it accidentally. Embedded files and objects may be displayed as an icon you can open, for example. Recipients may be able to reveal the information easily by double-clicking on the file or object.
How do we reduce the risk of embedded files and objects?
You should check documents appropriately for hidden personal information before disclosing them. You should make sure you know how to find embedded files and objects in a document and how to remove them, if appropriate.
To help you check for embedded files or objects, you could use a software tool (see How do software tools designed to check documents help us avoid accidental breaches?) or convert the file to a simpler format (see How does converting documents to simpler formats help us identify hidden personal information?).
If you find an embedded object in your document that you want to keep, such as a chart, you could consider whether you are able to keep the picture of the chart only whilst removing any hidden information. For example, you could use the Paste Special option in Microsoft products.
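You could also list embedded files with a short script. The following is a minimal Python sketch, assuming a Microsoft Office file in docx, xlsx or pptx format, where embedded files and objects are typically stored in an embeddings folder inside the ZIP archive (eg word/embeddings/). The function name list_embedded_parts is hypothetical. An empty result does not guarantee that nothing is embedded, so you should still check the document yourself.

```python
import zipfile


def list_embedded_parts(path):
    """Return (name, size in bytes) for each part stored in an
    'embeddings' folder of an Office file."""
    with zipfile.ZipFile(path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if "/embeddings/" in info.filename and not info.is_dir()
        ]
```

The sizes can also help explain a file that is larger than you would expect for the volume of information you intend to disclose.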
What is ‘markup’?
If you collaborate with others on a document, it may contain information such as comments, ink annotations or tracked changes. Software tools may be able to display comments, changes made to a document (eg tracked changes in Microsoft Word) or annotations made using a touch screen. This information is sometimes known as markup.
Why is markup a risk?
You may not realise that your document contains comments, ink annotations or tracked changes and disclose this information accidentally. For example, in Microsoft Excel, comments are indicated by a small icon in the cell and only appear if you hover your mouse over the icon. It may also be possible to change the way markup is displayed, so that it is hidden from view. Markup may reveal personal information that it is not appropriate to disclose.
How do we reduce the risk of markup?
You should check documents appropriately for hidden information before disclosing them. You should know how to view comments and remove them, where appropriate.
To help you check for hidden information, you could use a software tool (see How do software tools designed to check documents help us avoid accidental breaches?) or convert the file to a simpler format (see How does converting documents to simpler formats help us identify hidden personal information?).
If you only want to print a document without comments, you may be able to specify this in printing options.
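A quick scripted check can show whether a document contains markup at all. The following is a minimal Python sketch, assuming a Microsoft Word .docx file, where comments are stored in word/comments.xml and tracked insertions and deletions appear as w:ins and w:del elements in word/document.xml. The function name summarise_markup is hypothetical. The sketch does not look for ink annotations, or for markup in headers, footers or footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used in the document and comments parts
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def summarise_markup(path):
    """Count comments and tracked changes in the body of a .docx file."""
    counts = {"comments": 0, "insertions": 0, "deletions": 0}
    with zipfile.ZipFile(path) as archive:
        body = ET.fromstring(archive.read("word/document.xml"))
        counts["insertions"] = sum(1 for _ in body.iter(W + "ins"))
        counts["deletions"] = sum(1 for _ in body.iter(W + "del"))

        # The comments part only exists if comments were ever added
        if "word/comments.xml" in archive.namelist():
            comments = ET.fromstring(archive.read("word/comments.xml"))
            counts["comments"] = sum(1 for _ in comments.iter(W + "comment"))
    return counts
```

If any count is above zero, you could open the document, view the markup and remove it where appropriate before disclosure.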