- What is secure-multiparty computation (SMPC) and what does it do?
- How does SMPC assist with data protection compliance?
- What do we need to know about implementing SMPC?
- What are the risks associated with using SMPC?
What is secure-multiparty computation (SMPC) and what does it do?
SMPC is a protocol (a set of rules for transmitting information between computers) that allows at least two different parties to jointly process their combined information, without any party needing to share all of its information with each of the other parties. All parties (or a subset of the parties) may learn the result, depending on the nature of the processing and how the protocol is configured.
SMPC uses a cryptographic technique called “secret sharing”. This refers to the division of a secret and its distribution among each of the parties. This means that each participating party’s information is split into fragments to be shared with other parties. Secret sharing is not the only way to perform SMPC, but it is the most common approach used in practice.
Each party’s information cannot be revealed to the others unless a sufficient proportion of its fragments, held by the different parties, are combined. As this would involve compromising the information security of a number of different parties, in practice it is unlikely to occur. This limits the risks of exposure through accidental error or malicious compromise and helps to mitigate the risk of insider attacks.
Example
Three organisations (Party A, Party B and Party C) want to use SMPC to calculate their average expenditure. Each party provides information about their own expenditure – this is the “input” that will be used for the calculation.
SMPC splits each party's information into three randomly generated "secret shares". For example, Party A's input – its own total expenditure – is £10,000. This is split into secret shares of £5,000, £2,000 and £3,000. Party A keeps one of these shares, distributes the second to Party B and the third to Party C. Parties B and C do the same with their input data.
Party | Input data | Secret share 1 (to be kept) | Secret share 2 (to be distributed) | Secret share 3 (to be distributed) |
A | £10,000 | £5,000 | £2,000 | £3,000 |
B | £15,000 | £2,000 | £8,000 | £5,000 |
C | £20,000 | £7,000 | £4,000 | £9,000 |
When this process is complete, each party has three secret shares. For example, Party A has the secret share it retained from its own input, along with a secret share from Party B and another from Party C. On their own, the secret shares a party holds do not reveal any other party's input (eg Party A does not learn the total expenditure of Party B or Party C).
Each party then adds together the secret shares it holds to produce a partial sum. Adding the partial sums from all three parties gives their total expenditure. The SMPC protocol then divides the total by the number of parties (three, in this case), giving the average expenditure: £15,000.
Party | Input data | Secret share kept | Secret share received | Secret share received | Partial sum |
A | £10,000 | £5,000 | £4,000 | £5,000 | £14,000 |
B | £15,000 | £2,000 | £2,000 | £9,000 | £13,000 |
C | £20,000 | £7,000 | £8,000 | £3,000 | £18,000 |
Total expenditure (sum of partials) | £45,000 |
Average expenditure (total divided by number of parties) | £15,000 |
No single party is able to learn what the others' actual expenditure is.
You should note that this is a simplified example. In reality, additional calculations on the secret shares are required to ensure the value of the shares cannot be leaked.
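The example above can be sketched in code. This is a minimal illustration under stated assumptions, not a production protocol. It simulates all three parties in a single process, and it draws shares uniformly at random modulo a large number (an assumption that goes beyond the worked example) so that an individual share carries no information about the input. The names `split_into_shares`, `average_via_shares` and `MODULUS` are hypothetical.

```python
import secrets

# Assumption: amounts are whole pounds and the true total is below MODULUS.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties, modulus=MODULUS):
    """Split value into n_parties random shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the input.
    shares.append((value - sum(shares)) % modulus)
    return shares


def average_via_shares(inputs, modulus=MODULUS):
    """Simulate every party locally; return (partial_sums, total, average)."""
    n = len(inputs)
    # all_shares[i][j] is the share party i gives to party j (i == j is kept).
    all_shares = [split_into_shares(v, n, modulus) for v in inputs]
    # Each party j adds the one share it holds from every party.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # Only the partial sums are combined, never the raw inputs.
    total = sum(partial_sums) % modulus
    return partial_sums, total, total / n


partial_sums, total, average = average_via_shares([10_000, 15_000, 20_000])
print(total, average)  # prints 45000 15000.0
```

The partial sums differ on every run because the shares are random, but the total and the average are always the same.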
How does SMPC assist with data protection compliance?
SMPC is a way to ensure that the amount of information you share is limited to what is necessary for your purposes, without affecting the utility or accuracy of the data. It can help you to demonstrate:
- the security principle, as the inputs of other parties are not revealed, and internal or external attackers cannot easily change the protocol output; and
- the data minimisation principle, as no one should learn anything beyond what is absolutely necessary. Parties should learn their output and nothing else.
SMPC can also help to minimise the risk from personal data breaches when performing processing with other parties. This is because the shared information is not stored together. The same benefit applies when the information is processed by separate parts of the same organisation.
If your purposes require you to provide personal information to the SMPC computation, you must assess whether the information you receive from the output is personal information. You should consider applying differential privacy to the output to further reduce risks of identifiability.
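One common way to apply differential privacy to a numeric output is to add Laplace noise before releasing it. The sketch below is illustrative only. It assumes every input is bounded between 0 and a known `max_value`, and the function names and the `epsilon` parameter are hypothetical. It also uses Python's general-purpose `random` module, which is not suitable for real deployments where a vetted differential privacy library should be used.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_mean(true_mean, n_inputs, max_value, epsilon, rng=random):
    """Release a mean with Laplace noise scaled to sensitivity / epsilon."""
    # Assumption: each input lies in [0, max_value], so changing one input
    # moves the mean by at most max_value / n_inputs.
    sensitivity = max_value / n_inputs
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical parameters: 1,000 inputs, each at most 50,000, epsilon of 1.
released = noisy_mean(15_000, n_inputs=1_000, max_value=50_000, epsilon=1.0)
```

A smaller `epsilon` adds more noise, which lowers the risk of identifiability but also lowers the accuracy of the released figure.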
What do we need to know about implementing SMPC?
SMPC is an evolving and maturing concept. It may not be suitable for large scale processing activities in real-time, as it can be computationally expensive. There are some other SMPC operations that can be challenging in practice, including:
- using SMPC to replace missing information with substituted values;
- eliminating duplicate copies of repeating information; and
- record linkage where matches in data sets to be joined are inexact.
Currently, effective use of SMPC requires technological expertise and resources. This may mean that you cannot implement SMPC yourself. However, SMPC has different deployment models, meaning that it may be possible for you to use it. These include:
- the delegated model - this outsources the computations to a trusted provider. It can also be a good approach if you are reluctant to participate in the protocol due to security and confidentiality concerns (eg the risk of collusion between other parties or mismatched levels of security between parties); and
- the hybrid model - this involves an external provider running one of the servers, while you run the other in-house, using the same technical and organisational measures. This approach still requires a solid understanding of the technology.
Above a certain "threshold" (number of secret shares), it may be possible for the input data to be reconstructed (eg by one or more of the parties, or an attacker), if the secret shares are combined together. Therefore, you should determine what the appropriate threshold is for your use of SMPC.
The threshold for reconstruction influences the risk of collusion and reidentification. The required threshold depends on the threat model used. A threat model that requires a greater proportion of the parties to be honest poses a higher risk than one that requires a lower proportion. For example, if all but one of the parties must be honest, then compromise of two parties would undermine the security of the protocol. Furthermore, some attack models may allow more than one party to act maliciously.
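To illustrate how a reconstruction threshold works, the sketch below uses Shamir's secret sharing, a well-known threshold scheme. This guidance does not prescribe a particular scheme, so treat the choice as an assumption. In this sketch, any `threshold` shares are enough to recover the secret, while fewer shares are not. The function names and the `PRIME` constant are hypothetical, and the code requires Python 3.8 or later.

```python
import secrets

PRIME = 2**61 - 1  # A Mersenne prime; assumption: secrets are smaller than this.


def make_shares(secret, threshold, n_shares, prime=PRIME):
    """Return n_shares points on a random polynomial of degree threshold - 1."""
    # The secret is the constant term; the other coefficients are random.
    coeffs = [secret] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct(shares, prime=PRIME):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % prime
                den = (den * (xi - xj)) % prime
        # pow(den, -1, prime) is the modular inverse of den.
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


shares = make_shares(10_000, threshold=2, n_shares=3)
recovered = reconstruct(shares[:2])  # any two of the three shares suffice
```

With a threshold of two, the collusion or compromise of any two share holders is enough to reconstruct the input. Raising the threshold makes reconstruction harder, but it also means more parties must be available to complete a legitimate computation.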
There are several parameters that you should consider when you determine the appropriate number of shares. These include:
- the number of parties involved;
- the underlying infrastructure you intend to use;
- the availability of that infrastructure; and
- the calculations you intend to make and the input data required.
To avoid collusion between parties, you should ensure appropriate trust mechanisms are in place, particularly if multiple parties involved in the process use the same underlying infrastructure. These may include robust access controls, logging and auditing mechanisms and a strong contractual framework.
You may need to obtain further expertise in secret sharing when assessing the context and purpose for your use of SMPC. For most use cases, an organisation would typically not develop SMPC directly, but rather use a protocol designed by an expert cryptographer.
What are the risks associated with using SMPC?
SMPC protocols are designed for a variety of threat models that make assumptions about an attacker’s capabilities and goals. The models are based on the actions that dishonest parties can take without affecting the protocol’s privacy properties. This is an important underlying concept behind the design of SMPC.
An SMPC protocol can be compromised, resulting in reconstruction of the input data or the results of the computation being incorrect. For example, an external entity or a participating party can act in bad faith. In the SMPC context these are known as ‘corrupted parties’.
You should distinguish between the security and trust assumptions associated with secret sharing and the trust assumptions associated with the analysis. For example, a dishonest party can faithfully follow the protocol and act as an honest participant in the secret sharing protocol, but still use knowledge of its own data, or submit false data, to learn something about the other party’s (or parties’) information through techniques similar to those used in differencing attacks.
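The arithmetic behind this kind of inference is simple and does not involve breaking the secret sharing at all. The sketch below reuses the figures from the average expenditure example earlier in this section. The function name is hypothetical.

```python
def sum_of_unknown_inputs(published_average, n_parties, known_inputs):
    """Return the combined value of the inputs the observer does not know."""
    total = published_average * n_parties
    return total - sum(known_inputs)


# Party A knows only its own input, so it learns the sum of B and C.
b_plus_c = sum_of_unknown_inputs(15_000, 3, [10_000])

# If Parties A and B pool their inputs, they learn C's input exactly.
c_input = sum_of_unknown_inputs(15_000, 3, [10_000, 15_000])

print(b_plus_c, c_input)  # prints 35000 20000
```

With only two parties, each party can always work out the other's input from the output. This is a property of the analysis rather than of the SMPC protocol.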
The security model appropriate for your circumstances depends on the level of inherent risk of a malicious party learning something about a person, or corrupting their inputs in a way that may have a detrimental effect on someone.
Generally, if you are using stronger threat models you will use more computational resources as further checks are required to ensure that the parties are not acting in bad faith. You should perform intruder testing on the SMPC protocol operation using the threat model assumptions for a given adversary, as provided in the design of the protocol. For example, you should test the impact of corrupted inputs on the computation and the security of the communications channels between the parties.
By design, using SMPC means that data inputs are not visible during the computation, so you must carry out accuracy checks to ensure the inputs have not been manipulated by a corrupted party. You could do this in several ways, such as:
- ensuring the design has measures in place to protect against corruption of the input values (eg a process for checking the input values and contractual requirements on accuracy);
- ensuring that data validation and correction is part of the SMPC protocol you choose, and that both processes are executed on the inputs;
- checking the output after the computation is complete, so you can evaluate whether the result is plausible (this process is known as “sanity checking”);
- bounds checking to ensure values are not corrupted; and
- ensuring technical and organisational measures are in place (eg robust access controls, logging and auditing mechanisms) to mitigate the risk of collusion between parties.
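Two of the checks listed above, bounds checking and sanity checking the output, can be sketched as follows. This is a simplified illustration with hypothetical function names and bounds. It shows a local check that each party runs on its own input before sharing it, which catches errors but does not bind a malicious party. Enforcing bounds on hidden inputs requires support within the SMPC protocol itself.

```python
def check_input_in_bounds(value, lower, upper):
    """Reject an input outside the agreed range before it is secret-shared."""
    if not lower <= value <= upper:
        raise ValueError(f"input {value} is outside [{lower}, {upper}]")
    return value


def output_is_plausible(average, lower, upper):
    """An average of in-range inputs must itself fall within the same range."""
    return lower <= average <= upper


# Hypothetical agreed bounds for annual expenditure, in pounds.
LOWER, UPPER = 0, 1_000_000

checked = check_input_in_bounds(10_000, LOWER, UPPER)
plausible = output_is_plausible(15_000, LOWER, UPPER)
```

An output that fails the plausibility check indicates that at least one input was out of range or that the computation went wrong. An output that passes does not prove that every input was accurate.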
SMPC protects information during the computation but does not protect the output. Where the output is personal information, you should implement appropriate encryption measures for information at rest and in transit to mitigate the risk of personal information being compromised.
Further reading – ICO guidance
Read the section of this guidance on identifiability for more information on the motivated intruder test and assessing the identifiability of personal information.
Further reading
The publications below provide additional information on implementation considerations, threat models and use cases for SMPC.
For an extensive overview of SMPC, including an assessment of various methods and a summary of what problems it can solve, see “A Pragmatic Introduction to Secure Multi-Party Computation” (external link, PDF).
ENISA’s 2021 report “Data Pseudonymisation: Advanced Techniques and Use Cases” summarises SMPC.