More than 80 years passed between the initial discovery of DNA and the identification by Franklin, Crick and Watson of the helical structure of DNA in 1953. It then took a further 50 years of research before the entire human genome was sequenced in 2003. In the subsequent 20 years, genomic technologies have developed rapidly. Sequencing a genome, which once took 13 years, can now be completed in a matter of days. 51

Yet it is not only the speed of analysis that has changed; the scale of data collected is also likely to expand. 52 A short-read sequence of a genome might create up to 160 gigabytes (GB) of data, with a long-read sequence creating up to 500GB. 53 As these numbers are scaled up by hundreds of thousands and potentially into the millions, stakeholders have noted challenges around storage, its energy use and environmental impact, and security, for both organisations and people who might wish to hold their genomic information.
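To give a sense of that scale, the short sketch below multiplies the per-genome figures quoted above by cohort sizes of 100,000 and 1,000,000. It is a rough illustration only: the per-genome sizes are the "up to" figures from the text, so the totals are upper bounds. The function name `total_storage_pb` and the use of decimal units (1 petabyte = 1,000,000 GB) are assumptions made for this example.

```python
# Rough upper-bound storage estimate for a sequencing cohort.
# Assumption: decimal units, so 1 petabyte (PB) = 1,000,000 gigabytes (GB).
GB_PER_PB = 1_000_000

# "Up to" per-genome sizes quoted in the text above.
SHORT_READ_GB = 160
LONG_READ_GB = 500


def total_storage_pb(gb_per_genome: float, n_genomes: int) -> float:
    """Return total storage in PB for n_genomes of the given size in GB."""
    return gb_per_genome * n_genomes / GB_PER_PB


if __name__ == "__main__":
    for n in (100_000, 1_000_000):
        short_pb = total_storage_pb(SHORT_READ_GB, n)
        long_pb = total_storage_pb(LONG_READ_GB, n)
        print(f"{n:>9,} genomes: short-read {short_pb:g} PB, long-read {long_pb:g} PB")
```

On these assumptions, 100,000 short-read genomes come to at most 16 PB, while a million long-read genomes could reach 500 PB, before any backups or derived files are counted.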

The range of information and inferences that can be drawn from genomic data has also increased significantly. Much of this concerns genetic variation, which is the differences in DNA sequence between people. These differences can be common (variations which arose a long time ago and spread through the population) or rare (variations which arose more recently and have not spread, perhaps because they are damaging to health). Genomics seeks to identify common and rare variants that correlate with disease and to understand how they work. Sometimes the variants are not in genes but in non-coding regions of the genome, and have subtle effects that are hard to map to genes or to biological processes and pathways. Yet such discoveries will continue to be key to healthcare and to therapies for illnesses and conditions such as macular degeneration, type 2 diabetes, prostate cancer and COVID-19. 54

The UK’s own recent pursuit of genomic research began with the 2011 UK Life Sciences Strategy, followed by the launch of the 100,000 Genomes Project. The project delivered the DNA sequences of 100,000 NHS patients with either cancer or rare conditions as part of the development of emerging treatments. 55 In 2013, Genomics England was created to oversee the project and in 2016, the NHS Genomic Medicine Service (GMS) was created to build upon this work and to integrate genomic medicine into UK healthcare. 56 The project was completed in 2018. Alongside these bodies, research organisations such as UK Biobank, Our Future Health 57 and the National Institute for Health Research (NIHR) are also supporting further work on the use of genomics in the UK for health and non-health related purposes. 58

In terms of analysing genomic information, there is an increasing move to use genome-wide association studies (GWAS) as a means of addressing the ‘missing heritability’ problem for common traits. 59 However, while GWAS is the paradigm of choice, it typically only assesses common DNA variants using SNP arrays. This focuses analyses and inferences upon commonly recognised traits and characteristics. Whole genome sequencing (WGS) is needed to assess all types of variants, which currently poses barriers in terms of cost and time, as well as requiring the analysis of larger quantities of personal data.
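The ‘missing heritability’ gap described in note 59 is a simple subtraction: the twin-based heritability estimate minus the share captured by common variants in a GWAS (the SNP heritability). The sketch below shows that calculation. The function name `missing_heritability` is hypothetical, and the input values are arbitrary placeholders chosen for illustration, not estimates for any real trait.

```python
def missing_heritability(h2_twin: float, h2_snp: float) -> float:
    """Return the gap between twin-based and SNP heritability.

    Both inputs are proportions of trait variance, so each must lie in [0, 1].
    """
    for name, value in (("h2_twin", h2_twin), ("h2_snp", h2_snp)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return h2_twin - h2_snp


if __name__ == "__main__":
    # Placeholder values for illustration only, not real estimates.
    gap = missing_heritability(h2_twin=0.5, h2_snp=0.25)
    print(f"Heritability not captured by common variants: {gap:.2f}")
```

A large gap suggests that variants not assessed by SNP arrays, such as structural or rarer variants, may matter for the trait, which is part of the case for WGS.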


51 Super-speedy sequencing puts genomic diagnosis in the fast lane

52 Big Data: Astronomical or Genomical?

53 Storage and Computation Requirements

54 Genomics Beyond Health - full report

55 100,000 Genomes Project

56 NHS Genomic Medicine Service

57 NHS Genomic Medicine Service

58 Genomics Beyond Health - full report

59 ‘Missing’ refers to the gap between what all common variants in a GWAS capture (‘SNP heritability’) and twin-based heritability (h2) estimates. This gap indicates that other types of variants not assessed in the GWAS, such as structural variants and rare(r) variants, might be important.