Rapid growth in the genetic sequencing of pathogens in recent years has led to the creation of large sequence databases. This aggregated sequence data can be very useful for tracking and predicting epidemics of infectious diseases. However, the balance between the potential public health benefit and the risk to personal privacy for individuals whose genetic data (personal or pathogen) are included in such work has been difficult to delineate, because neither the true benefit nor the actual risk to participants has been adequately defined. Existing approaches to minimise the risk of privacy loss to participants are based on de-identification of data by removal of a predefined set of identifiers. These approaches neither guarantee privacy nor protect the usefulness of the data. We propose a new approach to privacy protection that will quantify the risk to participants, while still maximising the usefulness of the data to researchers. This emerging standard in privacy protection and disclosure control, which is known as differential privacy, uses a process-driven rather than data-centred approach to protecting privacy.
Copyright © 2014 Elsevier Ltd. All rights reserved.