Methods of disclosure control: Best practice within longitudinal studies

CLOSER thanks everyone who attended the Knowledge Exchange Workshop, Methods of disclosure control: Best practice within longitudinal studies on Wednesday 18th January 2017 from 10:00 to 16:00 in the Woburn House Conference Centre, London.

About the event

Longitudinal studies depend on a trust relationship with their participants. It is therefore crucial that those managing longitudinal data are aware of the risks relating to disclosing participant information. With study participants asked to provide personal and, at times, sensitive details about themselves, it is essential that data managers employ best practice in disclosure control to ensure no individual is identifiable from published data. These risks occur throughout the data collection pipeline (i.e. from data acquisition, through processing, to dissemination to analysts and finally publication into the public domain). Best practice is likely to emerge from a mix of:

traditional data management techniques;
the application of new and innovative statistical and technological mechanisms; and,
working within governance frameworks for managing personal information and information security.

At this workshop, delegates from the longitudinal community, including data managers and researchers, met to share information and insight into current methods of disclosure control and their implementation in longitudinal studies.

Summary of discussions

Throughout the morning, delegates were given in-depth overviews of current disclosure control frameworks and statistical disclosure control techniques including:

the UK Anonymisation Decision-Making Framework
the ‘synthpop’ synthetic data generating package
methods for ‘deterministic anonymization’ and displaying graphical information in a privacy-preserving manner
a theory for creating anonymised data files through adding reversible statistical noise to data
the ‘DataSHIELD’ federated software system, which allows analysts to query individual-level data without providing access to the underlying individual records.

As the session came to a close, it was noted that while each of the statistical/computational methods represents an exciting innovation within the disclosure control field, each also comes with its own limitations. Whilst it’s essential that uncertainty be added to datasets, it was agreed that different contexts would require different techniques to ensure best practice.

During the afternoon, delegates heard from data managers at CLOSER partners, including the Centre for Longitudinal Studies, the Avon Longitudinal Study of Parents and Children, the MRC National Survey of Health and Development, the MRC Lifecourse Epidemiology Unit at the University of Southampton and the UK Data Service, who detailed the current approaches to disclosure control used in their respective studies and services. There was some variation in the practices adopted by the various studies and service providers. Commonly reported practices included creating bespoke datasets for researchers (although there were differing approaches to the ‘sub-setting’ of bespoke datasets to include only relevant variables), ensuring data requests are from credible sources (variously known as ‘bona fide’ or ‘safe’ researchers), and the requirement for data use contracts. It was noted that the use of third-party routine records (e.g. health records sourced from the NHS) required additional safeguards.

The workshop’s final discussion centred predominantly on identifying where and why a potential breach of information may occur and recognising best practice going forward to reduce the risk of any such incident occurring by developing a community approach to disclosure control standards within longitudinal studies.

Speakers:

Andy Boyd, University of Bristol
Kieron O’Hara, University of Southampton
Gillian Raab, University of Edinburgh
Demetris Avraam, University of Bristol
Harvey Goldstein, UCL & University of Bristol
Paul Burton, University of Bristol
Louise Corti, UK Data Archive
Jon Johnson, University College London
Philip Curran, University College London
Vanessa Cox, University of Southampton

Slides available: Methods of disclosure control: Best practice within longitudinal studies
