Data harmonisation

Data harmonisation involves recoding or modifying variables so that they are comparable across research studies.

What does harmonising data involve and why is it important?

In order to make full use of the cohort and longitudinal studies that we have in the UK, we need to be able to make comparisons both within and across studies. Repeating the same longitudinal analysis across a number of studies allows researchers to test whether results are consistent across studies, or differ in response to changing social conditions.

Cross-cohort analysis helps us understand more about societal change and how changes in the policy environment impact on outcomes for individuals.

What are the challenges?

Different studies have used different methods to collect information on important aspects of respondents’ lives. For example, measures of household income and measures of some senses, such as vision, are collected in quite different ways both within studies over time and, crucially, across separate studies.
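To make the challenge concrete, here is a minimal sketch of recoding a self-reported vision question that two studies asked in different ways onto one common scale. The study codings, the three-level common scale and the function name are all hypothetical, invented for illustration; they are not taken from any CLOSER study or work package.

```python
# Hypothetical codings: study A uses a 1-5 numeric scale, study B uses text labels.
STUDY_A_TO_COMMON = {1: "good", 2: "good", 3: "fair", 4: "poor", 5: "poor"}
STUDY_B_TO_COMMON = {
    "no difficulty": "good",
    "some difficulty": "fair",
    "great difficulty": "poor",
}


def harmonise(values, mapping):
    """Recode raw study values onto the common scale.

    Values with no entry in the mapping become None, so they are treated
    as missing rather than silently guessed.
    """
    return [mapping.get(value) for value in values]


study_a = harmonise([1, 3, 5, 9], STUDY_A_TO_COMMON)
study_b = harmonise(["some difficulty", "no difficulty"], STUDY_B_TO_COMMON)
```

Keeping the mapping as an explicit table per study makes each recoding decision easy to document and review, which matters when the harmonised variable is shared with other researchers.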

What is CLOSER doing about it?

Under the data harmonisation work stream, CLOSER is currently working on 16 work packages:

Work Package 1: Harmonisation of measures of body size and body composition

The team is based at the MRC Unit for Lifelong Health and Ageing at UCL, and is led by Rebecca Hardy with Will Johnson. It harmonised body size and body composition variables across cohort studies: height, weight and BMI. The team published a paper in PLOS Medicine in May 2015 outlining its findings. Will Johnson also presented at CLOSER’s cross-cohort research workshop in September 2015.

Harmonised Height, Weight and BMI in five longitudinal studies is now available to download from the UK Data Service.
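As an illustration of what harmonising these variables can involve, the sketch below converts height and weight recorded in different units to metres and kilograms, then derives BMI as weight in kilograms divided by height in metres squared. The unit labels and function names are assumptions made for this example; this is not the work package's actual procedure, which is documented with the dataset.

```python
# Exact conversion factors for the imperial units used in this sketch.
LB_TO_KG = 0.45359237
IN_TO_M = 0.0254


def to_kg(value, unit):
    """Convert a weight to kilograms. Unit labels are hypothetical."""
    if unit == "kg":
        return value
    if unit == "lb":
        return value * LB_TO_KG
    raise ValueError(f"unknown weight unit: {unit}")


def to_m(value, unit):
    """Convert a height to metres. Unit labels are hypothetical."""
    if unit == "m":
        return value
    if unit == "cm":
        return value / 100
    if unit == "in":
        return value * IN_TO_M
    raise ValueError(f"unknown height unit: {unit}")


def harmonised_bmi(weight, weight_unit, height, height_unit):
    """BMI = weight (kg) / height (m) squared, after unit conversion."""
    kg = to_kg(weight, weight_unit)
    m = to_m(height, height_unit)
    return kg / m ** 2


metric = harmonised_bmi(70, "kg", 175, "cm")
imperial = harmonised_bmi(154.324, "lb", 68.8976, "in")
```

The two calls describe roughly the same person in different units, so the resulting values should agree closely once both are converted.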

Work Package 2: Harmonisation of socio-economic status and qualifications
This project is led by Claire Crawford from the Institute for Fiscal Studies with strategic oversight from Anna Vignoles, University of Cambridge. The harmonisation work is being done by a team of researchers across the studies involved. The socioeconomic measures used by the studies include family income, family socioeconomic status based on the occupation of the father or mother, the physical surroundings of the family, and the characteristics of their locality. This work package is developing harmonised socioeconomic measures that have value across different disciplines.

Claire Crawford presented some of the team’s findings at CLOSER’s cross-cohort research workshop in September 2015. A working paper, authored by Belfield, C. et al., has been made available by the Institute for Fiscal Studies.

Work Package 2(a): Harmonising earnings and income within and across studies
Chris Belfield from the Institute for Fiscal Studies, working with Alissa Goodman, Claire Crawford and Di Kuh, will extend the harmonisation of socio-economic variables done in Work Package 2. The project will harmonise further measures of earnings and income in NSHD, NCDS, BCS70 and MCS. The harmonised measures will be used to look at how returns to education have changed over the lifecycle and across cohorts.

Work Package 3: Harmonisation of strategies for analysing biological samples
Biological samples such as blood, urine and saliva are collected routinely by many cohort studies. Across the cohorts, different methodologies have been used for processing, storing and analysing these samples. This work package, which started in late 2014, seeks to develop strategies for harmonising future sample collection and the use of existing collections across CLOSER, including piloting novel laboratory methodology. The work package is led by Dr Susan Ring at the University of Bristol, with support from Dr Alix Groom and a Research Associate.

Work Package 4: Harmonising measures of senses and behaviours
This work package is led by Jugnoo Rahi at the Institute of Child Health (ICH), UCL and seeks to harmonise measures of eyesight across all the national cohorts. Bio-physical measures of vision and refraction will be harmonised together with the extensive self-reported data on visual function and ophthalmic disorders in each of these cohorts.

The research team published a paper – ‘Trends in Visual Health Inequalities in Childhood Through Associations of Visual Function With Sex and Social Position Across 3 UK Birth Cohorts’ – in JAMA Ophthalmology in August 2017.

Work Package 9: Prospective associations between childhood environment and adult mental wellbeing
Mai Stafford, at the MRC Unit for Lifelong Health and Ageing at UCL, will lead a new harmonisation project using data from NSHD, the 1958 National Child Development Study (NCDS), the 1970 British Cohort Study (BCS70), and the Hertfordshire Cohort Study (HCS) to look at associations between childhood socio-economic position and mental wellbeing in adults.

The research team published its first paper in October 2017 – ‘Childhood socioeconomic position and adult mental wellbeing: evidence from British birth cohort studies’ – on the PLOS ONE website.

Work Package 10: Review of methods for determining pubertal status
Janis Baird and Hazel Inskip, from the MRC Lifecourse Epidemiology Unit at the University of Southampton, will lead a project to harmonise measures for determining pubertal status. Stage of puberty is valuable in the assessment of various health outcomes in adolescents. The hormonal changes associated with puberty can affect both physical and mental wellbeing. This work package will identify the measures used to assess pubertal status and assess their validity. It will also identify barriers to these assessments, and the acceptability of the various approaches, through consultation with a children’s PPI (patient and public involvement) committee.

Download the ‘Review of methods for determining pubertal status and age of onset of puberty in cohort and longitudinal studies’.

Work Package 11: Exploiting the existing biomarker data available in CLOSER
This project, led by Meena Kumari at ISER, University of Essex, will produce a comparative catalogue of biomarkers available across CLOSER studies. This will be used to inform cross-cohort research questions that seek to use biomarkers, and the social biological research agenda. Specifically, it will look at which biological markers are typically included in the construction of allostatic load (the ‘wear and tear’ associated with the response to chronic or repeated stress).

Download the project’s ‘Guide to the biomarker data in the CLOSER studies’.

Work Package 12: Publication metadata augmentation
Olly Butters from the University of Bristol will lead a work package on publication metadata augmentation. This will build a generalised open source tool to create an enhanced bibliographic dataset and analytics of the academic papers published in a study, and apply this to multiple CLOSER studies. The tool, once developed, will be made publicly available under an open source licence.

Work Package 13: Overcrowding and health: methodological innovation for socioeconomic measures in longitudinal studies
Noriko Cable from the International Centre for Lifecourse Studies in Society and Health (ICLS) at UCL will lead this work package. Overcrowding in housing has long been treated as a proxy indicator of material deprivation. The project aims to offer comparable overcrowding measures across several CLOSER studies: NSHD, NCDS, BCS70, MCS, BHPS and Understanding Society.
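One common way to express overcrowding is the number of persons per room in a household. The sketch below assumes that definition and a cut-off of more than one person per room; both are assumptions chosen for illustration and are not necessarily the measure this work package adopts.

```python
def persons_per_room(n_persons, n_rooms):
    """Household members divided by number of rooms."""
    if n_rooms <= 0:
        raise ValueError("number of rooms must be positive")
    return n_persons / n_rooms


def is_overcrowded(n_persons, n_rooms, threshold=1.0):
    """Flag a household as overcrowded if persons per room exceeds the threshold.

    The default threshold of 1.0 is an illustrative assumption.
    """
    return persons_per_room(n_persons, n_rooms) > threshold


crowded = is_overcrowded(5, 3)      # 5 people in 3 rooms
not_crowded = is_overcrowded(2, 4)  # 2 people in 4 rooms
```

A derived ratio like this is only comparable across studies if each study counts rooms and household members in the same way, which is part of what harmonisation has to establish.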

Work Package 15: Socioeconomic differentials in physical activity by age and cohort: enhancing the CLOSER cohort resource to inform research, policy and practice

Rachel Cooper from the MRC Unit for Lifelong Health and Ageing at UCL will lead a project using data from six CLOSER studies to address two main objectives. The first is to identify all measures of physical activity and sedentary behaviour available within each study, document them and undertake data harmonisation where possible. The second is to test whether associations between lifetime socioeconomic position and physical activity vary by age, sex and ethnicity in Understanding Society, and later by age, sex and birth year in coordinated analyses of the British birth cohorts and ALSPAC.

Work Package 16: Maximising the take up of mental health measures from UK cohorts and longitudinal studies

Alison Park, CLOSER Director, and Louise Arseneault, ESRC Mental Health Leadership Fellow, will lead a project to compile, organise, generate and disseminate information about existing measures of mental health and wellbeing in UK cohorts and longitudinal studies, to make them more visible and accessible to a wide group of researchers across different disciplines. The project will develop a survey of available mental health and wellbeing measures in UK and international cohort studies; create a web platform of mental health measures in the cohorts; and promote the use of mental health measures in the studies.

Work Package 17: Scoping existing dietary data available in CLOSER to support cross-cohort research questions

This project, led by Jane Maddock from the MRC Unit for Lifelong Health and Ageing at UCL, aims to document, describe and make comparisons between the available dietary intake information across CLOSER cohorts. It will also examine the association between harmonised measures of diet across the life course and allostatic load, as a marker of ageing.

Work Package 18: The creation of a life course methylome through data harmonisation in CLOSER studies

Esther Walton from the MRC Integrative Epidemiology Unit at the University of Bristol is leading a project to synthesise a life course methylome by combining, harmonising and analysing DNA methylation data from CLOSER and non-CLOSER studies spanning age groups from 0 to 100. This map of variation in methylation across the life course will help identify eras that show most variability and these may highlight critical windows for healthy development or disease vulnerability.

Work Package 19: Assessment and harmonisation of cognitive measures in British birth cohorts

Vanessa Moulton, from the Centre for Longitudinal Studies at the UCL Institute of Education, will lead a project to test the reliability and validity of the cognitive measures in the cohort studies. As well as providing guidance for analysts on these measures, this project will be valuable for future cohorts, by identifying which cognitive measures are likely to add most value to future datasets.

Work Package 20: Harmonisation of mental health measures in British birth cohorts

The project, led by George Ploubidis from the Centre for Longitudinal Studies at the UCL Institute of Education, aims to document and then harmonise existing mental health measures over the life course in five British birth cohorts. The harmonised measures will allow the project to investigate and compare the development of common mental disorder over the life course in different generations, as well as test whether mental health is improving or declining in more recently born cohorts that are expected to live longer.