Defining human disease at scale using health data
Defining human diseases using rich phenotyping information from multiple sources (e.g., genomic, clinical and imaging data) can improve our understanding of disease mechanisms and support the development of novel, more effective therapeutic agents.
The UCL GSK Phenomics Hub is a partnership between the UCL Institute of Health Informatics (PIs Denaxas, Torralbo and Fitzpatrick) and GSK Genomic Sciences to deliver state-of-the-art, electronic health record (EHR)-derived phenotyping infrastructure across three major contemporary biobanks: the , , and (upon release).
Aims of the Hub
The partnership builds on the Group’s previous research on computational approaches for defining and evaluating phenotyping algorithms using electronic health records (EHR).
Working closely with the team at GSK, the Phenomics Hub is now focussing on ‘deeper’ phenotype development and validation. This work will add novel data elements, such as biomarker and prescription data, to define disease progression and severity phenotypes across multiple EHR-linked biobanks.
Our collaboration also aims to create opportunities for bidirectional knowledge exchange to drive disease severity and progression phenotyping projects.
Impact
Our novel phenotyping approach and published set of resources can be used for research to benefit areas including:
- Drug development: accurately define relationships between therapeutic targets and disease endpoints; identify different underlying mechanisms of disease to enable the development of new molecules that target specific subpopulations sharing common aetiological mechanisms.
- Personalized medicine: avoid prescribing medications that are not beneficial or that may cause adverse outcomes; target patients with a therapy that will benefit them.
- Randomized trials: inform inclusion/exclusion criteria; make trials safer and more effective; better ascertain outcomes in clinical trials of new medicines to inform sample size calculation and patient recruitment strategies.
For more information about the Hub, contact Natalie Fitzpatrick (n.fitzpatrick@ucl.ac.uk).
Phenomics Hub Team at Institute of Health Informatics
Our team includes experienced health data scientists, software engineers and project management staff:
Principal Investigator. Professor of Biomedical Informatics
Co-Principal Investigator. Senior Research Fellow
Co-Principal Investigator. Research Programme Manager
Health Data Scientist
Health Data Scientist
Senior Research Fellow
Project Coordinator
Outputs related to this collaboration
Publications
Carrasco-Zanini J, Pietzner M, Davitte J, Surendran P, Croteau-Chonka DC, Robins C, Torralbo A, Tomlinson C, Fitzpatrick NK, Ytsma C, Kanno T, Gade S, Freitag D, Ziebell F, Denaxas S, Betts JC, Wareham NJ, Hemingway H, Scott RA, Langenberg C. Proteomic prediction of common and rare diseases. Under review: New England Journal of Medicine; published as a preprint at medRxiv:
Chung SC, Providencia R, Sofat R, Pujades-Rodriguez M, Torralbo A, Fatemifar G, Fitzpatrick NK, Taylor J, Li K, Dale C, Rossor M, Acosta-Mena D, Whittaker J, Denaxas S. Incidence, morbidity, mortality and disparities in dementia: A population linked electronic health records study of 4.3 million individuals. Alzheimers Dement. 2023 Jan;19(1):123-135. doi: 10.1002/alz.12635. Epub 2022 Mar 15. PMID: 35290719; PMCID: PMC10078672.
Chung SC, Sofat R, Acosta-Mena D, Taylor JA, Lambiase PD, Casas JP, Providencia R. Atrial fibrillation epidemiology, disparity and healthcare contacts: a population-wide study of 5.6 million individuals. Lancet Reg Health Eur. 2021 Aug;7:100157. doi: 10.1016/j.lanepe.2021.100157. PMID: 34405204; PMCID: PMC8351189.
Denaxas S, Shah AD, Mateen BA, Kuan V, Quint JK, Fitzpatrick N, Torralbo A, Fatemifar G, Hemingway H. A semi-supervised approach for rapidly creating clinical biomarker phenotypes in the UK Biobank using different primary care EHR and clinical terminology systems. JAMIA Open. 2020 Dec 5;3(4):545-556. doi: 10.1093/jamiaopen/ooaa047. PMID: 33619467; PMCID: PMC7717266.
Denaxas S, Gonzalez-Izquierdo A, Direk K, Fitzpatrick NK, Fatemifar G, Banerjee A, Dobson RJB, Howe LJ, Kuan V, Lumbers RT, Pasea L, Patel RS, Shah AD, Hingorani AD, Sudlow C, Hemingway H. UK phenomics platform for developing and validating electronic health record phenotypes: CALIBER. J Am Med Inform Assoc. 2019 Dec 1;26(12):1545-1559. doi: 10.1093/jamia/ocz105. PMID: 31329239; PMCID: PMC6857510.
Kuan V, Denaxas S, Gonzalez-Izquierdo A, Direk K, Bhatti O, Husain S, Sutaria S, Hingorani M, Nitsch D, Parisinos CA, Lumbers RT, Mathur R, Sofat R, Casas JP, Wong ICK, Hemingway H, Hingorani AD. A chronological map of 308 physical and mental health conditions from 4 million individuals in the English National Health Service. Lancet Digit Health. 2019 May 20;1(2):e63-e77. doi: 10.1016/S2589-7500(19)30012-3. PMID: 31650125; PMCID: PMC6798263.
Conference proceedings
Davitte J, Croteau-Chonka DC, Gade S, Ziebell F, Surendran P, Wang Q, Bowker N, Ehm M, Torralbo A, Denaxas S, Fitzpatrick N, Ytsma C, Betts J, Scott R, Robins C. Integration of Phenome-Wide Time-To-Event Modeling with Genetic Colocalization Results for 2,941 Plasma Proteins and 310 Diseases in 44,896 UK Biobank Participants. American Society of Human Genetics 2023; Washington (accepted).
Torralbo A, Ytsma C, Fitzpatrick NK, Tomlinson C, Denaxas S. Defining and redefining human disease at scale in the UK Biobank: a framework for disease phenotyping algorithm development and evaluation. American Medical Informatics Association 2023; New Orleans (accepted).