A Burgun - High Throughput Phenotyping

PheWAS studies and high throughput phenotyping
Anita Burgun
Paris Descartes School of Medicine, USPC
Hôpital Européen Georges Pompidou (HEGP), APHP
INSERM UMR 1138 eq22
Nov 4th, 2015
Personalized and Precision Medicine
« We are not average »
[Diagram: P (Phenome), G, E]
Personalized and Precision Medicine
Omics sciences
[Diagram: P (Phenome), G, E]
Use genome sequencing to find the genetic basis of a rare undiagnosed disease
Choose cancer treatments that target a genetic defect in the tumor
Personalized and Precision Medicine
Public health and environmental sciences
[Diagram: P (Phenome), G, E]
Quantify your risk for specific diseases
Establish goals and strategies for better health
Personalized and Precision Medicine
Medicine and biomedical informatics
[Diagram: P (Phenome), G, E]
Solve mysterious events or syndromes
Connect to patient’s (family) history
Adapt treatments (pharmacogenetics)
Prevent adverse effects from drugs
Monitor diseases
Coverage of EHR at HEGP
–  700 beds (cardiology +++ and cancer +++, pharmacogenetics dept)
–  Paris Descartes University
–  EHR since 2000 : DXC@re (Medasys)
–  DXC@re coverage :
   –  Clinical data
   –  Drug prescription
   –  Lab test results, imaging, etc.
OPT-OUT
i2b2 does not require any specific ontology
Queries can be created using any system of local codes
Clinical Data Warehouse (HEGP)

Type                                                                  # instances    # patients
Demographics (age, sex, Hospital vital status)                                       742 487
Vital signs (temperature, blood pressure, weight, …)                  15 million     141 725
Diagnoses (ICD 10 codes)                                              6 million      338 871
Medical Procedures (French CCAM codes)                                4 million      285 097
Clinical data (EHR DxCare forms)                                      80 million     468 057
Free Text reports (Hospitalization, Surgery, Imaging, Pathology …)    4 million      359 547
Lab test results                                                      103 million    379 483
Drug prescriptions                                                    4 million      121 441
Omics data

McCarthy et al, Nature Reviews Genetics, 2008 / Denny et al, Bioinformatics 2010
TPMT activity
Quantitative trait vs genotype
ICD 10 codes vs ICD9-CM
Lab test results
Neuraz et al. PLoS Comput Biol 2013

FDA & EMA recommendations
Phenotype          Low activity    Intermediate activity    Normal activity
Thiopurine dose    10 % dose       30 – 70 % dose           100 % dose         ?
Methods : population and data
•  « TPMT cohort »
–  patients who underwent a TPMTa assay
–  with at least one ICD10 code or one biological test result
–  There were no exclusion criteria.
–  3 groups:
   •  Very high activity > 15 mmol/h/mL RBC
   •  Normal
   •  Low activity (partial and completely deficient TPMTa patients)
•  Control group : no TPMT assessment
•  Phenotypes
–  ICD 10 codes
–  11 lab tests : blood cells, glycemia, liver enzymes
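A PheWAS on a quantitative trait can be sketched as follows, assuming each patient carries one TPMT activity value and a set of ICD-10 codes. For every code, the sketch compares mean activity between patients with and without the code. The seeded permutation test, the function names, and the toy values are illustrative assumptions, not the statistical method of the cited study.

```python
import random
from statistics import mean

def permutation_p_value(values, is_case, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    cases = [v for v, c in zip(values, is_case) if c]
    controls = [v for v, c in zip(values, is_case) if not c]
    observed = abs(mean(cases) - mean(controls))
    n_cases = len(cases)
    pooled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_cases]) - mean(pooled[n_cases:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

def phewas_quantitative(patients, min_cases=2):
    """patients: list of (trait_value, set_of_codes). Returns {code: p_value}."""
    values = [value for value, _ in patients]
    all_codes = set().union(*(codes for _, codes in patients))
    results = {}
    for code in sorted(all_codes):
        is_case = [code in codes for _, codes in patients]
        n_cases = sum(is_case)
        # Skip codes that are too rare (or too common) to compare two groups.
        if n_cases < min_cases or len(patients) - n_cases < min_cases:
            continue
        results[code] = permutation_p_value(values, is_case)
    return results

# Hypothetical toy cohort: (TPMT activity, ICD-10 codes).
toy_cohort = [
    (4.0, {"D61"}), (5.0, {"D61"}), (4.5, {"D61", "I10"}),
    (12.0, {"I10"}), (13.0, set()), (12.5, {"I10"}),
    (11.8, set()), (12.2, set()),
]
results = phewas_quantitative(toy_cohort)
bonferroni_alpha = 0.05 / len(results)  # one test per phenotype code
```

In a real phenome-wide scan the number of tested codes is large, so the multiple-testing threshold (Bonferroni here, as an assumption) dominates the interpretation of each p-value.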
Methods : addressing time aspects
why and how ?
•  We had to restrict ourselves to events that occurred after the initiation of treatment
[Timeline labels: treatment; TPMT : any time]
•  2 methods
–  Extraction of the prescriptions from the CPOE system (n = 12)
–  Extraction of the prescriptions from text reports (n = 423)
   1.  Select the sentences with drug names
   2.  Extract temporal information : start/stop and dates
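The two text-extraction steps above can be sketched as follows. This is a minimal illustration, assuming English reports, dates written as dd/mm/yyyy, and a tiny hand-made lexicon. The drug list, cue words, and function names are hypothetical and are not the extraction system used in the study.

```python
import re
from datetime import date

# Illustrative lexicon and cue words (assumptions, not from the slides).
DRUG_NAMES = ("azathioprine", "mercaptopurine", "thioguanine")
START_CUES = ("started", "initiated", "introduced")
STOP_CUES = ("stopped", "discontinued", "withdrawn")
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")

def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def parse_dates(sentence):
    """Return valid dd/mm/yyyy dates found in the sentence."""
    found = []
    for day, month, year in DATE_RE.findall(sentence):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            continue  # ignore impossible dates such as 31/02/2010
    return found

def extract_drug_events(text):
    """Step 1: keep sentences with a drug name. Step 2: add start/stop and dates."""
    events = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        drugs = [name for name in DRUG_NAMES if name in lowered]
        if not drugs:
            continue
        if any(cue in lowered for cue in STOP_CUES):
            status = "stop"
        elif any(cue in lowered for cue in START_CUES):
            status = "start"
        else:
            status = "unknown"
        for drug in drugs:
            events.append({"drug": drug, "status": status, "dates": parse_dates(sentence)})
    return events

report = (
    "Azathioprine was started on 12/03/2010. The patient reported fatigue. "
    "Azathioprine was stopped on 02/05/2010 because of leukopenia."
)
events = extract_drug_events(report)
```

With start and stop dates in hand, coded diagnoses and lab results can be filtered to those recorded after treatment initiation, which is the restriction described on this slide.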
Lab test results : event-free survival curves
[Neuraz A et al. Phenome-wide association on a quantitative trait: Application to TPMT enzymatic
activity and thiopurine therapy in pharmacogenomics. PLoS Comput Biology 2013, 9(12):
e1003405]
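An event-free survival curve of this kind is commonly estimated with the Kaplan-Meier product-limit method. The sketch below is a plain-Python illustration under assumptions: the "event" is taken to be, for example, the first abnormal lab value after treatment initiation, patients without an event are censored at their last follow-up, and the numbers are invented. The slides do not specify which estimator was used.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate.

    durations: follow-up time for each patient.
    observed:  True if the event occurred at that time, False if censored.
    Returns a list of (time, event_free_probability) at each event time.
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        n_events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        n_leaving = sum(1 for d in durations if d == t)  # events + censored at t
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve

# Hypothetical follow-up in months; False marks a censored patient.
durations = [5, 8, 8, 12, 15]
observed = [True, True, False, True, False]
curve = kaplan_meier(durations, observed)
```

Computing one curve per TPMT activity group and comparing them is one way to look at lab-test phenotypes along a timeline.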
Discussion
•  PheWAS using a quantitative trait
•  PheWAS applied to pharmacogenetics
•  Phenotypes with timeline
•  1st PheWAS using ICD10
•  ICD codes + lab tests → cross-validation
•  LaboWAS : on-going work with Kyoto University (UGT1A1 variants)
•  Information extraction from text : easy for medication
•  Main issue : time (extraction from narrative reports)
•  TPMT:
Phenotype          Low activity    Intermediate activity    Normal activity    Very high activity
Thiopurine dose    10 % dose       30 – 70 % dose           100 % dose         > 100 % dose ?
Population data bases and EHRs
Wei WQ , Denny JC Extracting research-quality phenotypes from electronic health records
to support precision medicine, Genome Medicine, 2015
IT supporting large-scale studies
1. Text mining
Wei WQ, Denny JC. Extracting research-quality phenotypes from electronic health records to support precision medicine, Genome Medicine, 2015
Medical Watson
•  In the future, I expect this technology to help us discover new and better treatments for specific genetic abnormalities or associations of genetic changes. It will be able to help us evaluate rare or complex patient conditions and identify drugs that have already been approved for other uses that might also help in the situations we’re dealing with.
•  Dr DiNardo, MD Anderson
•  http://venturebeat.com/2014/10/15/ibm-watson-health/
SOCIAL NETWORKS …
55 to 105 million people in the USA
Estimated cost of 2 to 4 billion $ annually
Under-detected by standard means because there is no treatment
Tweets + text data processing + learning algorithms
Reply by tweet to identify the restaurant if necessary
Towards Using Literature-based Discovery to Explain Drug Adverse Effects
Dimitar HRISTOVSKI (a), Anita BURGUN-PARENTHOINE (b), Paul AVILLACH (b) and Thomas C RINDFLESCH (c)
(a) Institute for Biostatistics and Medical Informatics, Medical Faculty, University of Ljubljana, Slovenia
(b) University of Rennes, France
(c) National Library of Medicine, NIH, Bethesda, USA
e-mail: [email protected]
Introduction / Motivation
•  Pharmacovigilance:
–  detection,
–  understanding and
–  prevention of drug adverse effects
•  Goal of our research:
–  Find pharmacogenomic explanation of
–  Known (drug, adverse effect) pairs
Example: Providing Explanation through an Intermediate Concept
Table 1. Providing an explanation for the reported drug adverse effect (Lithium, Brugada syndrome) through the linking concept Sodium Channel.
Aligned relations for Sodium Channel:

X-Relation-Y
Lithium INHIBITS Sodium Channel
–  Lithium can unmask Brugada syndrome through its ability to block sodium channels, even at subtherapeutic concentrations. (PMID: 20016437)
–  CONCLUSIONS: The widely used drug lithium is a potent blocker of cardiac sodium channels and may unmask patients with the Brugada syndrome. (PMID: 16144991)
–  Because lithium is a potent blocker of cardiac sodium channels, and given the critical importance of sodium channels in pacemaker activity, lithium-induced sodium channel blockade is likely an important mechanism in sinus node dysfunction. (PMID: 17347696)

Y-Relation-Z
Sodium Channel ASSOCIATED_WITH Brugada syndrome
–  SCN5A, the gene encoding the alpha subunit of the sodium channel, is the only gene thus far linked to Brugada syndrome … (PMID: 16415541)
–  Mutations in SCN5A, a cardiac sodium channel gene, have been recently associated with Brugada syndrome. (PMID: 11960580)
Sodium Channel CAUSES Brugada syndrome
–  Loss of function mutations in SCN5A, encoding the cardiac sodium channel, are one cause of the Brugada syndrome ... (PMID: 16415376)
–  Changes in the sodium channel are responsible for long QT syndrome, Brugada syndrome and conduction defects. (PMID: 17497250)
Sodium Channel PREDISPOSES Brugada syndrome
–  A mutation in the human cardiac sodium channel (E161K) contributes to sick sinus syndrome, conduction disease and Brugada syndrome in two families. (PMID: 15910881)
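The linking pattern in Table 1 (X-Relation-Y aligned with Y-Relation-Z) can be sketched as a join over subject-relation-object triples. The triples and PMIDs below are taken from the table; the data structure and the function name `find_linking_concepts` are illustrative assumptions rather than the authors' implementation.

```python
from collections import defaultdict

# (subject, relation, object, PMID): a subset of the rows of Table 1.
PREDICATIONS = [
    ("Lithium", "INHIBITS", "Sodium Channel", "20016437"),
    ("Lithium", "INHIBITS", "Sodium Channel", "16144991"),
    ("Sodium Channel", "ASSOCIATED_WITH", "Brugada syndrome", "16415541"),
    ("Sodium Channel", "CAUSES", "Brugada syndrome", "16415376"),
    ("Sodium Channel", "PREDISPOSES", "Brugada syndrome", "15910881"),
]

def find_linking_concepts(predications, drug, effect):
    """Return concepts Y such that drug-Relation-Y and Y-Relation-effect both exist."""
    x_to_y = defaultdict(set)  # Y -> evidence for (drug, relation, Y)
    y_to_z = defaultdict(set)  # Y -> evidence for (Y, relation, effect)
    for subject, relation, obj, pmid in predications:
        if subject == drug:
            x_to_y[obj].add((relation, pmid))
        if obj == effect:
            y_to_z[subject].add((relation, pmid))
    linking = x_to_y.keys() & y_to_z.keys()
    return {
        y: {"x_y": sorted(x_to_y[y]), "y_z": sorted(y_to_z[y])}
        for y in sorted(linking)
    }

links = find_linking_concepts(PREDICATIONS, "Lithium", "Brugada syndrome")
```

Each linking concept keeps its supporting PMIDs, so a reviewer can go back to the sentences that justify the proposed explanation.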
IT supporting large-scale studies
2. Data Integration
8 SIRIC research programs in France : Curie, LYRIC, BRIO, OncoLille, OncoPACA, Montpellier, IGR, CARPEM
Cancer Research for Personalized Medicine (Pr Pierre Laurent-Puig)
Multidisciplinary and translational research
Oncology departments
Research groups : basic research, oncogenomics, social sciences, IT
Keyword is integration
Existing platforms??
Canuel V, Rance B, Avillach P, Degoulet P, Burgun A. Translational research platforms
integrating clinical and omics data: a review of publicly available solutions. Brief.
Bioinformatics. 2015;16(2):280‑90
[Figure: architecture diagram. Labels: OMICS data; user database; APHP intranet; CARPEM servers (HEGP); tranSMART project builder; identity federation & de-identification; quality control; i2b2; CARPEM data apps; tranSMART; conversion to standards; data extractor; REDCap eCRF DB; pii; pre-tranSMART; retrospective data; manual entry; REDCap eCohorts; clinical data files; industry clinical studies; APHP intranet; Internet; APHP headquarters]
pii: Personally identifiable information
[Figure: p = 0.04. Badoual, Cancer Res. 2013 (06/05/2013). Labels: Study; Clinical DWH]
IT supporting large-scale studies
3. Data Sharing
Share the algorithms (case identification + data analysis)
[Figure labels: national database of causes of death (CepiDC); clinical data: immune system disease; biology: blood, leukocytes; colonoscopy: polyp; pathology; medications; CANCER; aspirin; Caisse Nationale d’Assurance Maladie (national health insurance fund), SNIIR-AM: medications, visits; PMSI: hospitalizations; surgery; Adeno; chemotherapy; toxicity …; tumor; Immunoscore; gene expression; T cells; NGS]
IT supporting large-scale studies
4. Infrastructures
Anita Burgun, MD, PhD
Aviesan, ITMO Santé Publique
Hôpital Européen Georges Pompidou, AP-HP, Faculté de Médecine Paris 5, INSERM Centre de Recherche des Cordeliers eq22
Workflows
Infrastructures
ANAEE: Ecosystems
BBMRI: Biomolecular Resources
EATRIS: Translational Research
ECRIN: Clinical Research
ELIXIR: Biological Information
EMBRC: Marine Biological Resource
ERINHA: Highly Pathogenic Agents
EU-OPENSCREEN: Chemical Biology
EURO BIOIMAGING
INFRAFRONTIER: Model Mammalian Gen.
INSTRUCT: Structural Biology
ISBE: Systems Biology-Europe
MIRRI: Microbial
An opportunity
Standards
Public Health Research Infrastructure
for Sharing of health and Medical
Administrative data
Data volume : genomic data (source : E. Barillot, ITMO Cancer Working Group)
The first V is for Volume
•  1 Bit = Binary Digit
•  8 Bits = 1 Byte
•  1024 Bytes = 1 Kilobyte (10^3)
•  1024 Kilobytes = 1 Megabyte (10^6)
•  1024 Megabytes = 1 Gigabyte (10^9)
•  1024 Gigabytes = 1 Terabyte (10^12)
•  1024 Terabytes = 1 Petabyte (10^15)
•  1024 Petabytes = 1 Exabyte (10^18)
•  1024 Exabytes = 1 Zettabyte (10^21) (sextillion)
•  1024 Zettabytes = 1 Yottabyte (10^24)
•  1024 Yottabytes = 1 Brontobyte
•  1024 Brontobytes = 1 Geopbyte
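The list pairs binary multiples (factors of 1024) with decimal powers of ten, which are only approximately equal. A small sketch, with hypothetical helper names, shows how the gap grows with each step and how a raw byte count maps onto these units.

```python
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte", "Terabyte",
         "Petabyte", "Exabyte", "Zettabyte", "Yottabyte"]

def binary_vs_decimal(power):
    """Compare 1024**power with 10**(3*power); returns (binary, decimal, ratio)."""
    binary = 1024 ** power
    decimal = 10 ** (3 * power)
    return binary, decimal, binary / decimal

def human_readable(n_bytes):
    """Express a byte count in the largest unit of the list, using factors of 1024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024

# Ratio between the binary and decimal definitions at each step of the list.
ratios = {UNITS[p]: round(binary_vs_decimal(p)[2], 3) for p in range(1, len(UNITS))}
```

At the kilobyte level the two definitions differ by 2.4 %, and by the petabyte level the difference exceeds 12 %, which matters when sizing storage for genomic data.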
Big Data
•  Volume
–  « Big data isn’t much more than a sexier version of statistics »??
•  Velocity
–  Requires new computer programs, new database models, new algorithms
•  Variety
–  Integration of clinical, genomic, behavioral data, etc. on secure platforms
–  Structured « traditional » data and unstructured « big data » (text, image, etc.)
–  Processing of heterogeneous data
•  Veracity
–  Uses / re-uses existing data as they are
Purpose of DNA.Land
•  DNA.Land is a place where you can learn more about your genome while enabling scientists like us to make new discoveries for the benefit of humanity. The website is not-for-profit and run by the Erlich and Pickrell labs affiliated with Columbia University and the New York Genome Center. The purpose of DNA.Land is to enable you to learn more about your DNA and allow you the autonomy to share your data to facilitate important scientific research at the forefront of genome sciences and medicine. Our goal is to help members interpret their data and connect potential participants with research studies.
Risks (DNA.Land)
•  There are no physical risks in participating in this study but there may be information risks.
•  We are going to provide you with information about your ancestry, traits, and relatedness with other individuals. You might learn unexpected findings about you or your family. These can include finding a certain ethnic heritage, predisposition for a trait, or a non-paternity event in your family (all examples are from people that we know and have worked with). For some, such information is empowering; for others, these findings may cause anxiety and discomfort.
[Figure labels: Already exists; Project; Biobanking + robot; Robots (biology); Electronic health records; Care; Research; Biological samples (DNA, serum, urine…); New biomarkers; Data warehouse]
IT supporting large-scale studies
5. IoT
Electronic Medical Record
TEXT
Your patient, Mrs firstname/lastname ……..
Her cardiovascular risk factors include hypertension, dyslipidemia, and diabetes …………………..
She presents dyspnea and signs of right heart failure, including edema and hepatomegaly.
The transthoracic ultrasound exam performed by Dr XXX shows aortic valve calcifications……. LVEF 30%..............
STRUCTURED
I10 hypertension
E785 hyperlipidemia
E119 diabetes
….
DRUG PRESCRIPTION
Metformin 500
Rosuvastatin
Celiprolol
INR = no information (date)
Device-transmitted data
AF duration 0:00:38
AF alert
CHA2DS2-VASc (calculated score): 7
Prevention of TE (inferred): NONE
Rosier A et al Europace 2015, in press
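The calculated score and the inferred absence of thromboembolism prevention can be sketched as below. The scoring follows the standard CHA2DS2-VASc definition. The anticoagulant list, the review threshold of 2, the function names, and the example patient are illustrative assumptions, not the ontology-based reasoning of the cited work and not a reconstruction of the slide's patient.

```python
# Illustrative list of oral anticoagulants (an assumption, not exhaustive).
ANTICOAGULANTS = frozenset({
    "warfarin", "acenocoumarol", "fluindione",
    "dabigatran", "rivaroxaban", "apixaban",
})

def cha2ds2_vasc(age, female, heart_failure, hypertension,
                 diabetes, stroke_or_tia, vascular_disease):
    """Standard CHA2DS2-VASc stroke-risk score for atrial fibrillation (0 to 9)."""
    score = 0
    score += 1 if heart_failure else 0                      # C: congestive heart failure
    score += 1 if hypertension else 0                       # H: hypertension
    score += 2 if age >= 75 else (1 if age >= 65 else 0)    # A2 / A: age bands
    score += 1 if diabetes else 0                           # D: diabetes
    score += 2 if stroke_or_tia else 0                      # S2: prior stroke or TIA
    score += 1 if vascular_disease else 0                   # VA: vascular disease
    score += 1 if female else 0                             # Sc: sex category
    return score

def lacks_anticoagulation(prescriptions):
    """True if no prescription line mentions a known anticoagulant."""
    lowered = [p.lower() for p in prescriptions]
    return not any(name in line for line in lowered for name in ANTICOAGULANTS)

def flag_af_alert(af_detected, score, prescriptions, threshold=2):
    """Flag a device AF alert for review when risk is high and no prevention is found."""
    return af_detected and score >= threshold and lacks_anticoagulation(prescriptions)

# Hypothetical patient: 78-year-old woman with heart failure, hypertension, diabetes.
score = cha2ds2_vasc(age=78, female=True, heart_failure=True, hypertension=True,
                     diabetes=True, stroke_or_tia=False, vascular_disease=False)
needs_review = flag_af_alert(True, score, ["Metformin 500", "Rosuvastatin", "Celiprolol"])
```

The inputs to the score come from different parts of the record (coded diagnoses, free text, prescriptions), which is why the slide separates text, structured, and device-transmitted data.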
Phenotyping strategies
Algorithm for the identification of subjects with Diabetes from EHRs
1.  ICD code for diabetes in billing reports OR
2.  Several instances of high blood glucose OR
3.  Elevated HbA1C OR
4.  Prescription of anti diabetic drug (insulin or T2D medication) OR
5.  Diabetes and/or anti diabetic drug identified from text reports using information extraction techniques
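The five OR-ed criteria above translate directly into a rule-based classifier. The sketch below is a minimal illustration: the thresholds (fasting glucose of 7.0 mmol/L, HbA1c of 6.5 %, at least two high glucose values), the ICD-10 prefixes, the drug list, and the record layout are assumptions chosen for the example, not values given in the slides.

```python
from dataclasses import dataclass, field

# Illustrative parameters (assumptions).
GLUCOSE_THRESHOLD_MMOL_L = 7.0
HBA1C_THRESHOLD_PCT = 6.5
MIN_HIGH_GLUCOSE_RESULTS = 2
DIABETES_ICD10_PREFIXES = ("E10", "E11", "E12", "E13", "E14")
ANTIDIABETIC_DRUGS = frozenset({"insulin", "metformin", "gliclazide"})

@dataclass
class PatientRecord:
    icd10_codes: list = field(default_factory=list)
    glucose_mmol_l: list = field(default_factory=list)
    hba1c_pct: list = field(default_factory=list)
    prescriptions: list = field(default_factory=list)
    text_concepts: list = field(default_factory=list)  # concepts already extracted from reports

def diabetes_criteria(rec):
    """Evaluate each criterion separately so the matching evidence stays visible."""
    n_high_glucose = sum(g >= GLUCOSE_THRESHOLD_MMOL_L for g in rec.glucose_mmol_l)
    return {
        "icd_code": any(c.upper().startswith(DIABETES_ICD10_PREFIXES) for c in rec.icd10_codes),
        "high_glucose": n_high_glucose >= MIN_HIGH_GLUCOSE_RESULTS,
        "elevated_hba1c": any(h >= HBA1C_THRESHOLD_PCT for h in rec.hba1c_pct),
        "antidiabetic_drug": any(d.lower() in ANTIDIABETIC_DRUGS for d in rec.prescriptions),
        "text_mention": any(
            c.lower() == "diabetes" or c.lower() in ANTIDIABETIC_DRUGS
            for c in rec.text_concepts
        ),
    }

def has_diabetes(rec):
    """A subject is a case as soon as any one criterion is met (logical OR)."""
    return any(diabetes_criteria(rec).values())

coded_case = PatientRecord(icd10_codes=["E119"])
lab_case = PatientRecord(glucose_mmol_l=[7.5, 8.1])
single_high_value = PatientRecord(glucose_mmol_l=[7.5, 5.2])
```

Returning the per-criterion dictionary rather than a single boolean makes it possible to count how many cases each data source contributes, which is useful when sharing the algorithm across sites.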
FORMAL ONTOLOGY (description logics)
Integrate and unify data
Model domain knowledge
Reasoning
Rosier et al, IEEE Biomed Health Inform 2015
Rosier et al, Europace 2015
Connected devices
IT supporting large-scale studies
6. New designs for clinical trials
New designs in the era of precision medicine
Redhouanne Abdellali, Philippe Beaune, Guy Bensoussan, Gilles Chatellier, Vincent Canuel, Hector Contournis, Patrice Degoulet, Jean-Baptiste Escudié, Jean-Francois Ethier, Aurélie Névéol, Nicolas Garcelon, Anne Dominique Pham, Dimitar Hristovski, Bastien Rance, Sandrine Katsahian, Grégoire Rey, Anne-Sophie Jannot, Eric Zapletal, Marie-Anne Loriot, Sarah Zohar, Pierre Laurent-Puig, Antoine Neuraz
[email protected]
