Welcome to the Health Data Science group
The Health Data Science (HDS) group aims to develop analytical methods and tools to enable data-driven healthcare. We apply advanced machine learning and statistical methods to develop clinical prediction models at scale in distributed data networks.
Clinical decision making is a complicated task in which the clinician has to infer a diagnosis or treatment pathway based on the available medical history of the patient and the current clinical guidelines. Clinical prediction models have been developed to support this decision-making process and are used in clinical practice across a wide spectrum of specialties. These models predict a diagnostic or prognostic outcome based on a combination of patient characteristics, such as demographic information, disease history, and treatment history. The number of publications describing clinical prediction models has increased strongly over the last 10 years, as shown in the figures below.
Surprisingly, most currently used models are estimated using small datasets and contain a limited set of patient characteristics. This low sample size, and thus low statistical power, forces the data analyst to make stronger modeling assumptions. The selection of the often limited set of patient characteristics is strongly guided by the expert knowledge at hand. This contrasts sharply with the reality of modern medicine wherein patients generate a rich digital trail, which is well beyond the power of any medical practitioner to fully assimilate.
Presently, healthcare is generating a huge amount of patient-specific information contained in Electronic Health Records (EHR). This includes structured data such as diagnoses, medications, and laboratory test results, as well as unstructured data contained in clinical narratives. This opens unprecedented possibilities for research and ultimately patient care. Effective exploitation of these massive datasets demands novel methodology and an interdisciplinary approach. This is where our group wants to play an important role. We aim to assess how much predictive performance can be gained by leveraging the large amount of data originating from the complete EHR of a patient.
However, actual use of these databases in a multi-center study is severely hampered by a variety of challenges. For example, each database has its own structure and uses different terminology systems. In an ideal world, a harmonized approach would be available by which data and results from different databases could be combined to answer a specific research question. Standardized data models and common analytical tools should become a de facto standard. Our group therefore collaborates closely with the Observational Health Data Sciences and Informatics (OHDSI) initiative (www.ohdsi.org), which is responsible for the development of the Observational Medical Outcomes Partnership Common Data Model (OMOP-CDM). Our group also leads the OHDSI European Chapter (www.ohdsi-europe.org) to support adoption of the OMOP-CDM in Europe.