Interview with Johan van der Lei from Erasmus University
03 Jan 2018

Today’s post is an interview with Prof. Johan van der Lei (Head of the Department of Medical Informatics) at Erasmus University. The Department has a leading role in the Innovative Medicines Initiative project EMIF that develops a European infrastructure to facilitate the re-use of medical data. Van der Lei’s research is concentrated on the development, evaluation, use, and impact of computer-based patient records with particular focus on primary care and on the re-use of medical data originating from diverse settings and different countries.

He will introduce us to real-world evidence as a learning opportunity and tell us about the roots of his work that led to the current research Work Package 4 is conducting, as well as how the Disease Modelling and Simulation team – which is co-led by him and Billy Amzal – contributes to a potential ROADMAP phase 2.

Why are you doing research on real-world evidence (RWE) and real-world data (RWD) in this project?

I believe that one of our duties as researchers is to learn from the data generated in routine care. Routine care is not only an environment where people are treated, are prescribed medicine, and are operated on. It is also an environment where we have the opportunity to learn from experience as it develops. I often say that every time you write a prescription for a patient you are, on the one hand, treating the patient and, on the other hand, creating a learning opportunity. You can learn from this patient to improve care for that individual, but you can also learn for the benefit of all. This patient’s journey might benefit a whole group of patients who receive the same treatment or face the same disease.

We should view healthcare not only as providing services, so to say, as delivering ‘products’ in a positive sense. We should also view it as a learning environment. In the past, we have not learned enough from the data collected in the routine delivery of healthcare. Every prescription is an experiment, and every picture we take of somebody’s brain is ideally a learning opportunity. Healthcare is not just about delivering optimal care. Healthcare is also about learning from experience.

How did your work on RWD evolve over time?

The idea that we want to learn from the day-to-day delivery of care has driven my research for more than 30 years. Initially, in days long gone, I was involved in building electronic patient records. When we had medical data only on paper, it was very difficult to learn from them, because analysing data available only on paper requires an enormous amount of work. In order to learn efficiently from the delivery of healthcare, the very first thing we had to do, now over 30 years ago, was to develop electronic patient records so we could begin collecting the data in an electronic format.

The second task we had, 20 years ago if not more, was to build databases that contained healthcare data so that researchers could actually use the data for specific studies. In recent years, we started doing studies in multiple countries and replicating the studies in multiple environments.

This is exactly what we are trying to pursue now in ROADMAP as well. We know that there is a rich variety of data sources available. For me, the bottom line of all of ROADMAP is our desire to learn from the available data and to apply that for the benefit of patients with Alzheimer’s disease. In that process of learning from the data we are addressing a number of issues. One of them is building disease progression models. In this context, we try to use the available data to validate and improve existing disease progression models.

At the end of the day, ROADMAP is about trying to create an environment where we can learn from the available data, build disease models based on the data for this patient population, and validate those disease models in multiple environments.

What concrete steps have you focussed on?

We started with an inventory of the current literature and made an overview of the published disease progression models. We also included disease progression models currently being developed by our partners in ROADMAP. From this list, we have selected a few models we would like to validate.

When you build a model you typically rely on a certain population, using data captured in a certain setting. As a result, the disease progression model also reflects the environment in which the data were collected. If you look at disease progression models published in the literature, you will typically find in the introduction section of the paper phrases such as ‘we created this model using the data from a region in Sweden’, ‘patients visiting the memory clinics in Holland’, or ‘the source population consists of the patients seen by a General Practitioner (GP) in England’. Disease models typically have as a starting point data from a specific population, and those data are then used in developing the model.

We have selected a number of those disease progression models and we are now validating them in other settings. This is known in research as external validation of a model. You can develop a disease model in one environment, but it is not necessarily true that the model behaves in the same way in another setting. The purpose of external validation is to test whether that model also provides good results, or similar results, when you apply it in a different setting.

For example, one of the models we are validating was developed by Ron Handels using data from a specific population in a region in Sweden. This disease model predicts Mini-Mental State Examination (MMSE) scores after the diagnosis of dementia. Given the diagnosis of dementia and the date of that diagnosis, it aims to predict what the MMSE score is after, for example, six months, 18 months, and two years.

Interesting questions in the external validation of such a model are:

  • Does that same model also apply to a Dutch GP environment, where you have a different source population and a different mechanism for capturing the data?
  • Does that same model also work in England, in a cohort in DPUK?
  • Does that same model also work in Spain, where you again have very different data, a very different source population, and very different methods of recording data?

These questions, about how easy or difficult it is to take a model generated in one setting and test it in another setting, are key to our current activities in WP4.

External validation, however, is a non-trivial issue. First, one needs to understand in sufficient detail the model itself and how it was created. Second, one needs to identify, from the various data sources available in ROADMAP, the sources that have the data that would allow validation of the model. We are now in the process of validating the model of Ron Handels in different settings, and we are looking at how that model performs when you take it from Sweden to, for example, the Netherlands, Spain, France or the UK.
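To make the idea of external validation concrete, here is a minimal Python sketch. It is an illustration only, not the model developed by Ron Handels and not the procedure used in WP4: the linear decline of 0.25 MMSE points per month, the function names, and the four cohort records are all made-up assumptions. The sketch applies a model built elsewhere, unchanged, to records from a different setting and summarises how far its predictions are from the observed scores.

```python
from math import sqrt
from statistics import mean


def hypothetical_mmse_model(baseline_mmse, months_since_diagnosis):
    """Stand-in model (assumption): MMSE falls 0.25 points per month, floored at 0."""
    return max(0.0, baseline_mmse - 0.25 * months_since_diagnosis)


def externally_validate(model, cohort):
    """Apply an unchanged model to an external cohort and summarise its errors.

    cohort: list of (baseline_mmse, months_since_diagnosis, observed_mmse).
    """
    errors = [model(baseline, months) - observed
              for baseline, months, observed in cohort]
    return {
        "n": len(errors),
        "bias": mean(errors),                       # > 0: model predicts too high
        "rmse": sqrt(mean(e * e for e in errors)),  # typical size of the error
    }


# Made-up records standing in for a cohort from a different country or setting.
external_cohort = [
    (24, 6, 22.5),
    (22, 18, 16.5),
    (26, 24, 18.0),
    (20, 6, 19.5),
]

result = externally_validate(hypothetical_mmse_model, external_cohort)
print(result)
```

In this toy cohort the bias is positive, meaning the stand-in model predicts higher MMSE scores than were observed, so it underestimates decline in the new setting. A real validation would also need to account for differences in how and when MMSE was recorded in each data source.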

What will be the next steps of the validation of the models during ROADMAP phase 1?

The most difficult thing in science – paradoxically – is not answering a question, but asking the right question. One main challenge faced when validating models is to understand why a model does not perform well.

We will test the models in different countries and different settings. How does this work in Spain? How does this work in a disease cohort in the UK? The literature shows that when a model is externally validated the results may be disappointing. Paradoxically, our first task will probably not be to improve the model, but to understand the reasons why a model does not replicate in external validation.

How will these feed into the second phase of ROADMAP?

I think there is a wide consensus that for Alzheimer’s disease there is a need for a good disease model. I believe this drives ROADMAP and motivates the parties in ROADMAP. Furthermore, I think there is also agreement that we need to exploit the currently available data to make the model as good as we possibly can.

What remains unknown, and this is what we are trying to explore in WP4 of ROADMAP, is how well the currently available disease models perform in external validations. Dozens of models have been proposed in the literature, but remarkably few have been subjected to proper external validation. Another unknown is whether the currently available data meet the requirements of a rigorous validation of the existing disease models.

We need to understand how we use and reuse all the work people have done until now. We need to understand what is available in the different countries in Europe. We need to appreciate the different settings in Europe. However, we also need to understand what is not available. Where are the white spots? What data do we need to collect in addition to the already available data? What type of information do we simply not have at the moment?

In my view, the objective of this first phase of ROADMAP is therefore twofold. First, we need to reuse the currently available data as well as we can to validate and improve our disease models. Second, we need to identify the areas where further data collection is needed if we really want to pursue our endeavours in building and validating models. One of the possible conclusions of ROADMAP could be that, for disease progression, we simply do not have enough data available from routine care. Maybe we then need to restructure how we collect data in routine care.

In the case of ROADMAP, the starting point is to make an inventory of what is available, to use what is available, and to use that experience to identify, so to say, the white spaces in the domain. We need to identify what we do not know, and that will, I think, be important input for ROADMAP phase 2.
