Press Release: Veritas Press C.I.C.
Author: Kamran Faqir
Article Date Published: 18 Sept 2025 at 13:37 GMT
Category: Science & Technology | Health | AI Tool Predicts Risks
Source(s): Veritas Press C.I.C. | Multi News Agencies
A major new study, published in Nature on 17 September 2025, reports on Delphi-2M, a generative AI model able to predict a person’s risk of more than 1,000 diseases, sometimes projecting health outcomes up to 20 years into the future.
Researchers from the European Molecular Biology Laboratory (EMBL), the German Cancer Research Centre (DKFZ), and the University of Copenhagen developed the model using large-scale medical data and machine learning techniques inspired by large language models (LLMs).
How Delphi-2M Works:
- Data used: The model was trained on anonymised records from 402,799 UK Biobank participants and then validated, without changing its parameters, on 1.9 million individuals in the Danish national health registries.
- Inputs: Alongside diagnoses (coded via ICD-10 codes), the model takes into account age, sex, body-mass index (BMI), smoking, alcohol consumption, and the sequence and timing of “medical events.”
- Architecture: Delphi-2M adapts a transformer-style model (similar to GPT) to health data. The key modification is the handling of time, so that the model captures how disease risk evolves depending on when events occur.
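To make the input format described above more concrete, here is a minimal Python sketch of how a health history might be turned into a time-ordered sequence for a transformer-style model. It is an illustration only, not the authors' code: the `Event` class, the `encode_history` helper, the token names such as `sex:female`, and the tiny vocabulary are all hypothetical. Only the idea of pairing ICD-10 codes and lifestyle factors with the age at which they were recorded comes from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    age_years: float  # age at which the event was recorded
    token: str        # ICD-10 code or a lifestyle/sex marker (hypothetical naming)


def encode_history(events, vocab):
    """Sort events by age and map each token to an integer id."""
    ordered = sorted(events, key=lambda e: e.age_years)  # stable sort keeps ties in input order
    ids = [vocab[e.token] for e in ordered]
    ages = [e.age_years for e in ordered]
    return ids, ages


# Hypothetical vocabulary and patient history, for illustration only.
vocab = {"sex:female": 0, "bmi:high": 1, "smoking:current": 2, "I10": 3, "E11": 4}
history = [
    Event(52.3, "I10"),              # ICD-10 I10: essential hypertension
    Event(0.0, "sex:female"),
    Event(45.0, "smoking:current"),
    Event(58.9, "E11"),              # ICD-10 E11: type 2 diabetes
    Event(45.0, "bmi:high"),
]

ids, ages = encode_history(history, vocab)
print(ids)   # token ids in chronological order
print(ages)  # matching ages, which a model could use in place of word positions
```

The point of keeping `ages` alongside `ids` is that, unlike words in a sentence, medical events are unevenly spaced, so the gap between two diagnoses carries information of its own.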
Performance, Accuracy, And Validation:
- For many diseases, Delphi-2M matches or beats existing single-disease risk models in predicting future incidence. For example, it performs similarly to established risk scores for cardiovascular disease and dementia in many settings.
- However, for certain conditions (e.g. diabetes), using established biomarkers like HbA1c remains more accurate in some contexts, especially for short-term prediction.
- When validated on the Danish dataset, accuracy dropped somewhat but disease-risk patterns remained highly correlated, suggesting that the model generalises, albeit with reduced fidelity.
What Delphi-2M Adds:
- Scale & comprehensiveness: Whereas many prior tools focus on predicting risk for one disease (e.g. heart disease, cancer), Delphi-2M can forecast 1,258 disease states (at top ICD-10 levels) simultaneously.
- Long-term forecasting: It can sample “synthetic future health trajectories,” projecting disease burden and risks up to two decades ahead. This allows estimates not only of whether but also of when certain conditions might emerge.
- Explainability: The researchers used techniques such as SHAP (Shapley additive explanations) values and inspection of the model's embeddings to understand which past medical events most influence future risks. For instance, diagnoses of digestive-tract diseases significantly raise the predicted risk of pancreatic cancer, which in turn raises mortality risk.
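The idea of sampling a synthetic future health trajectory can be sketched in a few lines of Python. The sketch below uses a standard technique, competing exponential waiting times, in which each possible event has a rate and the event with the earliest sampled time happens next. This article does not say whether Delphi-2M samples in exactly this way, so treat the method as an assumption. The `toy_rates` function and all of its numbers are invented stand-ins for a trained model's output.

```python
import random


def sample_next_event(rates, rng):
    """Draw one waiting time per event from Exponential(rate); the earliest wins."""
    draws = {name: rng.expovariate(rate) for name, rate in rates.items()}
    name = min(draws, key=draws.get)
    return name, draws[name]


def sample_trajectory(start_age, rates_fn, rng, horizon_years=20.0):
    """Generate (age, event) pairs until the horizon or a terminal 'death' event."""
    age, history = start_age, []
    while True:
        # Simplification: rates are held constant until the next event occurs.
        name, wait = sample_next_event(rates_fn(age, history), rng)
        age += wait
        if age > start_age + horizon_years:
            break
        history.append((round(age, 2), name))
        if name == "death":
            break
    return history


def toy_rates(age, history):
    """Hypothetical per-year event rates standing in for a trained model."""
    seen = {name for _, name in history}
    rates = {"death": 0.002 * 1.09 ** (age - 50)}  # invented age-dependent mortality
    if "I10" not in seen:                            # hypertension, at most once
        rates["I10"] = 0.03
    if "E11" not in seen:                            # diabetes, likelier after hypertension
        rates["E11"] = 0.02 if "I10" in seen else 0.01
    return rates


rng = random.Random(0)  # fixed seed so the run is repeatable
trajectory = sample_trajectory(60.0, toy_rates, rng)
print(trajectory)
```

Repeating the sampling many times with different seeds and counting how often an event appears within, say, ten years gives an estimate of both whether and when a condition might emerge.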
Expert Statements And Reactions:
- Ewan Birney, Interim Executive Director at EMBL, said:
“Our AI model is a proof of concept, showing that it’s possible for AI to learn many of our long-term health patterns and use this information to generate meaningful predictions.”
He added that, in the future, clinicians might be able to tell patients, “Here are four major risks in your future and here are two things you could do to really change that.”
- Tom Fitzgerald, staff scientist at EMBL-EBI:
“Medical events often follow predictable patterns … Our AI model learns those patterns and can forecast future health outcomes.”
- Prof Moritz Gerstung, head of AI in Oncology at DKFZ:
“This is the beginning of a new way to understand human health and disease progression. Generative models such as ours could one day help personalise care and anticipate healthcare needs at scale.”
Limitations And Concerns:
While promising, the study and tool are not without limitations:
- Biases in data:
  - The UK Biobank cohort tends to be healthier, more affluent, and more predominantly white than the general UK population, which may introduce selection bias and limit generalisability.
  - Some diagnoses are missing from certain sources (hospital records, primary care records, or self-reported data). The model has picked up on “data missingness” as a signal, which could lead to artefacts in predictions.
- Demographic & ethnic diversity:
  - Because certain ethnicities and socio-economic groups are under-represented, predictions may be less reliable for those groups. The study acknowledges this.
- Predictive decay over time:
  - Predictive accuracy falls over longer horizons. The average AUC drops from ~0.76 for short-term predictions (within a few years) to lower values for 10- or 20-year horizons.
- Not ready for clinical deployment yet:
  - The study authors explicitly say Delphi-2M is a proof of concept that is not yet fit for routine clinical use. More validation, risk assessment, and regulatory oversight are required.
- Ethical, privacy and governance issues:
  - Risks of misuse by insurers or for discriminatory purposes are often raised when predictive risk tools become powerful and accessible.
  - Ensuring patient data privacy, informed consent, transparency, and correct interpretation of predictions is key.
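For readers unfamiliar with the AUC figure quoted under predictive decay, the short Python sketch below shows what the number measures. AUC is the share of (case, non-case) pairs in which the model assigned the higher risk to the person who went on to develop the disease, so 0.5 is chance and 1.0 is perfect ranking. The risk scores and outcomes here are made up for illustration and are not taken from the study.

```python
def auc(scores, labels):
    """Fraction of (case, non-case) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted risks and observed outcomes (1 = developed the disease).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

# 8 of the 9 case/non-case pairs are ordered correctly (0.35 vs 0.6 is not).
print(round(auc(scores, labels), 3))  # 0.889
```

Read this way, an average AUC of about 0.76 means the model ranks a future case above a non-case in roughly three out of four such pairs.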
Implications & Possible Future Directions:
- Preventive care and early intervention: If clinicians could identify someone’s elevated risk for specific diseases years in advance, they might be able to offer tailored lifestyle changes, screening or preventive treatments well before symptoms or disease onset.
- Healthcare planning & resource allocation: At the population level, the model could help estimate disease burdens in future decades, which could inform public health policy, screening programmes, and resource distribution (e.g. hospitals, preventive programmes).
- Integration of additional data types: The paper suggests that the model might be improved by adding further layers of data, such as genomic data, biomarkers, imaging, prescription records, and data from wearables.
- Synthetic data generation: One promising aspect is that Delphi-2M can generate synthetic health trajectories preserving statistical patterns but without relying on any specific individual’s full record. That may help with privacy-sensitive research.
- Regulation, ethics, and oversight: For such a tool to be used responsibly, standards for transparency, fairness, oversight, and accountability must be developed where they do not already exist. Experts are already raising concerns about bias, about how risk predictions may be used (or misused), and about ensuring equitable benefit.
Latest News & Takeaways:
According to media reports:
- The model performed robustly even when applied to national registries in Denmark, without retraining, which suggests cross-system transferability.
- There is discussion around integrating genetic and protein data in future versions to improve scope and accuracy.
- Commercialisation potential is being considered. Some reports mention that the institutions involved have filed patents around key innovations.
Concluding Thoughts:
Delphi-2M is a landmark step towards more comprehensive, long-term disease risk forecasting using AI. It offers a powerful proof of concept: that data on past health and lifestyle can be leveraged to project risk for many diseases simultaneously, and that the temporal sequencing of medical “events” matters.
However, as with any advanced AI in health, a gap remains between research proof and clinical utility, especially in ensuring fairness, generalisation to under-represented populations, effective regulatory oversight, and protection against misuse.