Generative AI to spot blood cell abnormalities missed by doctors

In tests, CytoDiffusion could detect abnormal cells linked to leukaemia with far greater sensitivity than existing systems. It also matched or surpassed current state-of-the-art models, even when given far fewer training examples, and quantified its own uncertainty. “When we tested its accuracy, the system was slightly better than humans,” said Deltadahl. “But where it really stood out was in knowing when it was uncertain. Our model would never say it was certain and then be wrong, but that is something that humans sometimes do.”

“We evaluated our method against many of the challenges seen in real-world AI, such as never-before-seen images, images captured by different machines and the degree of uncertainty in the labels,” said co-senior author Professor Michael Roberts, also from Cambridge’s Department of Applied Mathematics and Theoretical Physics. “This framework gives a multi-faceted view of model performance, which we believe will be beneficial to researchers.”

The team also showed that CytoDiffusion could generate synthetic blood cell images indistinguishable from real ones. In a ‘Turing test’ with ten experienced haematologists, the human experts were no better than chance at telling real from AI-generated images. “That really surprised me,” said Deltadahl. “These are people who stare at blood cells all day, and even they couldn’t tell.”

As part of the project, the researchers are releasing what they say is the world’s largest publicly available dataset of peripheral blood smear images: more than half a million in total. “By making this resource open, we hope to empower researchers worldwide to build and test new AI models, democratise access to high-quality medical data, and ultimately contribute to better patient care,” said Deltadahl.

While the results are promising, the researchers say that CytoDiffusion is not a replacement for trained clinicians. Instead, it is designed to support them by rapidly flagging abnormal cases for review and handling more routine ones automatically. “The true value of healthcare AI lies not in approximating human expertise at lower cost, but in enabling greater diagnostic, prognostic, and prescriptive power than either experts or simple statistical models can achieve,” said co-senior author Professor Parashkev Nachev from UCL. “Our work suggests that generative AI will be central to this mission, transforming not only the fidelity of clinical support systems but their insight into the limits of their own knowledge. This ‘metacognitive’ awareness – knowing what one does not know – is critical to clinical decision-making, and here we show machines may be better at it than we are.”

The researchers say further work is needed to make the system faster and to test it across diverse patient populations to ensure fairness and accuracy.

The research was supported in part by the Trinity Challenge, Wellcome, the British Heart Foundation, Cambridge University Hospitals NHS Foundation Trust, Barts Health NHS Trust, the NIHR Cambridge Biomedical Research Centre, the NIHR UCLH Biomedical Research Centre, and NHS Blood and Transplant. The research was conducted by the Imaging working group of the BloodCounts! consortium, which aims to use AI to improve blood diagnostics globally. Simon Deltadahl is a Member of Lucy Cavendish College, Cambridge.