COVID-19 took much of the world by surprise when it started spreading its deadly fingers around the globe in early 2020, leaving countries scrambling to respond and almost 7 million dead in its wake.
Governments, health care facilities, and personal protective equipment (PPE) makers were caught unawares, exacerbating a global outbreak that would have been much worse if it hadn’t been for the scientists who were able to turn around multiple vaccines in months rather than years.
Experts at Scripps Research and Northwestern University are now saying that it will also be science – this time in the form of AI and machine learning – that will ensure the next pandemic will not be able to ambush us like the last one did.
The scientists have developed an AI application that they are calling an early warning system for viral pandemics, one that could give scientists and medical experts a heads-up about new variants that could become a problem, and a much longer lead time to address a coming health crisis. They’ll see a pandemic coming sooner than they did with COVID-19.
“There are rules of pandemic virus evolution that we have not understood but can be discovered, and used in an actionable sense by private and public health organizations, through this unprecedented machine-learning approach,” said Will Balch, professor in the Department of Molecular Medicine at Scripps.
Rob Enderle, principal analyst with The Enderle Group, called the research “very significant,” especially for a rapidly evolving world.
“We live in a time when we have a lot of opportunities for new viruses,” Enderle told Techstrong.ai. “Climate change is forcing animals that normally don’t contact humans into close contact, the melting ice is exposing viruses that died out thousands of years ago but still potentially remain viable, and our research in improperly secured labs is creating opportunities for scientifically created viruses to escape.”
Given all that, “we need a far more aggressive way to identify and analyze the life of these viruses so that remedies and cures can be developed far more rapidly in order to minimize the spread and, if the virus is deadly, lower the related death rates,” he said.
Using the Past to Predict the Future
The scientists developed a system that was trained on a massive amount of publicly available data about SARS-CoV-2 variants – specifically the genetic sequences of the variants found in infected people around the world and the frequencies of the variants – and the mortality rates from COVID-19 itself. SARS-CoV-2 is the coronavirus that causes COVID-19.
They then applied a method called Gaussian process-based spatial covariance to the data. The technique essentially crunches huge amounts of existing data – including the relationships between them – to make accurate predictions for the future.
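For readers who want a feel for what a Gaussian process does, here is a minimal sketch in Python. It is plain Gaussian process regression on a single made-up series, not the researchers' spatial covariance method, and every name and number in it (`rbf_kernel`, `gp_predict`, the weekly frequencies, the kernel settings) is an assumption chosen for illustration. The point it shows is that the model uses the relationships between known data points to estimate a value at a new point, along with how uncertain that estimate is.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, x_new, noise=1e-2, length_scale=2.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    k_nt = rbf_kernel(x_new, x_train, length_scale)
    k_nn = rbf_kernel(x_new, x_new, length_scale)

    # Cholesky factorization gives a numerically stable solve of k_tt @ alpha = y.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_nt @ alpha

    # The predictive variance shrinks near observed points and grows away from them.
    v = np.linalg.solve(chol, k_nt.T)
    var = np.diag(k_nn) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


# Synthetic example data (NOT from the study): share of sequenced cases
# carrying a hypothetical mutation, observed over ten weeks.
x_train = np.arange(10.0)
y_train = np.array([0.01, 0.02, 0.03, 0.05, 0.08, 0.13, 0.20, 0.30, 0.42, 0.55])

# Estimate the share midway between two observed weeks and three weeks past the data.
x_new = np.array([4.5, 12.0])
mean, std = gp_predict(x_train, y_train, x_new)
print(mean, std)
```

In this toy setup the uncertainty for week 12 comes out larger than for week 4.5, because the later point sits beyond anything the model has seen.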
In this case, the researchers used machine-learning strategies on the existing data from the pandemic to track genetic changes in the SARS-CoV-2 variants. Those changes, they said, trended toward faster viral spread and lower mortality as the virus adapted to measures such as vaccines, rising natural immunity, mask wearing, and lockdowns, and as the variants competed among themselves.
“We could see key gene variants appearing and becoming more prevalent as the mortality rate also changed, and all this was happening weeks before the VOCs [variants of concern] containing these variants were officially designated by the WHO [World Health Organization],” Balch said.
Creating EWAD
The end result is that the SARS-CoV-2 tracking system – or early warning anomaly detection (EWAD) system – can detect anomalies for gene variants tied to significant changes in viral spread and mortality rates. Essentially, by understanding the “rules of pandemic virus evolution” and applying them to the variants, the system can predict a possible pandemic before groups like the WHO do.
“EWAD can anticipate changes in the pattern of performance of spread and pathology weeks in advance, identifying signatures destined to become VOCs,” the scientists wrote in their paper. “GP [Gaussian process]-based analyses of variation across entire viral genomes can be used to monitor micro and macro features responsible for host-pathogen balance. The versatility of GP-based SCV defines the starting point for understanding nature’s evolutionary path to complexity through natural selection.”
Balch said the project illustrated the importance of including not only a few of the prominent variants – like Alpha, Delta, and Omicron – but also the “tens of thousands of other undesignated variants, which we call the ‘variant dark matter.’”
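The general idea of flagging an anomaly in variant data can be sketched in a few lines of Python. This is not EWAD itself, whose internals the researchers describe in their paper. It is a much simpler stand-in, a trailing-window z-score, and the function name `flag_anomalies`, the window size, the threshold, and the weekly numbers are all hypothetical. It marks any week in which a variant's growth departs sharply from its own recent history.

```python
import numpy as np


def flag_anomalies(series, window=4, threshold=3.0):
    """Return indices whose value lies more than `threshold` standard
    deviations from the mean of the preceding `window` values."""
    series = np.asarray(series, dtype=float)
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        spread = baseline.std()
        if spread == 0:
            # A perfectly flat baseline gives no scale to compare against.
            continue
        z_score = (series[i] - baseline.mean()) / spread
        if abs(z_score) > threshold:
            flagged.append(i)
    return flagged


# Synthetic weekly growth in the share of one hypothetical variant
# (made-up numbers, not data from the study).
weekly_growth = [0.01, 0.02, 0.01, 0.02, 0.01, 0.02, 0.15, 0.16]
print(flag_anomalies(weekly_growth))
```

A simple rule like this only reacts once a jump has already happened in a single series. The appeal of the approach described by the researchers is that it looks across whole viral genomes together with spread and mortality data.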
Apply This to Other Use Cases
He also noted that while the computational method was used in this case with the COVID-19 pandemic, it can be applied to any genetic testing effort using publicly available data.
Enderle said experts have been applying AI methods to scientific and healthcare challenges for years. He noted that IBM’s Watson AI platform started out in medical research and, on edge cases that involve unusual diseases, Watson could cut the time to diagnosis from years to hours.
“And this was two decades ago,” he said. “This technology has advanced massively since then and AI is a darling for hospital administrators now … so it tells me that people are starting to truly explore the power of these new AI platforms.”