MACHINE LEARNING IN EPIDEMIOLOGY
Thadomal Shahani Engineering College
Epidemiology is the study of the distribution of health conditions (physical and/or mental), the factors
causing or affecting these conditions, and the associated risks. The study is conducted over a defined
population. It aims to identify patterns in the spread of diseases within particular groups and
to arrive at the solutions best suited to the nature of the
disease in question. Epidemiology plays an important role in the formulation of health policies
by assessing the needs of a population. This is followed by delivering interventions to
the population, which could be a drug or vaccine, as in the case of polio prevention, or guidance, as
in the various public awareness campaigns. These activities make epidemiology an interdisciplinary field involving biostatistics, management, technology and policy making.
Study and subsequent prediction are carried out using established techniques, including
biostatistical measures such as proportions, rates and ratios. However, owing to
recent advances in the field of Artificial Intelligence, sustained efforts are being made to
apply technologies like Machine Learning (ML) to assist in the prediction and prevention
of various negative health outcomes. This is done by processing and analysing data gathered from
the sample population.
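As a rough illustration of the three biostatistical measures named above, the following Python sketch computes a proportion, a rate and a ratio. The function names and all counts are hypothetical and chosen only for illustration; they are not taken from any real study.

```python
def proportion(cases, population):
    """Fraction of a population with the condition (numerator is part of the denominator)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population


def rate(new_cases, person_time, per=1000):
    """New cases per `per` units of person-time at risk (for example, person-years)."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return new_cases / person_time * per


def ratio(a, b):
    """Quotient of two separate quantities, such as the risk in two different groups."""
    if b == 0:
        raise ValueError("denominator must be non-zero")
    return a / b


# Hypothetical counts, for illustration only.
prevalence = proportion(50, 2000)          # 50 existing cases among 2000 people
incidence_per_1000 = rate(12, 4000)        # 12 new cases over 4000 person-years
risk_exposed = proportion(30, 500)         # risk among exposed people
risk_unexposed = proportion(20, 1500)      # risk among unexposed people
risk_ratio = ratio(risk_exposed, risk_unexposed)

print(f"prevalence: {prevalence:.3f}")
print(f"incidence per 1000 person-years: {incidence_per_1000:.1f}")
print(f"risk ratio (exposed vs unexposed): {risk_ratio:.1f}")
```

With these made-up numbers, the risk ratio suggests that the exposed group is several times more likely to develop the condition than the unexposed group, which is the kind of comparison such measures are meant to support.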
The capability of ML to solve complex tasks with dynamic parameters and knowledge has
contributed to its popularity in the field of public health. Of late, Data Analytics is being included
as well. In epidemiology, Data Analytics covers the stage in which the collected data
is cleaned and organised, and imbalanced data is normalized. The use of highly precise
computational models to process the data and produce the
required results has been under discussion for some time now. In ML models, the factors that cause or
affect a particular condition become the features, that is, the independent variables that the
model takes as input to return the predicted value or class label.
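The mapping from features to a predicted class label can be sketched with a decision stump, the simplest form of a decision tree (a single split on one feature). This is a minimal sketch under stated assumptions: the features (age in years, a 0/1 known-exposure flag), the labels and the data are all hypothetical, and a real system would use a full library implementation and far more data.

```python
def train_stump(X, y):
    """Return (feature_index, threshold, label_above) for the single split
    that makes the fewest errors on the training data."""
    best = None  # (errors, feature_index, threshold, label_above)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[j] > t else 1 - label_above) != target
                    for row, target in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, label_above)
    if best is None:
        raise ValueError("need at least one feature with two distinct values")
    return best[1:]


def predict(stump, row):
    """Predict a 0/1 class label for one row of features."""
    j, t, label_above = stump
    return label_above if row[j] > t else 1 - label_above


# Hypothetical training data: [age_in_years, known_exposure (0 or 1)].
X = [[25, 0], [30, 1], [35, 0], [40, 1], [60, 0], [65, 1], [70, 0], [75, 1]]
# Hypothetical labels: 1 = developed a severe case, 0 = did not.
y = [0, 0, 0, 0, 1, 1, 1, 1]

stump = train_stump(X, y)
print(stump)                     # (0, 50.0, 1): split on age at 50 years
print(predict(stump, [68, 0]))   # 1
print(predict(stump, [29, 1]))   # 0
```

In this toy data, age alone separates the two classes, so the stump ignores the exposure flag. Real epidemiological data is rarely this clean, which is why deeper trees and the other algorithms are used in practice.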
Until recently, the lack of large-scale data was the main obstacle to testing these models.
In recent times, however, online platforms that claim to help patients with self-assessment
by collecting their medical history and details, global surveys in which people willingly
take part, numerous health and fitness tracking apps, etc. have led to an increased availability of
automatically collected patient history data. This has made it possible for Machine Learning and its
applications to be implemented for intelligent and improved systems of prediction. With the help
of the available data, Machine Learning algorithms like artificial neural networks, support vector
machines, Bayesian neural networks, decision trees and others can be employed to solve the given
problem statement. A notable example is the case study by Carnegie Mellon University’s
Computational Data Science Labs called “Using Machine Learning for Epidemiological
Forecasting”, in which the researchers used machine learning to develop epidemiological forecasting
successfully for diseases like influenza and dengue.
Considering how the COVID-19 pandemic has adversely affected all walks of life globally, the
need for such a system could hardly be clearer: a system that can predict situations
like these beforehand and, in case of outbreaks, help identify and warn the
population most prone to infection. This identification can be done based on the data
available, from which trained models can detect patterns in the spread of the disease and give
meaningful insights into the people or communities most affected by it. This helps to categorize
the patients and provide them with care appropriate to the severity of their condition.
Containment strategies for the spreading disease and guidelines for the at-risk
patients can then follow.
It is a great time to realize the vast potential of Machine Learning in the field of epidemiology and
to implement it in times of crisis and otherwise. The tendency of these models to become more
accurate as more data becomes available reinforces their importance. The agility of such systems in helping to devise relevant solutions makes them a powerful tool that can be a game changer in the way epidemics and their aftermath are dealt with.