RiskCardio – A machine learning model that predicts cardiovascular deaths

Cardiovascular disease (CVD) is a major cause of death and disability worldwide. According to a study conducted in 21 countries, 53% of Indians are likely to die from cardiovascular disease before their 70th birthday, compared with 23% of Europeans.

The study found that 14 risk factors account for the largest share of cardiovascular events such as heart attack, heart failure, stroke, and death from heart disease. They include hypertension, high cholesterol, ambient and household air pollution, tobacco use, poor diet, diabetes, excessive alcohol consumption, lack of physical activity, sodium intake, obesity, and depression. A significant number of these risks can be averted with a few lifestyle changes.

In an effort to predict a patient’s risk of cardiovascular death, a team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has created a machine learning model named “RiskCardio,” which uses a patient’s raw electrocardiogram (ECG) signal to produce a score that places the patient into one of several risk categories.

Patients whose RiskCardio score falls in the top quartile, the high-risk group, are nearly seven times more likely to die of cardiovascular causes than patients in the low-risk group. By comparison, patients flagged as high-risk by existing risk metrics are three times more likely to suffer an adverse event than their low-risk counterparts. Existing machine learning models attempt to estimate risk from patient information such as age or weight, whereas RiskCardio uses only the patient’s raw ECG signal, with no additional information.
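
The top-quartile grouping described above can be illustrated with a minimal sketch. It assumes each patient already has a numeric risk score and that "high-risk" simply means a score above the third quartile cut point; the function name `top_quartile_flags` and the example scores are hypothetical, and the article does not say how the researchers computed their cut-offs.

```python
from statistics import quantiles
from typing import List, Sequence


def top_quartile_flags(scores: Sequence[float]) -> List[bool]:
    """Flag each score that lies above the third quartile cut point.

    Hypothetical helper for illustration only; not the researchers' code.
    """
    # quantiles(..., n=4) returns the three cut points between the quartiles.
    _, _, upper_cut = quantiles(scores, n=4)
    return [score > upper_cut for score in scores]


# Made-up scores for eight patients; the two highest are flagged as high-risk.
example_scores = [5, 1, 8, 3, 7, 2, 6, 4]
example_flags = top_quartile_flags(example_scores)
```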

According to the researchers, RiskCardio aims to improve the first step in estimating risk. The system was trained on data from past patients who survived an acute coronary syndrome (ACS), a term that covers a range of conditions in which blood flow to the heart is reduced or blocked. Within the first 15 minutes of a patient experiencing an ACS, the system can estimate whether they will suffer a cardiovascular incident within 30, 60, 90, or 365 days.

To build the model, the team first separated each patient’s ECG signal into sets of adjacent heartbeats and assigned a label to each set based on the patient’s outcome. For instance, heartbeats from patients who died were labeled “risky,” while heartbeats from patients who survived were labeled “normal.” For a new patient, the risk score is the average of the model’s predictions across that patient’s sets of adjacent heartbeats.
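
The three steps in this paragraph (grouping adjacent heartbeats, labeling each group by patient outcome, and averaging per-group predictions) can be sketched roughly as follows. This is an illustration under assumptions, not the researchers' implementation: the number of beats per set (`beats_per_set`), the non-overlapping grouping, and the `predict` callable standing in for the trained model are all hypothetical, since the article does not specify them.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def adjacent_sets(beats: Sequence, beats_per_set: int) -> List[Sequence]:
    """Split a patient's heartbeats into consecutive, non-overlapping sets.

    beats_per_set is an assumed parameter; leftover beats at the end are dropped.
    """
    last_start = len(beats) - beats_per_set
    return [beats[i:i + beats_per_set] for i in range(0, last_start + 1, beats_per_set)]


def label_sets(sets: List[Sequence], patient_died: bool) -> List[Tuple[Sequence, str]]:
    """Give every set from one patient the same outcome-based training label."""
    label = "risky" if patient_died else "normal"
    return [(beat_set, label) for beat_set in sets]


def patient_risk_score(
    beats: Sequence,
    beats_per_set: int,
    predict: Callable[[Sequence], float],
) -> float:
    """Average the per-set predictions to get one score for a new patient."""
    sets = adjacent_sets(beats, beats_per_set)
    if not sets:
        raise ValueError("not enough heartbeats to form a single set")
    return mean(predict(beat_set) for beat_set in sets)
```

Here `predict` is whatever trained model maps one set of heartbeats to a number; averaging across sets means no single noisy stretch of the signal determines the patient's score.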

The team found that of roughly 1,250 post-ACS patients, 28 died of cardiovascular causes within a year. Using the proposed risk score, 19 of those 28 patients were classified as high-risk. The team plans to make the dataset more inclusive of different ages, ethnicities, and genders. It also plans to use the model to examine medical scenarios with poorly labeled or unlabeled data, and to assess how it handles more ambiguous cases.
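
To put those counts in proportion, the short calculation below works out the one-year death rate in the cohort and the share of deaths that the score flagged as high-risk (a ratio usually called sensitivity or recall, although the article does not use that term). The variable names are illustrative, and 1,250 is the article's approximate cohort size, so the first ratio is approximate as well.

```python
# Counts reported in the article; the cohort size is approximate.
total_patients = 1250
cardiovascular_deaths = 28
deaths_flagged_high_risk = 19

# Share of the cohort that died of cardiovascular causes within a year.
death_rate = cardiovascular_deaths / total_patients

# Share of those deaths that the risk score had placed in the high-risk group.
share_flagged = deaths_flagged_high_risk / cardiovascular_deaths

# Deaths that were not in the high-risk group.
deaths_not_flagged = cardiovascular_deaths - deaths_flagged_high_risk

print(f"One-year cardiovascular death rate: {death_rate:.2%}")
print(f"Deaths flagged as high-risk: {share_flagged:.1%}")
print(f"Deaths not flagged: {deaths_not_flagged}")
```

Roughly two in three of the patients who died had been classified as high-risk, against a cohort-wide death rate of a little over 2%.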