Machine Learning Implementation of a Prediction Model for Heart Failure Using Flask and Heroku
Reporter: Aviva Lev-Ari, PhD, RN
Deploying a Heart Failure Prediction Model Using Flask and Heroku
Guest Author: Osasona Ifeoluwa
She Code Africa Cohort 3 Final project.
We published this article as an Educational Example for:
1. Designing a Prediction Model in Cardiology by using Data Created by other Authors
2. Using a Machine Learning Implementation for computation of the prediction values
3. Development of a Web Application to rest in the Public Domain
4. Usage of the Github repository
5. to be added by Adina Hazan, PhD
6. to be added by Adina Hazan, PhD
7. to be added by Buddhadeb Pradhan, PhD
8. to be added by Buddhadeb Pradhan, PhD
Cardiovascular diseases (which often lead to heart failure) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of global deaths.
Most cardiovascular diseases can be prevented by addressing behavioral risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity, and harmful use of alcohol using population-wide strategies. However, people with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidemia, or already established disease) need early detection and management wherein a machine learning model can be of great help.
This machine learning model could help in predicting mortality caused by heart failure by taking in important features from the dataset and making predictions based on these features.
The dataset consists of 12 variables/features, and 1 output variable/target variable. Let us examine the role of each feature in determining if a person is likely to have heart failure or not:
- Age: the age of the patient
- Anemia: a decrease in red blood cells or hemoglobin
- Creatinine_phosphokinase: the level of creatine kinase (CPK) in the blood. This enzyme is important for muscle function.
- Diabetes: a chronic disease that causes high blood sugar
- Ejection fraction: the percentage of blood leaving the heart at each contraction
- High blood pressure: blood pressure that is higher than normal
- Platelets: tiny blood cells that help your body form clots to stop bleeding
- Serum creatinine: the level of creatinine in the blood
- Serum sodium: the level of sodium in the blood
- Sex: the sex of the patient
- Smoking: whether the patient smokes
- Time: the follow-up period in days, that is, the time until the event or the end of observation
- Death event: the target variable, which records whether the patient died during the follow-up period.
Now that we know the role of each feature, let’s get started.
Step 1: Import Libraries
Step 2: Import the Dataset
The dataset used in building this model was downloaded from Kaggle as a CSV file to my PC.
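As a rough sketch of this step, the CSV file can be read into a pandas DataFrame. The file name below and the helper load_dataset are assumptions made for illustration; use whatever name the Kaggle download has on your machine.

```python
import pandas as pd

# Assumed file name (not stated in the article); adjust to your download.
CSV_PATH = "heart_failure_clinical_records_dataset.csv"


def load_dataset(csv_path=CSV_PATH):
    """Read the heart failure CSV file into a pandas DataFrame."""
    return pd.read_csv(csv_path)


# Example usage (needs the CSV file to be present):
# df = load_dataset()
# print(df.head())   # first five rows
# print(df.shape)    # (number of rows, number of columns)
```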
Step 3: Data Cleaning and EDA
This data was pretty much clean, so I didn’t have to do any more cleaning. However, some important pieces of information can still be explored.
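One way to confirm that a dataset is clean is to count missing values and duplicate rows. The sketch below does this on a tiny made-up DataFrame standing in for the real data; the helper name summarize and the sample values are hypothetical.

```python
import pandas as pd


def summarize(df):
    """Return basic data-quality facts about a DataFrame."""
    return {
        "shape": df.shape,
        "missing_per_column": df.isnull().sum().to_dict(),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Tiny made-up frame standing in for the real dataset.
sample = pd.DataFrame({"age": [60, 45, 70], "DEATH_EVENT": [1, 0, 1]})
summary = summarize(sample)
print(summary)
```

Calling df.describe() and df.info() on the real DataFrame gives a similar quick overview of value ranges and column types.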
Next, I use Matplotlib to visualize the distribution of the target variable (Death_event).
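A minimal sketch of such a plot is shown below. It assumes the target column is named DEATH_EVENT, as in the Kaggle CSV, and uses a made-up sample in place of the real data; the function name is hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_target_distribution(df, target="DEATH_EVENT"):
    """Draw a bar chart of how many rows fall into each target class."""
    counts = df[target].value_counts().sort_index()
    fig, ax = plt.subplots()
    ax.bar(counts.index.astype(str), counts.values)
    ax.set_xlabel(target)
    ax.set_ylabel("Number of patients")
    ax.set_title("Distribution of the target variable")
    return fig, counts


# Made-up sample: three survivors (0) and two deaths (1).
sample = pd.DataFrame({"DEATH_EVENT": [0, 0, 0, 1, 1]})
fig, counts = plot_target_distribution(sample)
```

In a notebook the figure is displayed inline; in a script, fig.savefig("some_name.png") writes it to disk.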
To check for the relationship between all the features and the target variable, I use a heatmap, which gives a graphical representation of the relationship between the variables.
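The sketch below shows one way to draw such a heatmap from the pairwise correlations between columns. It uses Matplotlib's imshow to stay with a single plotting library (seaborn's heatmap function is a common alternative), and the random sample data is made up for illustration only.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_correlation_heatmap(df):
    """Plot the pairwise Pearson correlations of all numeric columns."""
    corr = df.corr()
    fig, ax = plt.subplots()
    image = ax.imshow(corr.values, cmap="coolwarm", vmin=-1, vmax=1)
    ax.set_xticks(range(len(corr.columns)))
    ax.set_xticklabels(corr.columns, rotation=90)
    ax.set_yticks(range(len(corr.columns)))
    ax.set_yticklabels(corr.columns)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    return fig, corr


# Random made-up values standing in for three of the dataset's columns.
rng = np.random.default_rng(0)
sample = pd.DataFrame({
    "age": rng.integers(40, 90, size=50),
    "ejection_fraction": rng.integers(20, 60, size=50),
    "DEATH_EVENT": rng.integers(0, 2, size=50),
})
fig, corr = plot_correlation_heatmap(sample)
```

The row (or column) of the correlation matrix that belongs to the target shows how strongly each feature moves together with it.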
Note: More analysis was done before Feature Selection, and details can be found on the Jupyter Notebook uploaded to Github.
Step 4: Splitting the Train and Test Data
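A minimal sketch of this step using scikit-learn's train_test_split is shown below. The 80/20 split, the random seed, and the stratify option are assumptions, since the article does not state the values used, and the arrays are synthetic stand-ins for the real features and target.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 patients, 12 features, balanced binary target.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 12))
y = np.array([0, 1] * 50)

# Hold out 20% of the rows for testing (assumed ratio); stratify keeps
# the class balance the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```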
Step 5: Data Preprocessing
This brings the data to a state that the model can parse easily. For the purpose of this project, the Standard Scaler is used, which standardizes the features by subtracting the mean and then scaling to unit variance.
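In other words, each value x in a column is replaced by (x - mean) / standard deviation, with the mean and standard deviation taken from the training data. The sketch below illustrates this on a tiny made-up array; fitting the scaler on the training data only and reusing it on the test data is the usual practice, though the article does not spell this out.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up values for two features (for example age and serum sodium).
X_train = np.array([[40.0, 130.0], [60.0, 137.0], [80.0, 144.0]])
X_test = np.array([[50.0, 135.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std, then scale
X_test_scaled = scaler.transform(X_test)        # reuse the training mean/std

print(X_train_scaled.mean(axis=0))  # approximately 0 for every column
print(X_train_scaled.std(axis=0))   # approximately 1 for every column
```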
Step 6: Model Selection
The model used is a support vector machine (SVM), a supervised machine learning algorithm that is well suited to two-class classification problems. Once the SVM has been trained on the preprocessed training data, in which every row is labeled with its class, it is able to classify new, unseen inputs.
The classification report shows an accuracy of 81%.
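The training and evaluation steps can be sketched as follows. The data here is synthetic (two well-separated clusters), so the scores it prints will not match the 81% reported above, and the default SVC settings are an assumption because the article does not state the kernel or hyperparameters used.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic two-class data: two Gaussian clusters with 4 features each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(-2.0, 1.0, size=(100, 4)),
    rng.normal(2.0, 1.0, size=(100, 4)),
])
y = np.array([0] * 100 + [1] * 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scale with statistics learned from the training data only.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = SVC()  # default RBF kernel (assumed, not taken from the article)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(classification_report(y_test, y_pred))
```

The classification report lists precision, recall, and F1-score per class alongside the overall accuracy, which is more informative than accuracy alone when the classes are imbalanced.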
Since this model will be deployed, it is saved to a pickle file (model.pkl) using the pickle module, and this file will appear in your project folder.
Pickle is a Python module that serializes Python objects to files on disk and loads them back into a running Python program.
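A minimal sketch of saving and reloading a model with pickle is shown below. The tiny training set is made up, and the file is written to a temporary folder here only to keep the example free of side effects; in the project the file sits next to app.py.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.svm import SVC

# Tiny made-up training set, just to have a fitted model to save.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
model = SVC().fit(X, y)

model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Write the fitted model to disk ...
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# ... and read it back, as the web app will do at startup.
with open(model_path, "rb") as f:
    restored = pickle.load(f)
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.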
Step 7: Deploying with Flask and Heroku
Deploying a machine learning model means making the model available for end-users to make use of.
Create the Webpage
Here we will create an HTML webpage, styled with CSS, that has text boxes to take in input from users. The HTML file was named index.html and can be found here.
Several templates for creating an HTML/CSS webpage can be found online.
Deploy the model on the webpage using Flask
In deploying this heart failure prediction model into production, a web application framework called Flask is used. Flask makes it easy to write applications, and also gives a variety of choices for developing web applications.
To make use of this web application framework in deploying this model, we install Flask with pip by running the following command: pip install flask
Next, a Flask environment is set up with an API endpoint that wraps the model, receives input from users, and returns the model's output.
After this, a Python file, app.py, is created and the required libraries are imported.
Create the Flask App
Load the pickle
Create an app route to render the HTML template as the home page
Create an API that gets input from the user and computes a predicted value based on the model.
Now, call the run function to start the Flask server.
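Put together, the steps above could look roughly like the sketch below. It is not the author's app.py: the names create_app, main, FEATURES, and PAGE, the /predict route, and the three-feature form are all hypothetical, and an inline template rendered with render_template_string stands in for the index.html file that the article renders as the home page.

```python
import pickle

import numpy as np
from flask import Flask, render_template_string, request

# Hypothetical subset of input fields; the real form has one text box
# per feature the model was trained on, in the same order.
FEATURES = ["age", "ejection_fraction", "serum_creatinine"]

# Inline stand-in for templates/index.html.
PAGE = """
<form action="/predict" method="post">
  {% for name in features %}
  <input type="text" name="{{ name }}" placeholder="{{ name }}" required>
  {% endfor %}
  <button type="submit">Predict</button>
</form>
<p>{{ prediction_text }}</p>
"""


def create_app(model, scaler=None):
    """Build the Flask app around an already loaded model."""
    app = Flask(__name__)

    @app.route("/")
    def home():
        # Render the input form as the home page.
        return render_template_string(PAGE, features=FEATURES, prediction_text="")

    @app.route("/predict", methods=["POST"])
    def predict():
        # Collect the form values in training order as a single-row array.
        values = [float(request.form[name]) for name in FEATURES]
        row = np.array([values])
        if scaler is not None:
            # Assumption: apply the same scaling that was used in training.
            row = scaler.transform(row)
        label = int(model.predict(row)[0])
        text = f"Predicted DEATH_EVENT: {label}"
        return render_template_string(PAGE, features=FEATURES, prediction_text=text)

    return app


def main():
    """Load the pickled model and start the development server."""
    with open("model.pkl", "rb") as f:
        model = pickle.load(f)
    create_app(model).run(debug=True)
```

Calling main() at the bottom of app.py (typically under an if __name__ == "__main__": guard) starts the server. Because the model was trained on standardized features, the fitted scaler would also need to be saved and passed in so that user input is scaled the same way.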
This should return an output that shows that your app is running. Simply copy the URL and paste it into your browser to test the app.
Deploy the Flask App to Heroku
Heroku is a multi-language application platform that allows developers to deploy, and manage their applications. It is flexible and easy to use, offering developers the simplest path to getting their apps to market.
The first step in deploying the Flask app to Heroku is to sign up and log in to Heroku. After that, create a Procfile and a requirements.txt file, which handle the configuration needed to deploy the model to the Heroku server.
The Procfile holds the command Heroku runs to start the app. It begins with web: gunicorn, followed by the module and the Flask object to serve (for example web: gunicorn app:app, assuming the Flask object in app.py is named app).
The requirements file lists all the libraries that have to be installed in the Heroku environment, including gunicorn, since the Procfile calls it.
Next, you commit your code to GitHub and connect GitHub to Heroku.
After you connect, there are 2 ways to deploy your app: automatic deploy or manual deploy. With automatic deploys, a deployment takes place whenever you push a commit to the connected branch of your GitHub repository.
Selecting the branch and clicking on deploy starts the build.
After a successful deployment, the app will be created. Click View and your app should open. A new URL will also be created, which can be shared with users.
Check out my app at https://heart-failure-prediction-app20.herokuapp.com/
The link to the Github Repository can be found here
Dataset Authors: Davide Chicco, Giuseppe Jurman
Link to Dataset
This was my first machine learning Deployment project, and I hope someone finds this useful🙂.
SOURCE