Skip to content

kilofrakh/Covid-19-Death-Rate-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation


COVID-19 Data Analysis and Visualization

This Python script analyzes and visualizes COVID-19 data from the Worldometer dataset between April 18 and May 18, 2020. It performs calculations like case fatality rates (CFR) and tests-per-positive-case ratios, then visualizes this data using various plots.

Table of Contents

Project Overview

The code is designed to:

  • Filter and extract COVID-19 statistics for specific countries or dates.
  • Calculate important metrics such as the case fatality ratio and tests per positive case.
  • Plot histograms and scatter plots to visualize trends in COVID-19 death rates and testing quality across countries.
  • Highlight specific countries in the visualizations.

Dataset

The dataset used in this script is a CSV file named worldometer_snapshots_April18_to_May18.csv. The file contains:

  • COVID-19 data for multiple countries.
  • Daily snapshots of case numbers, deaths, tests conducted, and populations.

Setup

Prerequisites

To run this project, you need to install the following Python libraries:

pip install numpy pandas matplotlib

File Structure

  • worldometer_snapshots_April18_to_May18.csv: CSV file containing COVID-19 data.
  • covid_analysis.py: The main script for analyzing and visualizing the data.
  • README.md: This file.

How to Run the Code

  1. Ensure that you have the dataset (worldometer_snapshots_April18_to_May18.csv) in the same directory as the script.
  2. Run the Python script:
python covid_analysis.py

The script will display several visualizations, including histograms and scatter plots, in pop-up windows.

Code Explanation

Imports

  • numpy, pandas: Used for data manipulation and numerical calculations.
  • matplotlib: Used for plotting graphs.
  • datetime: Used for handling date formatting and filtering by date.

Data Loading and Preprocessing

  • The dataset is loaded using pandas.read_csv.
  • The script filters data for a specific country (USA) and a specific date (18/05/2020).

Case Fatality Ratio Calculation

  • The case fatality ratio (CFR) is computed as:

    Case Fatality Ratio = Total Deaths / Total Cases

Visualizations

  • Histograms: Plots showing the distribution of death rates across countries.

  • Scatter Plot: Shows the relationship between testing quality (number of tests per positive case) and death rates, with marker size representing the population and color representing the total number of deaths.

    Specific countries are highlighted on the scatter plot for better context.

Key Variables

  • min_number_of_cases: Used to filter out countries with fewer than 1000 cases.
  • x_axis_limit: Used to limit the x-axis in the scatter plot for better readability.

Conclusion

The script also calculates the overall death rate for countries with "good testing" (defined as having more than 50 tests per positive case).

Visualizations

  1. Histogram of Death Rates: Distribution of death rates (case fatality ratio) for all countries.
  2. Histogram of Greatly Affected Countries: Same as above, but limited to countries with more than 1000 cases.
  3. Scatter Plot: Visualizes the relationship between testing quality and death rate, highlighting certain countries.

License

This project is for educational purposes and is open for modification and reuse. No license specified.


This README.md file provides an overview of the project, how to set it up, and explains what the code does step-by-step. Feel free to adjust it as needed!

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages