Covid-19-Impact-WorldWide
MY FIRST PROJECT (EXCEL/SQL/TABLEAU)
Objective
Analyze Covid-19 and its impact globally from 2019 to Sep 2022.
GitHub link for the entire project.
DataSet
The dataset has been divided into different tables named Covid-Deaths and Covid-Vaccinations to work around SQL queries better for analysis.
Tools Used
PGAdmin — PostgreSQL Management Tool for the Analysis
The link below takes you to the SQL queries for the Data Exploration Part.
Select *
From Coviddeaths
— — — — — — — — Selecting some data to work upon
Select Location, dt, total_cases, new_cases, total_deaths, population
From Coviddeaths
Where continent is not null
order by 1,2
— — — — — — — — — looking at total cases against the population
— — — — — — — — — — the percentage of the population out of total
— — — — — -March 2020 was a major breakdown for India, jumped straight from 5 cases to 28 and the rest is the history
Select Location, dt, total_cases,population,
(total_cases::numeric/population::numeric)*100 as PercentagePopulationInfected
From CovidDeaths
Where location like ‘India’
and continent is not null
order by 1,2
— — — Looking at countries with Highest Infection Rate against the population
Select Location,population,MAX(total_cases) as HighestInfectionCount,
Max(total_cases::numeric/population)*100 as percentpopulationInfected
From CovidDeaths
Where location like ‘India’
and continent is not null
Group by location,population
order by PercentPopulationInfected desc
— — — — — — — — — — — — — - Total Cases vs Total Deaths
— — — — — — Shows the likelihood of dying if you contract covid in your country
— — — — — — there is a 1.18 % chance of me dying if I get covid in the month m doing this project.
Select Location, dt, total_cases,total_deaths, (total_deaths::numeric/total_cases::numeric)*100 as DeathPercentageFrom Coviddeaths
Where location like ‘India’
and continent is not null
order by 1,2
— — — — — — showing Countries with Highest Death Count per population
Select Location,
MAX(total_deaths) as totalDeathCount
From Coviddeaths
where continent is not null
group by location
order by totalDeathCount desc
— — — — — — — — — — — — — -Let's break things by continent
— — — — — — — Showing continents with the highest death count per population
Select continent, MAX(cast(Total_deaths as int)) as TotalDeathCount
From Coviddeaths
Where continent is not null
Group by continent
order by TotalDeathCount desc
— — — — — — — — — — — — — — GLOBAL NUMBERS
Select SUM(new_cases) as total_cases, SUM(new_deaths::int) as total_deaths,
SUM(new_deaths::decimal)/SUM(New_Cases)*100 as DeathPercentage
From Coviddeaths
— Where location like ‘India’
where continent is not null
— Group By dt::date
order by 1,2
— — — — — — — — — — - JOINING WITH COVID_VACCINATIONS
SELECT *
FROM covid_vaccinations
— — — — — — — — — — — Total Population vs Vaccinations
— — — — — Shows the Percentage of the Population that has received at least one Covid Vaccine
Select dea.continent, dea.location, dea.dt::date, dea.population, vac.new_vaccinations,
SUM(vac.new_vaccinations::integer) over (PARTITION BY dea.LOCATION ORDER BY dea.location, dea.dt) as RollingPeopleVaccinated
From Coviddeaths dea
Join Covidvaccinations vac
On dea.location = vac.location
and dea.dt = vac.dt
where dea.continent is not null
order by 2,3
— — —Using CTE to perform Calculation on Partition By in the previous query
With PopvsVac (Continent, Location, dt, Population, New_Vaccinations, RollingPeopleVaccinated)
as
(
Select dea.continent, dea.location, dea.dt, dea.population, vac.new_vaccinations
, SUM(vac.new_vaccinations::integer) OVER (Partition by dea.Location Order by dea.location, dea.dt) as RollingPeopleVaccinated
From Coviddeaths dea
Join CovidVaccinations vac
On dea.location = vac.location
and dea.dt = vac.dt
where dea.continent is not null
— order by 2,3
)
Select *, (RollingPeopleVaccinated::decimal/Population)*100
From PopvsVac
— — — — — -Using Temp Table to perform Calculation on Partition By in the previous query — the same thing as above
Create temporary table PercentPopulationVaccinated
(
Continent varchar(255),
Location varchar(255),
Dt date,
Population numeric,
New_vaccinations numeric,
RollingPeopleVaccinated numeric
);Insert into PercentPopulationVaccinated
Select dea.continent, dea.location, dea.dt, dea.population, vac.new_vaccinations
, SUM(vac.new_vaccinations::integer) OVER (Partition by dea.Location Order by dea.location, dea.dt) as RollingPeopleVaccinated
From Coviddeaths dea
Join CovidVaccinations vac
On dea.location = vac.location
and dea.dt = vac.dt
where dea.continent is not nullSelect *, (RollingPeopleVaccinated::decimal/Population)*100
From PercentPopulationVaccinated
— — — — — — — — — — Creating View to store data for later visualizations
Create View PercentPopulationVaccinated as
Select dea.continent, dea.location, dea.dt::date, dea.population, vac.new_vaccinations,
SUM(vac.new_vaccinations::integer) over (PARTITION BY dea.LOCATION ORDER BY dea.location, dea.dt) as RollingPeopleVaccinated
From Coviddeaths dea
Join Covidvaccinations vac
On dea.location = vac.location
and dea.dt = vac.dt
where dea.continent is not null
order by 2,3select * from PercentPopulationVaccinated
— — — — — — — — — creating another view
create view GlobalNumbers as
Select SUM(new_cases) as total_cases, SUM(new_deaths::int) as total_deaths,
SUM(new_deaths::decimal)/SUM(New_Cases)*100 as DeathPercentage
From Coviddeaths
— Where location like ‘%states%’
where continent is not null
— Group By dt::date
order by 1,2
Tableau for Data Visualization
To have the data for creating a dashboard. I have worked upon a selected SQL and shared its link below
/*
Queries used for Tableau Dashboard
*/
- — — — — — — — — — — -1. Global Numbers
Select SUM(new_cases) as total_cases,
SUM(new_deaths::int) as total_deaths,
SUM(new_deaths::decimal)/SUM(New_Cases)*100 as DeathPercentage
From Coviddeaths
--Where location like 'India'
where continent is not null
--Group By dt
order by 1,2
- — — — — — — — — — — — 2.
->We take these out as they are not included in the above queries and want to stay consistent
->European Union is part of Europe
->I have some additional information regarding incomes so removing it.
Select location,
SUM(cast(new_deaths as int)) as TotalDeathCount
From CovidDeaths
--Where location like '%india%'
Where continent is null
and location not in ('World', 'European Union', 'International','High income','Upper middle income','Lower middle income','Low income')
Group by location
order by TotalDeathCount desc
- — — — — — — — — — -3. InfectedPercentagge against population grouping as per location and population
Select Location,population,
MAX(total_cases) as HighestInfectionCount,
Max(total_cases/population)*100 as percentpopulationInfected
From CovidDeaths
--Where location like 'India'
--WHERE continent is not null
Group by location,population
order by PercentPopulationInfected desc
-- — — — — — — — — -4. InfectedPercentagge against population grouping as per location, population,dt
Select Location, Population,dt,
MAX(total_cases) as HighestInfectionCount,
Max((total_cases::decimal/population))*100 as PercentPopulationInfected
From CovidDeaths
Where location not like 'International'
Group by Location, Population, dt
order by PercentPopulationInfected desc
Below is the attached image of my Dashboard.
Link to the Tableau DashBoard.
Personal Note
So this was my very first project. Yes very intimidating initially I struggled with very basic things like taking 5–6 hours just to import my data files to SQL, understanding pgadminn interface, then of course the main part was analysis, I have felt like whatever I have learned so far has come into practice I experienced how SQL or tableau can make such huge data useful. I can write so much in this section somehow controlling.
Big Big Thanks to Alex the Analyst for this wonderful guided project. This was not possible without his guidance.
Learning is never-ending. I look forward to doing more and more projects and enhancing my skills. Would be happy to have references for more projects. Also, I am a huge fan of constructive feedback, feel free to drop any sort of suggestions with respect to my learning journey or even framing my medium stories to make them better.
The highest in me recognizes the highest in you✊