COVID-19 Impact on Airports → Data Analysis in R

Data

head(data)
dim(data)
# 7247 rows, 11 columns, so head(data) is too wide to read comfortably

Types of variables

str(data)
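
As a rough illustration of separating continuous from categorical variables, the sketch below builds a tiny stand-in data frame and inspects the class of each column. The column names Date, Country and PercentOfBaseline come from the plots in this post; the values, and the name "CountryB", are made up, and whether the real Date column arrives as text is an assumption.

```r
# Toy stand-in for the real data frame (values are invented for illustration).
toy <- data.frame(
  Date = c("2020-03-16", "2020-03-17", "2020-03-18"),
  Country = c("Chile", "Chile", "CountryB"),  # "CountryB" is a placeholder
  PercentOfBaseline = c(64, 58, 71),
  stringsAsFactors = FALSE
)

# Parse the date strings and treat Country as a categorical variable.
toy$Date <- as.Date(toy$Date, format = "%Y-%m-%d")
toy$Country <- factor(toy$Country)

# One class per column: a quick way to tell continuous from categorical.
col_classes <- sapply(toy, class)
print(col_classes)
```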

Important Variables (common sense)

Continuous Variables

Categorical Variables

Observations

This is time-series data.

We don’t know whether the data covers every airport and city in the four countries. Thus, the movement of PercentOfBaseline is only meaningful over time within each of the airports, cities, and countries that are present.
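
One way to see how complete each series is would be to count the observed days per airport and compare that with the span between its first and last date. This is only a sketch on invented data: the column name AirportName and the airports "A" and "B" are assumptions, not taken from the data set.

```r
# Toy data: airport "B" has no record for 2020-03-17 (values are invented).
toy <- data.frame(
  AirportName = c("A", "A", "A", "B", "B"),
  Date = as.Date(c("2020-03-16", "2020-03-17", "2020-03-18",
                   "2020-03-16", "2020-03-18"))
)

# Split the dates by airport, then summarise each group.
by_airport <- split(toy$Date, toy$AirportName)
n_obs <- sapply(by_airport, length)
span_days <- sapply(by_airport, function(d) as.integer(diff(range(d))) + 1L)

# missing_days > 0 means the airport has gaps inside its own date range.
coverage <- data.frame(
  airport = names(by_airport),
  n_obs = as.integer(n_obs),
  span_days = as.integer(span_days),
  missing_days = as.integer(span_days - n_obs)
)
print(coverage)
```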

Exploratory Data Analysis

Univariate

library(ggplot2)
# fill is a fixed colour, not a variable, so it is set outside aes()
ggplot(data, aes(x = PercentOfBaseline)) + geom_density(fill = "red", alpha = 0.2)

Bivariate

Continuous vs Categorical

ggplot(data, aes(x = PercentOfBaseline, fill = Country))+geom_density(alpha = 0.2)

Observations and Plans:

PercentOfBaseline vs Time (country-wise)

ggplot(data, aes(x = Date, y = PercentOfBaseline, group = Country, colour = Country)) + geom_line() + facet_wrap(~ Country)

Understanding

Modeling

Chile

library(dplyr)
data_chile = data %>% filter(Country == "Chile")
chile = data_chile[, c(2, 5)]  # keep columns 2 and 5 (presumably Date and PercentOfBaseline)
head(chile)

Convert into a time-series data structure

library(zoo)
z <- read.zoo(chile, format = "%Y-%m-%d")  # zoo series indexed by date
time(z) <- seq_along(time(z))  # replace the dates with a sequential index 1, 2, 3, ...
ts_chile = as.ts(z)  # convert to a ts object
head(ts_chile)

A second version keeps the date index and removes the NA values instead; this is the ts_chile used below.

library(tseries)  # for na.remove(), which drops NAs from a ts object
z <- read.zoo(chile, format = "%Y-%m-%d")  # zoo series indexed by date
ts_chile = as.ts(z)
ts_chile = na.remove(ts_chile)  # remove NA values from the time series
head(ts_chile)
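
Dropping NA values shortens the series and moves later observations closer together in time. A possible alternative, not used in this post, is to fill the gaps by linear interpolation. The sketch below does this with base R's approx() on invented numbers.

```r
# Toy series with gaps (values are invented for illustration).
x <- c(64, NA, 58, 55, NA, NA, 49)
idx <- seq_along(x)

# Interpolate linearly between the observed points at every index position.
observed <- !is.na(x)
filled <- approx(idx[observed], x[observed], xout = idx)$y

# The filled vector can then be turned into a ts object as before.
ts_filled <- ts(filled)
print(ts_filled)
```

Interpolation assumes the value moved smoothly across the gap, which may not hold for long gaps.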

Plot the time series

plot(ts_chile)

Decomposition

decompose(ts_chile)
decompose() fails here because the series has no seasonal period defined (its frequency is 1), so there is no seasonal component to extract.
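
To illustrate the point about seasonality, the sketch below runs decompose() on a simulated daily series twice: once with the default frequency of 1, and once with frequency = 7. Treating the data as having a weekly cycle is an assumption made for this example only; the simulated numbers are not the Chile data.

```r
set.seed(1)
n <- 70
day <- seq_len(n)
# Simulated series: downward trend + weekly pattern + noise.
x <- 80 - 0.3 * day + 5 * sin(2 * pi * day / 7) + rnorm(n)

# With frequency 1 there is no seasonal period, so decompose() stops with an error.
ts_plain <- ts(x)
res <- tryCatch(decompose(ts_plain), error = function(e) conditionMessage(e))
print(res)

# Declaring a period of 7 observations gives decompose() a cycle to estimate.
ts_weekly <- ts(x, frequency = 7)
parts <- decompose(ts_weekly)
print(round(parts$figure, 2))  # one seasonal effect per position in the cycle
```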

Forecast

Model (ARIMA model)

library(forecast)
model = auto.arima(ts_chile)
model

ACF Plot

acf(model$residuals, main = "Correlogram")

PACF Plot

pacf(model$residuals, main = "Partial Correlogram")

Ljung Box Test

Box.test(model$residuals, type = "Ljung-Box")
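
Box.test() defaults to lag = 1, so the call above only checks the first autocorrelation. A slightly broader check, sketched here on a simulated AR(1) series fitted with stats::arima() rather than auto.arima(), tests several lags at once and uses fitdf to subtract the number of fitted ARMA coefficients. The choice of lag = 10 is arbitrary.

```r
set.seed(42)
# Simulated AR(1) series standing in for the airport data.
x <- arima.sim(model = list(ar = 0.6), n = 200)
fit <- arima(x, order = c(1, 0, 0))

# Test the first 10 residual autocorrelations jointly;
# fitdf = 1 accounts for the single AR coefficient that was estimated.
lb <- Box.test(fit$residuals, lag = 10, type = "Ljung-Box", fitdf = 1)
print(lb)
```

A large p-value means the test found no evidence of leftover autocorrelation in the residuals.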

Normality of Residuals

hist(model$residuals, freq = FALSE)
lines(density(model$residuals))
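
The histogram gives a visual impression only. As a complement, a formal check such as the Shapiro-Wilk test could be run on the residuals. The sketch below uses simulated values in place of model$residuals.

```r
set.seed(7)
# Simulated values standing in for model$residuals.
res <- rnorm(150)

# Shapiro-Wilk test: the null hypothesis is that the sample is normally distributed.
sw <- shapiro.test(res)
print(sw)
```

A small p-value would suggest the residuals deviate from normality.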

Forecast

fc = forecast(model, h = 4)  # h = 4: number of periods ahead to forecast
library(ggplot2)
autoplot(fc)  # plot of the forecast
accuracy(fc)  # performance of the model
fc  # forecast values
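
When accuracy() is given only the forecast object, it reports training-set measures. To get a rough idea of out-of-sample performance, the last few observations can be held out and compared with the forecast. The sketch below does this on a simulated series with stats::arima() and a fixed order, which is a simplification of the auto.arima() workflow above.

```r
set.seed(123)
# Simulated random-walk-like series standing in for ts_chile.
x <- ts(cumsum(rnorm(120)) + 60)
h <- 4
n <- length(x)

# Hold out the last h observations as a test set.
train <- window(x, end = n - h)
test <- window(x, start = n - h + 1)

# Fit on the training part only, then forecast h steps ahead.
fit <- arima(train, order = c(1, 1, 0))
pred <- predict(fit, n.ahead = h)$pred

# Compare the forecasts with the held-out values.
errors <- as.numeric(test) - as.numeric(pred)
rmse <- sqrt(mean(errors^2))
mae <- mean(abs(errors))
print(c(RMSE = rmse, MAE = mae))
```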
