Welcome, Ninjas. This blog looks at the Augmented Dickey-Fuller (ADF) test, a commonly used test for checking whether a time series is stationary. Before getting to the test itself, let us start with what the stationarity of a time series means.
When a time series is stationary, its statistical properties, such as mean, variance, and autocorrelation, are constant over time. A stationary time series is generally easier to model and forecast, because patterns estimated from the past continue to hold in the future. Stationarity also means the series has no trend or seasonal component.
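To make the idea concrete, here is a small sketch that compares the mean of the first and second halves of two synthetic series. The series, the seed, and the helper name half_means are assumptions made for illustration; they are not data from this blog.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

# Stationary series: white noise with constant mean 0 and variance 1.
stationary = rng.normal(loc=0.0, scale=1.0, size=n)

# Non-stationary series: the same kind of noise plus an upward linear trend.
trending = 0.01 * t + rng.normal(loc=0.0, scale=1.0, size=n)


def half_means(series):
    """Return the mean of the first half and of the second half of a series."""
    middle = len(series) // 2
    return series[:middle].mean(), series[middle:].mean()


stationary_first, stationary_second = half_means(stationary)
trending_first, trending_second = half_means(trending)

print("stationary:", round(stationary_first, 2), round(stationary_second, 2))
print("trending:  ", round(trending_first, 2), round(trending_second, 2))
```

For the white noise series the two halves have roughly the same mean, while for the trending series the mean shifts noticeably over time, which is exactly what stationarity rules out.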
What is the Augmented Dickey-Fuller Test?
Let's begin with the primary concern of our blog, the Augmented Dickey-Fuller Test. The ADF test is one of the unit root tests, a family of tests used to check the stationarity of a time series.
A unit root is one characteristic that makes a time series non-stationary. Consider the following equation:
$$Y_t = \alpha Y_{t-1} + \beta X_e + \epsilon$$
Where,
Yt - the value of the time series at time 't'
Xe - an exogenous variable
epsilon - the error term
When alpha = 1 in the above equation, the series is said to have a unit root.
The presence of a unit root therefore means that the time series is non-stationary, and the number of unit roots equals the number of differencing operations needed to make the series stationary.
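The link between a unit root and differencing can be sketched with simulated data. The code below assumes a simplified version of the model with no exogenous variable, and the helper names simulate_ar1 and estimate_alpha are hypothetical. It estimates alpha by least squares for a series with alpha = 1, for a series with alpha = 0.5, and for the first difference of the unit root series.

```python
import numpy as np


def simulate_ar1(alpha, n, rng):
    """Simulate Y_t = alpha * Y_{t-1} + noise, starting from Y_0 = 0."""
    noise = rng.normal(size=n)
    y = np.zeros(n)
    for i in range(1, n):
        y[i] = alpha * y[i - 1] + noise[i]
    return y


def estimate_alpha(y):
    """Least-squares estimate of alpha from regressing Y_t on Y_{t-1} (no intercept)."""
    previous, current = y[:-1], y[1:]
    return float(previous @ current / (previous @ previous))


rng = np.random.default_rng(1)
unit_root_series = simulate_ar1(alpha=1.0, n=1000, rng=rng)
stationary_series = simulate_ar1(alpha=0.5, n=1000, rng=rng)

alpha_unit_root = estimate_alpha(unit_root_series)
alpha_stationary = estimate_alpha(stationary_series)

# One round of differencing removes a single unit root:
# for alpha = 1, Y_t - Y_{t-1} is just the noise term.
differenced = np.diff(unit_root_series)
alpha_after_differencing = estimate_alpha(differenced)

print("alpha estimate, unit root series: ", round(alpha_unit_root, 3))
print("alpha estimate, stationary series:", round(alpha_stationary, 3))
print("alpha estimate after differencing:", round(alpha_after_differencing, 3))
```

The estimate for the unit root series sits close to 1, and after one differencing step it drops to roughly 0, which is why a series with one unit root needs one differencing operation.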
Algorithm behind Augmented Dickey-Fuller Test
We test the following hypotheses in the ADF test.
Null Hypothesis (H0): The time series has a unit root, i.e. it is non-stationary.
Alternative Hypothesis (H1): The time series is stationary.
When we implement the ADF test in Python or R, we get the following outputs:
The p-value
The value of the test statistic
Number of lags considered for the test
The critical value cutoffs.
The hypothesis test is decided by the p-value returned as output. If the p-value is less than the chosen significance level (e.g., 0.05), the null hypothesis is rejected and the time series is treated as stationary.
On the other hand, if the p-value is greater than the significance level, we fail to reject the null hypothesis. The test then gives no evidence of stationarity, so the time series is treated as non-stationary.
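The decision rule can be written as a small helper. This is only a sketch: the function name interpret_adf is hypothetical, and the numbers passed in are made up for illustration. The critical values used are the approximate large-sample cutoffs for an ADF regression with a constant only; a real run should use the cutoffs returned by the test itself.

```python
def interpret_adf(p_value, test_statistic, critical_values, level="5%"):
    """Apply the ADF decision rule using both the p-value and a critical value."""
    threshold = float(level.rstrip("%")) / 100
    return {
        # Reject H0 (non-stationary) when the p-value is below the level.
        "reject_by_p_value": p_value < threshold,
        # The ADF test is left-tailed: reject H0 when the statistic is
        # more negative than the critical value.
        "reject_by_critical_value": test_statistic < critical_values[level],
    }


# Approximate asymptotic cutoffs (constant-only case), for illustration only.
example_critical_values = {"1%": -3.43, "5%": -2.86, "10%": -2.57}

# Hypothetical outputs of two ADF runs.
weak_evidence = interpret_adf(0.44, -1.68, example_critical_values)
strong_evidence = interpret_adf(0.001, -4.5, example_critical_values)

print("p = 0.44: ", weak_evidence)
print("p = 0.001:", strong_evidence)
```

In the first case neither check rejects the null hypothesis, so the series is treated as non-stationary. In the second case both checks reject it, so the series is treated as stationary.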
Augmented Dickey-Fuller Test for Time Series Analysis
R
Now, let's get started with implementing the Augmented Dickey-Fuller Test. It is very easy to implement in the R language.
Visualize the data. Before running the ADF test, let's plot the time series (stored here in the variable tsd) and look at it.
plot(tsd, type='l')
Now, we will perform the ADF test using the adf.test() function provided by the tseries library in R. The tseries package must be installed and loaded first; it depends on the quadprog package, so install quadprog as well if it is not already present.
library(tseries)
adf.test(tsd)
Since the p-value (0.4369) is greater than 0.05, we fail to reject the null hypothesis. The test therefore gives no evidence of stationarity, and the time series is treated as non-stationary.
PYTHON
Let's see the implementation of the ADF test in Python.
First, we need to import the required libraries:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
# Jupyter/IPython magic for inline plots; drop this line in a plain Python script
%matplotlib inline
The presence of seasonality & trends is what makes a time series non-stationary.
Based on ADF test, how can we conclude that the time series is stationary?
The ADF test is a hypothesis test. If we fail to reject the null hypothesis, the time series is treated as non-stationary; if the null hypothesis is rejected, the time series is considered stationary.
What other test can be used to check the stationarity of time series data?
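The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test, often grouped with the unit root tests, can also be used to check the stationarity of a time series around a deterministic trend. Unlike the ADF test, its null hypothesis is that the series is stationary, so a small p-value points to non-stationarity.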
Conclusion
We hope this blog successfully helped you understand the Augmented Dickey-Fuller Test concept and how it can be implemented easily in R and Python.