TimeSeries Part 2: Python Statsmodels Library

Prakhar S
4 min read · Nov 29, 2021

Photo by Frédéric Barriol on Unsplash

In this article about TimeSeries Data, we will discuss Python's statsmodels library and how it can be used to explore and analyze time-series data. The Jupyter notebook for this blog can be found here.

First, let's explore some concepts related to TimeSeries Data:

Trend

The long-term direction of the data. A time series can have an upward, a downward or a horizontal/stationary trend.

Image source: https://towardsdatascience.com/time-series-in-python-part-2-dealing-with-seasonal-data-397a65b74051

Seasonality

A pattern that repeats at a fixed, known interval in the time-series data, such as every week, quarter or year.

Seasonality Source: https://robjhyndman.com/hyndsight/cyclicts/

Cyclicality

Rises and falls in the data that recur without a fixed period.

Stationarity

A time series is said to be stationary if it does not display any trend or seasonality. In the figure, the first series has no upward or downward trend, nor does it display any seasonality. More formally, a series is stationary when its mean, variance and autocovariance do not depend on time.

Hodrick-Prescott Filter

Separates a time series into a trend component and a cyclical component. Find more here.

A smoothing parameter lambda needs to be specified. As a rule of thumb, the value is taken to be 1600 for quarterly data, 6.25 for annual data and 129600 for monthly data.

Example usage:

Load the data into a dataframe:

Since this is monthly data, we use a lambda of 129600. We import ‘hpfilter’ from statsmodels and plot the cyclical and trend components.
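The notebook's dataset is not reproduced here, so the sketch below builds a synthetic monthly series as a stand-in (the data, the column name `value` and the random seed are all assumptions). It shows the general shape of an `hpfilter` call rather than the exact code from the notebook.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.filters.hp_filter import hpfilter

# Synthetic monthly data (an assumption; it stands in for the notebook's dataset).
rng = np.random.default_rng(0)
index = pd.date_range("2010-01-01", periods=120, freq="MS")
t = np.arange(120)
values = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 2, 120)
df = pd.DataFrame({"value": values}, index=index)

# hpfilter returns the cyclical component first and the trend second.
cycle, trend = hpfilter(df["value"], lamb=129600)

df["trend"] = trend
df["cycle"] = cycle

# One panel each for the original series, the trend and the cycle.
df[["value", "trend", "cycle"]].plot(subplots=True, figsize=(10, 6))
```

By construction, the trend and the cyclical component add back up to the original series.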

ETS decomposition

ETS (Error, Trend, Seasonality) decomposition breaks a time series down into a trend component, a seasonal component and an error (residual) component. When performing ETS decomposition, we need to specify whether the model is ‘additive’ or ‘multiplicative’. An ‘additive’ model is appropriate when the trend is roughly linear and the seasonal swings stay about the same size over time. If the trend is non-linear, or the seasonal swings grow or shrink with the level of the series, we choose ‘multiplicative’. In the above trend, we can see that the peaks are becoming higher each year, which suggests a multiplicative model.

In Python, using the statsmodels library, we can perform ETS decomposition as below:

ETS decomposition using Statsmodels.

Holt-Winters Method

The Holt-Winters method applies triple exponential smoothing to the level, trend and seasonal components of a series. It has three smoothing parameters: alpha, beta and gamma. Alpha is the smoothing coefficient for the level, beta is the smoothing coefficient for the trend, and gamma is the smoothing coefficient for the seasonal component. There is also a parameter for the type of seasonality: additive seasonality, where each season changes the series by a roughly constant amount, or multiplicative seasonality, where each season changes the series by a factor. For more details on the concept, refer to this article. Here I will show a simple implementation of the method using the statsmodels library.

Simple Exponential Smoothing

Simple Exponential Smoothing

Double Exponential Smoothing (Holt's Method)

In the above chart, the ‘green’ line for Double Exponential Smoothing follows the original time series very closely.

Triple Exponential Smoothing (Holt-Winters Method)

Triple Exponential Smoothing or Holt-Winters Method

In this case, double exponential smoothing fits the data better than triple exponential smoothing.

ACF and PACF Plots

Statsmodels gives us ready-to-use functions to plot the autocorrelation function (ACF) and the partial autocorrelation function (PACF), which can then be used to choose the orders of an ARIMA model.

ACF and PACF plots

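Augmented Dickey-Fuller (ADF) Test

The Augmented Dickey-Fuller (ADF) test helps us check whether a time series is stationary. Its null hypothesis is that the series has a unit root, that is, the data is non-stationary. The test returns a p-value: if it is below the chosen significance level (commonly 0.05), we reject the null hypothesis and treat the series as stationary; otherwise, we cannot rule out non-stationarity.

ADF test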

Month Plots and Quarter Plots

These plots group the observations by month or by quarter, which gives us a better overview of the seasonality in the time series.

Month Plot
Quarter Plot

In the next article in this series, we will explore some forecasting methods for time-series data.

Thanks for reading. Your comments and suggestions are welcome.
