In Identifying Periodic Google Trends, Part 1: Autocorrelation, I described how to calculate the autocorrelation statistic for Google Trends data using matplotlib. One of the hacks I found necessary to get an informative autocorrelogram was to subtract the mean signal value from the original signal before running the calculation.

A more pathological situation occurs in the following case, using the Google Trends data for “run”:

Visual inspection of the original trend data suggests there is annual periodicity (note to self: learn how to add vertical gridlines at required points using matplotlib;-):

However, the autocorrelogram does not detect the periodicity, for two reasons. First, as in the previous cases, the non-zero mean of the original time series means the periodic excursions are attenuated in the autocorrelation calculation compared with excursions from a zero mean. Second, the increasing trend in the data further confuses the year-on-year comparisons used in the autocorrelation calculation.
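To make those two effects concrete, here is a minimal sketch using a synthetic weekly series rather than the actual Google Trends data for "run". The offset, slope and amplitude are made-up values chosen only for illustration, and the `autocorr` helper is my own (hypothetical) function that normalises by the zero-lag value.

```python
import numpy as np

def autocorr(x, lag):
    """Autocorrelation at a positive lag, normalised by the zero-lag value."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Synthetic stand-in for five years of weekly data (values are assumptions):
# a constant offset, a slow upward trend, and a 52-week cycle.
weeks = np.arange(260)
series = 50 + 0.1 * weeks + 10 * np.sin(2 * np.pi * weeks / 52)

# Variant 1: subtract the mean only.
mean_removed = series - series.mean()

# Variant 2: subtract a least-squares straight-line fit.
slope, intercept = np.polyfit(weeks, series, 1)
line_removed = series - (slope * weeks + intercept)

for name, data in [("raw", series),
                   ("mean removed", mean_removed),
                   ("line removed", line_removed)]:
    half_year = autocorr(data, 26)
    full_year = autocorr(data, 52)
    print(f"{name:>12}: lag 26 = {half_year:+.2f}, lag 52 = {full_year:+.2f}")
```

For the raw series the correlation is high at both lags and simply falls away as the lag grows, so the 52-week lag does not stand out. Once the straight-line fit is removed, the half-year lag goes negative and the full-year lag is strongly positive, which is the signature of an annual cycle.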

Googling around *remove trend* and *matplotlib* turned up a *detrend* function that looked like it could help clean the data used for the autocorrelation calculation. In fact, a *detrend* argument is mentioned in the *acorr* autocorrelation function documentation, although no details of the values it can take are provided there. However, searching the rest of that documentation page for *detrend* does turn up valid values: *detrend=mlab.detrend_mean*, *mlab.detrend_linear* and *mlab.detrend_none*, where *import matplotlib.mlab as mlab*.

If we set the detrend processor to *mlab.detrend_mean* we get the following:

And with detrend set to *mlab.detrend_linear* we get:

In each of these latter two cases, we see evidence of the 52 week correlation (i.e. annual periodicity).
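As a rough sketch of how the three *detrend* settings can be compared side by side (this is not the gist code, and the series is a synthetic stand-in with made-up offset, slope and amplitude rather than the real trend data), something like the following should work:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the weekly trend data (values are assumptions).
weeks = np.arange(260)
trend = 50 + 0.1 * weeks + 10 * np.sin(2 * np.pi * weeks / 52)

detrenders = {
    "detrend_none": mlab.detrend_none,      # leave the signal untouched
    "detrend_mean": mlab.detrend_mean,      # subtract the mean
    "detrend_linear": mlab.detrend_linear,  # subtract a best-fit line
}

fig, axes = plt.subplots(len(detrenders), 1, sharex=True, figsize=(8, 8))
results = {}
for ax, (name, fn) in zip(axes, detrenders.items()):
    # acorr returns the lags and the correlation at each lag,
    # plus the artists it drew (ignored here).
    lags, corr, _, _ = ax.acorr(trend, detrend=fn, maxlags=104,
                                usevlines=True, normed=True)
    ax.set_title(name)
    results[name] = dict(zip(lags.tolist(), corr.tolist()))

axes[-1].set_xlabel("lag (weeks)")
fig.tight_layout()
```

The `results` dictionary is only there so that the correlation at a particular lag, for example `results["detrend_linear"][52]`, can be read off without squinting at the plot. Call `fig.savefig(...)` or `plt.show()` to see the three correlograms.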

FWIW, here’s the gist for the modified code.


## Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...
