Correlations in time series are sensitive to timescale

... and it's something that perhaps we don't look at quite often enough!

What am I talking about? Well, let's imagine that we're interested in the relationship between two signals, x1 and x2. One of the most basic analyses we might do is ask "are they correlated?" But perhaps the correlation depends on the timescale that we focus on. Could two signals be positively correlated at one timescale and negatively correlated at another?

Since I've shown you an example, hopefully you believe that two signals could be positively correlated at one timescale and negatively correlated at another! Here, x1 and x2 are positively correlated on a long timescale but negatively correlated on a short timescale.

Can we characterize this sort of relationship? Yes, and I'll outline one way of characterizing this sort of time-scale dependent correlation below.

How might such a relationship occur? Imagine x1 and x2 depend on multiple factors, some of which they have in common. As an example, perhaps x1 and x2 are both positively affected by variable x3, and x1 is positively affected by x4 while x2 is negatively affected by x4.

Interestingly, in such a situation, x1 and x2 might not even appear to be correlated at all!

Looking at the problem from the opposite perspective, perhaps we've observed some temporal dynamics of two variables x1 and x2. Can we infer the existence of x3 and x4 even if we didn't directly observe them? Perhaps, but only if x3 and x4 have different temporal characteristics!
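To see why, here is a short worked calculation for the simplest case. It assumes, purely for illustration, that the two signals are x1 = x3 + x4 and x2 = x3 - x4 with unit weights and no extra noise:

```latex
\begin{aligned}
\operatorname{Cov}(x_1, x_2)
  &= \operatorname{Cov}(x_3 + x_4,\; x_3 - x_4) \\
  &= \operatorname{Var}(x_3) - \operatorname{Cov}(x_3, x_4) + \operatorname{Cov}(x_4, x_3) - \operatorname{Var}(x_4) \\
  &= \operatorname{Var}(x_3) - \operatorname{Var}(x_4)
\end{aligned}
```

So the overall covariance vanishes whenever x3 and x4 happen to have the same variance, even though each of them ties x1 and x2 together strongly on its own timescale.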

Below, I'll go through an approach motivated by the Allan deviation, which I'll call the Allan covariance/correlation or Allan cross-covariance/cross-correlation* (after writing this, I found similar approaches under the term wavelet cross-correlation, of which this would be something like a Haar wavelet cross-correlation).

First, the Allan variance.

In short, the Allan variance is a measure of the variance of a signal as a function of the timescale at which one looks at that signal. That is, the Allan variance of a signal is not a single number but a function of timescale, taking a particular value at each timescale. This timescale is also called the gate time or averaging time, denoted τ, and I will denote the Allan variance as σ²(τ).

How does one calculate σ²(τ)?

  • First, average the data over consecutive intervals of length τ.
  • Second, take the difference between the means of consecutive intervals.
  • Third, calculate the variance of the resulting signal, and divide by two in order to make it consistent with the definition of variance in the case of white noise.
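The three steps above can be sketched in R as follows. This is a minimal sketch rather than a reference implementation: the name allanVar and its arguments are my own, it uses non-overlapping intervals, and it simply discards any leftover samples at the end of the series.

```r
# Minimal sketch of the three steps (allanVar is a hypothetical helper name)
allanVar = function(x, tau){
  n_blocks = floor(length(x) / tau)                                 # whole intervals that fit
  block_means = colMeans(matrix(x[1:(n_blocks * tau)], nrow=tau))   # step 1: average over intervals
  d = diff(block_means)                                             # step 2: difference consecutive means
  var(d) / 2                                                        # step 3: variance, divided by two
}

set.seed(1)
white = rnorm(1e5)            # unit-variance white noise
av1  = allanVar(white, 1)     # should be close to the ordinary variance
av10 = allanVar(white, 10)    # should be roughly ten times smaller
print(c(av1=av1, av10=av10, plain_var=var(white)))
```

For white noise the result at a gate time of one datapoint matches the ordinary variance, and it falls off in proportion to 1/τ as the gate time grows, which is what the factor of two is there to guarantee.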

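Put more algebraically, we first calculate a filtered version of the signal that includes only the stuff happening at a particular timescale. We'll call this filtered signal y_τ: writing m_k for the mean of the signal over the k-th interval of length τ, y_τ(k) = m_(k+1) - m_k.

Then the Allan variance is simply σ²(τ) = var(y_τ)/2, and the Allan deviation is its square root, σ(τ).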

How do we relate this to correlations on multiple timescales? Well, we will apply the same filter to both signals - first averaging the signal over a time window and then taking the difference between contiguous averages - in order to isolate the parts of that signal at that timescale! Then, we'll calculate the covariance, correlation, or cross-correlation (depending on your interest). Simple!
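As a small sanity check on what that filter does, the sketch below (variable names are mine) shows that averaging over blocks of tau points and differencing consecutive block means gives the same numbers as sliding a kernel of tau positive weights followed by tau negative weights along the signal and keeping every tau-th output. It uses stats::filter with sides=1, so each output depends only on the current and earlier samples.

```r
set.seed(2)
x = rnorm(200)
tau = 10

# block version: average over non-overlapping blocks, then difference
block_diff = diff(colMeans(matrix(x, nrow=tau)))

# sliding version: mean of the latest tau points minus mean of the tau points before them
kern = c(rep(1/tau, tau), rep(-1/tau, tau))
slid = as.numeric(stats::filter(x, kern, sides=1))

# sampling the sliding output at the end of each pair of blocks reproduces the block version
slid_at_blocks = slid[seq(2*tau, length(x), by=tau)]
print(all.equal(block_diff, slid_at_blocks))
```

The sliding version keeps every offset rather than one value per block, which is what makes lagged comparisons between two filtered signals possible.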

Here's an example in R. First, we'll dream up two signals, x1 and x2, that are positively correlated at low frequency (x3) and negatively correlated at high frequency (x4).

t = 1:1000 # time index in datapoints (length assumed; t is not defined in the original)
x3 = sin(0.01*2*pi*t) # one period is 100 datapoints
x4 = sin(0.05*2*pi*t) # one period is 20 datapoints
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

plot(t,x1, type='l'); grid();
plot(t,x2, type='l'); grid();

This produces the signals as shown below.

A simple correlation doesn't detect any sort of relationship between the two, as is clearly evident by plotting the two against each other. While the cross-correlation suggests they may carry information about each other, it's pretty confusing - I am not sure how I would interpret a cross-correlation like this.
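For readers without the figures, here is a quick numerical version of that claim. The sketch regenerates the two signals so that it runs on its own; the series length (1000 points) and the seed are my assumptions.

```r
set.seed(3)
t = 1:1000                                   # assumed series length
x3 = sin(0.01*2*pi*t)                        # slow component
x4 = sin(0.05*2*pi*t)                        # fast component
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

plain_cor = cor(x1, x2)                      # ordinary correlation, datapoint by datapoint
print(plain_cor)
plot(x1, x2); grid()                         # scatter plot of one signal against the other
ccf(x1, x2, lag.max=200)                     # ordinary cross-correlation
```

Because the slow and fast components here have equal variance and enter with opposite signs, their contributions to the plain correlation cancel almost exactly.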

Next, let's calculate the Allan covariance and Allan correlation:

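allanCov = function(x1, x2, fs=1, corFunc=cov){
  # evaluate at log-spaced timescales (gate times, in datapoints)
  taus = unique(round(1.05^(0:(log(length(x1)/5)/log(1.05)))))
  acor = vector('double', length=length(taus))  # one covariance/correlation per timescale
  for (j in seq_along(taus)){
    # trim so the data divide evenly into blocks of taus[j] points
    maxMultipleOfGateTime = taus[j] * floor(length(x1)/taus[j])
    # block means (one per column), then differences of consecutive means
    x1_dec = diff(colMeans(matrix(x1[1:maxMultipleOfGateTime], nrow=taus[j])))
    x2_dec = diff(colMeans(matrix(x2[1:maxMultipleOfGateTime], nrow=taus[j])))
    acor[j] = corFunc(x1_dec, x2_dec)
  }
  data.frame(time=taus/fs, acor=acor)  # timescale in units of 1/fs
}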

acov = allanCov(x1, x2, corFunc=cov)
acor = allanCov(x1, x2, corFunc=cor)
par(mfrow=c(2,1), mar=c(4,4,1,1), mgp=c(2,1,0))
plot(acov$time, acov$acor, log='x', type='o', xlab='timescale', ylab='covariance', main='Allan covariance'); abline(h=0,lty=2)
plot(acor$time, acor$acor, log='x', type='o', xlab='timescale', ylab='correlation',main='Allan correlation'); abline(h=0,lty=2)

This produces the following output:

One can clearly see in both plots that at short timescales, the signals are negatively correlated, and at longer timescales they become positively correlated. I slightly prefer interpreting the Allan covariance over the correlation, since it's not normalized and hence tells us something about both the correlation and the magnitude of the spectral power at a particular frequency, but one could easily argue that the correlation is the better metric too. Up to you.
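To put numbers on the plots, here is a self-contained spot check at one short and one long timescale. The helper allanCorAt, the two gate times (5 and 50 datapoints), the series length, and the seed are all my choices for illustration.

```r
# correlation of two signals after block-averaging over tau points and differencing
# (allanCorAt is a hypothetical helper name)
allanCorAt = function(a, b, tau){
  n = tau * floor(length(a)/tau)             # trim to a whole number of blocks
  a_dec = diff(colMeans(matrix(a[1:n], nrow=tau)))
  b_dec = diff(colMeans(matrix(b[1:n], nrow=tau)))
  cor(a_dec, b_dec)
}

set.seed(4)
t = 1:1000                                   # assumed series length
x3 = sin(0.01*2*pi*t)                        # slow component, period 100
x4 = sin(0.05*2*pi*t)                        # fast component, period 20
x1 = x3 + x4 + rnorm(length(t), 0, 0.2)
x2 = x3 - x4 + rnorm(length(t), 0, 0.2)

short_cor = allanCorAt(x1, x2, 5)            # dominated by the fast component
long_cor  = allanCorAt(x1, x2, 50)           # dominated by the slow component
print(c(short=short_cor, long=long_cor))
```

The short-timescale value should come out negative and the long-timescale value positive, matching the sign change described above.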

What about generalizing to the cross-covariance, where we might allow signals to be lagged? That's easily done as well. Since sinusoids produce infinitely long cross-correlations, let's generate a new example with a more biologically plausible signal. Consider two stochastic processes x3 and x4, both generated by low-pass filtering white noise. x4 will be made by filtering white noise with an 11-point moving average filter (so we might consider it as operating on a 'fast' timescale), and x3 will be filtered using a 101-point filter (and hence will be 'slow'). We'll imagine that our two processes of interest, x1 and x2, are functions of x3 and x4 in the same way as I sketched out above.

We'll also add in a delay in x2's response to x3, so that x2 responds 50 datapoints later than x1 does.

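# generate filtered white noise as inputs to x1 and x2
t = 1:10000 # time index in datapoints (length assumed; t is not defined in the original)
x3_long = na.omit(stats::filter(rnorm(11000,0,1), rep(1/sqrt(101),101))) # slow timescale input, with spare samples for the shift
x4 = na.omit(stats::filter(rnorm(11000,0,1), rep(1/sqrt(11),11)))[t] # fast timescale input
x3 = x3_long[t]
x1 = x3_long[t + 50] + x4 + rnorm(length(t),0,0.2) # x1 sees x3 50 datapoints before x2 does
x2 = x3 - x4 + rnorm(length(t),0,0.2)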

plot(t,x3, type='l', main='slow input (x3)'); grid();
plot(t,x4, type='l', main='fast input (x4)'); grid();
plot(t,x1, type='l', main='x1'); grid();
plot(t,x2, type='l', main='x2'); grid();

Again, there's only a weak correlation between x1 and x2 at the timescale of single datapoints.

And as before, we can clearly see that at short timescales they're negatively correlated, and only very slightly positively correlated at longer timescales.

However, this positive correlation at longer timescales is partly masked by the fact that the slow components of the two signals are offset temporally by 50 datapoints! We should look into a cross-correlation-like metric, in which we calculate the correlation between x1 and x2 over a range of time offsets between the two signals.

allanCrossCorr = function(x1, x2, fs=1, type='covariance'){
  taus = unique(round(1.05^(0:(log(length(x1)/5)/log(1.05)))))
  acc = NULL
  allanKernel = function(k) c(rep(1/k, k), -rep(1/k, k)) # this is the Allan filter kernel
  for (j in seq_along(taus)){
    # sliding (non-decimated) version of the filter; na.omit drops the undefined edges
    x1_filt = na.omit(stats::filter(x1, allanKernel(taus[j])))
    x2_filt = na.omit(stats::filter(x2, allanKernel(taus[j])))
    cc = ccf(x1_filt, x2_filt, type=type, lag.max=100, plot=FALSE)$acf[,1,1]
    acc = rbind(acc, cc, deparse.level=0)  # one row per timescale, one column per lag
  }
  data.frame(time=taus/fs, acc=acc)
}

accov = allanCrossCorr(x1, x2, type='covariance')
accor = allanCrossCorr(x1, x2, type='correlation')

colors = colorRampPalette(c('violet','blue', 'black', 'red', 'pink'))(100)
par(mfrow=c(1, 2))
image(x=accov[,1], y=-floor((ncol(accov)-1)/2):floor((ncol(accov)-1)/2), z=as.matrix(accov[,-1]),
      col=colors, log='x', zlim=c(-1.3,1.3), main='Allan cross-covariance',
      xlab='timescale', ylab='lag')
abline(h=0, col=adjustcolor('white', 0.3), lty=2)
image(x=accor[,1], y=-floor((ncol(accor)-1)/2):floor((ncol(accor)-1)/2), z=as.matrix(accor[,-1]),
      col=colors, log='x', zlim=c(-1, 1), main='Allan cross-correlation',
      xlab='timescale', ylab='lag')
abline(h=0, col=adjustcolor('white', 0.3), lty=2)

In both cases it's abundantly clear that at short timescales, the signals are negatively correlated (with no temporal offset), and at longer timescales, they are highly correlated at an offset of roughly -50 datapoints. Again, I find the covariance plot more revealing than the correlation plot, but this is really personal preference.
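For a numerical version of that reading, the sketch below checks two things: the zero-lag covariance at a short timescale, and the lag of the strongest positive covariance at a long timescale. It is self-contained and only a sketch: the helper lagProfile, the two gate times (5 and 100 datapoints), the series length, and the seed are my assumptions, and the signals are regenerated with the same construction as above.

```r
set.seed(5)
n = 10000                                              # assumed series length
# slow and fast inputs: 101-point and 11-point moving averages of white noise
slow = as.numeric(na.omit(stats::filter(rnorm(n + 200), rep(1/sqrt(101), 101))))
fast = as.numeric(na.omit(stats::filter(rnorm(n + 200), rep(1/sqrt(11), 11))))
idx = 1:n
x1 = slow[idx + 50] + fast[idx] + rnorm(n, 0, 0.2)     # x1 sees the slow input 50 points early
x2 = slow[idx]      - fast[idx] + rnorm(n, 0, 0.2)

allanKernel = function(k) c(rep(1/k, k), -rep(1/k, k))

# cross-covariance against lag after filtering both signals at gate time tau
# (lagProfile is a hypothetical helper name)
lagProfile = function(a, b, tau, lag.max=100){
  a_f = as.numeric(na.omit(stats::filter(a, allanKernel(tau))))
  b_f = as.numeric(na.omit(stats::filter(b, allanKernel(tau))))
  cc = ccf(a_f, b_f, type='covariance', lag.max=lag.max, plot=FALSE)
  data.frame(lag=cc$lag[,1,1], cov=cc$acf[,1,1])
}

short = lagProfile(x1, x2, tau=5)
long  = lagProfile(x1, x2, tau=100)
short_lag0 = short$cov[short$lag == 0]                 # no offset, short timescale
peak_lag   = long$lag[which.max(long$cov)]             # offset of strongest positive covariance
print(c(short_lag0=short_lag0, peak_lag=peak_lag))
```

With this setup I would expect the short-timescale covariance at zero lag to come out negative and the long-timescale peak to sit at a negative lag in the neighbourhood of -50, though the exact peak position will vary from seed to seed.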


The take-home: Using an analogous approach to the Allan variance, we can see correlations at distinct timescales that would otherwise be obscured.



*Since this seems like one of the most straightforward ways to tackle this problem, I can't help but assume that this approach has been suggested elsewhere. If that's the case, it has probably also been given a different name, since I found nothing when I searched for Allan covariance/cross-covariance. If you're aware of prior literature suggesting this exact approach, please let me know! I'm aware of other similar but distinct approaches like detrended cross-correlation analysis.