Just doing some quick house-cleaning on some previous posts. When I ran regressions plotting the money supply against the CPI I was told that I should average them over five-year periods because this would supposedly iron out ‘volatility’ and show long-run trends. I responded in a post saying that I was dubious of this practice because I did not believe it would show long-run trends at all. Rather, I thought it was being used to screw with the data. Here is a concrete example, drawing on the M3 and the CPI, showing why it is wise to be on your guard against this sort of aggregation.
Here is the M3 money supply mapped against the CPI from 1960 until 2005 (I previously thought this data was available only from 1981 but I dug a bit more in FRED and found this longer dataset):
Looks good, right? Nice upward-sloping trend, right? Not really. Check that R-squared. It’s quite low. What might that be hiding? Well, something rather serious actually. You see, the relationship between these two variables from 1960 to 1985 was VERY different from the one that held from 1985 to 2005, and the 1960-1985 results skew the pooled results. Here are the 1960-1985 results:
Wow! Now that is a tight relationship! Really statistically significant! If we accept this methodology of averaging to show long-run trends then this might have some purchase. But let’s maintain some healthy skepticism rather than dancing for joy just yet. After all, we wouldn’t just want to find what we wanted to find would we?
You see, in the period 1985-2005 this relationship went completely in the other direction. Here is the data plotted from 1985-2005.
What does all this show? It certainly shows that between 1960 and 1985 there were positive correlations between the M3 money supply and the CPI in the US when we average both of those variables over 5 years. But it also shows that between 1985 and 2005 that relationship turns negative — and quite strongly so**.
The typical complaint is that the latter two regressions do not have enough observations. This is ironic coming from the same people who told us to run the so-called long-run averages in the first place, since averaging significantly reduces the number of observations. But it is also highly misleading. Clearly we have two entirely different historical time periods here. When we lump them together, one time period dominates the other and we do not get a nuanced, useful picture.
While I’d typically agree that you want as many observations as possible you can’t just throw them in the proverbial statistical blender. This can allow certain periods — or even certain datapoints — to dominate others and this can lead to a fake homogeneity in the dataset. This is one of the many dangers of averaging and aggregating.
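The blender point can be sketched in a few lines of plain Python. The numbers below are invented for illustration, not the actual M3/CPI series: two sub-periods, each with a perfect but opposite-signed fit, pool into a regression with a zero slope and a zero R-squared.

```python
# Invented illustrative numbers, not the real M3/CPI data: two
# sub-periods whose individual fits are perfect but opposite-signed.

def ols(xs, ys):
    """Simple OLS of y on x, returning (slope, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return slope, 1 - ss_res / ss_tot

x = [0, 1, 2, 3, 4]
y_early = [2 * v for v in x]         # 'regime one': tight positive relationship
y_late = [5 - 2 * v for v in x]      # 'regime two': tight negative relationship

slope_early, r2_early = ols(x, y_early)   # slope +2, R-squared 1.0
slope_late, r2_late = ols(x, y_late)      # slope -2, R-squared 1.0
slope_pooled, r2_pooled = ols(x + x, y_early + y_late)  # slope 0, R-squared 0.0
```

The pooled fit is not a compromise between the two regimes; it is a statistical artifact that describes neither of them.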
It is a bit like if we had two countries, one with short people and the other with tall people. If we pool the two populations and compute a single average, we can say that the two countries have, on average, medium-height people. But this is highly misleading. In fact, the relevant information is that one country has short people and the other tall people. Ditto in the above. The relevant observation for anyone really interested in the data, and not just interested in proving their pet thesis, is that the monetarist correlation held in one period and ceased to hold in the second.
What can we draw from this exercise? A number of things actually. First of all, averaging statistics or taking big long-run averages can produce misleading results. Imagine we had gone with the first chart. Well, we would predict that in the next 5 years the 5 year average of the M3 would be correlated with the 5 year average of the CPI. But that was only due to a spurious aggregation. If we were forced to place a bet we would be on far better ground with the last chart plotting a negative relationship in the years running up to the most recent observation.
Seriously, think about this. Suppose I put a gun to your head and said that you had to bet your life on a horse race, and that you were only allowed to bet on a single horse over and over again until he won or lost (and, for the sake of argument, you could not change your bet after the exercise had started). Then suppose I gave you data showing that while he typically lost races in the first leg of his career, which lasted two and a half years, he typically won them in the second leg, which lasted two years. Would you go and calculate an average, show that he lost more than he won and stake your life on him losing? If you did you’d be a fool. Clearly something had likely changed in the physiology or the training of the horse in the meantime. If you had the time you might investigate what this change was. That would certainly be the research question that the data raises.
This ties into the second question: what do these results mean? They indicate that the relationship between M3 and CPI is non-homogeneous over time. It changes. And, what’s more, it changes drastically. In one twenty-five-year period it is positive. In the twenty-year period that follows it is negative. This leads to a provisional conclusion: heterogeneity may be inherent in most economic statistics. But that is scary to the mainstream because they believe in timeless economic laws. In order to generate these laws they engage in methods like the spurious aggregation we see above.
The truth is, of course, that economic data is non-ergodic and the future does not reflect the past. But that implies that economists cannot formulate timeless laws. And that makes mainstream economists, who are typically inflexible and against true empirics in this regard, very sad indeed. Poor mainstream economists.
** The positive relationship begins to break down in the post-1985 period. But it really comes unwound in the post-1990 period. If we run regressions from 1960-1990 and from 1990-2005 we get interesting results which further buttress our findings. The 1960-1990 relationship remains positive but the R-squared drops from 0.81 to 0.76. Meanwhile, the 1990-2005 negative correlation increases dramatically and becomes extraordinarily well fitted, with an R-squared of 0.97. Almost a perfect fit. This, of course, to some extent reflects the smaller number of observations being used. But it still shows the dynamics of the change in the underlying nature of the data.
Of course, this is all rather silly from another perspective. If we just plot the two series together on a simple line graph we can clearly see when the correlation changed by eyeballing the chart. But then we will be accused by our mainstream friends of being ‘primitive’ and so forth. This ignores the fact that people who work with data for a living, who actually have to get things right and are not allowed the plush luxuries of academic speculation, use such eyeballing techniques all the time to great effect. Undertaking an exercise such as the above with open eyes shows you exactly why this is.
As always, thanks for the post! I see where you are coming from regarding averaging time series data. I am far from an expert on time series analysis but I was wondering what your thoughts are regarding differencing in order to achieve a stationary time series? Would you say that if the time series is non-stationary we should not attempt to make it stationary in order to preserve the useful properties of various statistical techniques?
Also, again forgive me for my ignorance on the subject, but I came across something interesting regarding lags. You had mentioned that you find the use of lags (at least how they are chosen) to be very arbitrary. I read the following in Godley and Lavoie’s Monetary Economics: “The short-run determination of macroeconomic variables is one among several steps of a dynamic sequence. These intrinsic dynamics must be distinguished from lag dynamics, which are involved with the passage of time. These lags insure that causes precede effects, so that we keep the time-sequence right and understand the processes at work. Lags, even small ones, are required to avoid telescoping time (Hicks 1965:66-7), and they will be extensively used in our models – more so than in Tobin’s own models.” (Godley and Lavoie, 2006, p.13). Can you provide a little more insight into what exactly is going on with these “lags”?
Well, Nick, I’m far from an expert in this field either. When it comes to statistical technique I’m graduate-level trained and know the basics you need to know to use them. But I like to think that I take a very granular approach when examining statistics and I have never been convinced that this is not the way to go.
Besides, most economists who use these methods are not experts either. This really must be remembered. What should also be remembered is that actual statistical experts may not be very good economists and they may not be used to handling the type of data that we economists deal with. All I’m saying is that we should be very careful throwing around claims of expertise in these fields.
Turning a non-stationary time-series into a stationary one is very similar to averaging, so far as I can see. It seems to suffer from the same problems that I raised above: in a non-ergodic/heterogeneous environment it risks “ironing out” any interesting divergences from trend and dismissing these as “volatility” or something similar. Generally these approaches seem to carry dangers — and I’m not the only one saying this; I believe N.N. Taleb has become very popular basically saying this same thing over and over again in a more mathematically sophisticated way. But I would not have the hubris to say categorically: “you cannot do this”. Obviously, you cannot use non-stationary processes for some purposes. I’m using Y-on-Y percentage changes of the variables in the above, not absolute levels, because the latter would be silly. Rather I would say: “can you justify what you are doing to yourself completely and totally? Are you sure you are not ironing out points of empirical interest to get a nice fit?”. It’s all about the particular case in question, I suppose.
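For readers unfamiliar with the transformation mentioned above, here is a minimal sketch of a year-on-year percentage change, using made-up index levels rather than the real M3 data:

```python
def yoy_pct_change(levels):
    """Year-on-year percentage change of a series of annual levels.
    This is one common way of rendering a trending series stationary."""
    return [100.0 * (curr - prev) / prev
            for prev, curr in zip(levels, levels[1:])]

# Hypothetical money-supply index levels for five consecutive years
m3_index = [100.0, 105.0, 112.0, 120.0, 126.0]
m3_growth = yoy_pct_change(m3_index)
# First entry: 100 * (105 - 100) / 100 = 5.0; the result has one
# fewer observation than the input series.
```

Note that each differencing or averaging step throws away one or more observations, which is part of the trade-off being discussed here.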
On the Godley and Lavoie quote I’d have to look into it. It does look interesting though. I might look into it tomorrow or later on if I get the chance.
Thanks for your thoughtful response, I appreciate your time. It reminds me of a story told by Steve Keen about attending a conference (I believe at the Fields Institute) where the mathematicians basically said, what do you mean the economics is all wrong? We thought (incorrectly) the economists had that part figured out.
In your example of the two countries, I think most good econometricians would use a Chow test to see whether you could pool the two countries. Similarly, many time series econometricians these days use structural break tests (Bai and Perron or CUSUM, for example) to test whether you can pool data over time, or use time-varying parameter VARs to similar effect. Their results conform to what you said about horse racing: if you forecast using data from before a break, it works horribly. I consider some of these people heterodox (Frydman and Goldberg’s imperfect knowledge economics, for example), though you may consider them mainstream. I agree though that far too few studies actually check for this.
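For what it’s worth, the Chow test mentioned here can be sketched in plain Python. The data below are invented, with a deliberate break at the midpoint; a real application would compare the F-statistic against the appropriate F-distribution critical value rather than eyeballing its size.

```python
def ssr(xs, ys):
    """Sum of squared residuals from a simple OLS fit of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

def chow_f(x1, y1, x2, y2, k=2):
    """Chow test F-statistic for a break between two subsamples
    (k = parameters per regression: intercept and slope)."""
    ssr_pooled = ssr(x1 + x2, y1 + y2)
    ssr_split = ssr(x1, y1) + ssr(x2, y2)
    n = len(x1) + len(x2)
    return ((ssr_pooled - ssr_split) / k) / (ssr_split / (n - 2 * k))

def wiggle(x):
    # Deterministic stand-in for noise so the example is reproducible
    return 0.3 if x % 2 else -0.3

x1 = list(range(10))
x2 = list(range(10, 20))
y1 = [2 * x + wiggle(x) for x in x1]              # first regime: rising
y2_break = [40 - 2 * x + wiggle(x) for x in x2]   # second regime: falling
y2_same = [2 * x + wiggle(x) for x in x2]         # no regime change

f_with_break = chow_f(x1, y1, x2, y2_break)  # very large: reject pooling
f_no_break = chow_f(x1, y1, x2, y2_same)     # small: pooling looks fine
```

When the two subsamples come from the same regime the F-statistic is tiny; when they straddle a break it explodes, which is exactly the “can we pool?” question being raised in the post.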
I tried to make this point in another post of yours, though you said I was looking for timeless truths, even though I mentioned change over time on a couple of occasions. Admittedly I did not make my point well and the links I posted weren’t really connected to that point.
I am a bit surprised you are testing monetarism using regression analysis though, as you stated in an earlier post that those who try to test theories in this way are of lesser intelligence (maybe I am missing something). I also don’t think these regressions have much if any meaning because of the issue of non-stationarity raised earlier. Even in the subsamples where you find a connection, it could be a spurious regression if you don’t test for cointegration (though this possibility does not fully undermine your point that the two are unrelated, or at least that the all-else-equal assumption is too strong in this case). I agree though that the Austrian view of the QTM is very naive. Keep up the good work overall.
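The spurious-regression worry can be illustrated with a small simulation rather than a formal cointegration test. This is only a sketch with made-up random walks: regressing one independent random walk on another routinely yields a sizeable R-squared in levels, which collapses once the series are differenced.

```python
import random

def r_squared(xs, ys):
    """R-squared from a simple OLS of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

def random_walk(n, rng):
    """A pure random walk: cumulative sum of independent shocks."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

rng = random.Random(42)
level_r2, diff_r2 = [], []
for _ in range(50):
    a = random_walk(200, rng)
    b = random_walk(200, rng)  # independent of a by construction
    level_r2.append(r_squared(a, b))
    da = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    db = [b[i + 1] - b[i] for i in range(len(b) - 1)]
    diff_r2.append(r_squared(da, db))

avg_level_r2 = sum(level_r2) / len(level_r2)  # typically sizeable
avg_diff_r2 = sum(diff_r2) / len(diff_r2)     # close to zero
```

The two series share nothing but a trend-like wander, yet the levels regression regularly reports a "relationship"; differencing exposes it as an artifact, which is the intuition behind testing for cointegration before trusting a levels regression.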
(1) Yes, I am aware there are techniques to test for these — although I’m no expert. You can even just eyeball the data prior to testing it and you’ll see any ‘breaks’ (cue some idiot saying that your method is ‘primitive’). But the problem is that economists are generally NOT looking for heterogeneity. They are looking for homogeneity because they believe that the world is full of timeless economic laws (apologies if I mis-attributed this position to you). The meta-problem I have with econometrics is the same one Keynes had: it can create fog. Chow tests may ensure that people don’t engage in what I call ‘spurious aggregation’. But when people do engage in such spurious aggregation it can be VERY difficult to untangle.
For example, when a commenter on here the other day plopped a massively aggregated scatterplot in my comments section as evidence for monetarism I chuckled and said I wouldn’t swallow it. Then he took this as proof that I was selectively picking evidence. I’m not a huge fan but McCloskey is right in this regard: used by economists, econometrics is a rhetorical device first and foremost which is why I typically avoid it.
(2) I agree fully that by my methodology you cannot prove or disprove an economic hypothesis. This is because they may hold good some of the time and may not hold good at other times. You also have the problem of causality. For example, the correlation between five-year averaged M3 and CPI from 1960-1985 above was likely due to the increase in prices pulling money out of the banking system. From 1985-2005 M3 probably expanded due to increased financialisation. Thus the money flowed into the finance sector and was not a reflection of expanding aggregate demand. Hence, there could be no effect on the CPI because the money was likely not being used to buy goods and services counted in the basket. This ties into what you are saying about cointegration and so forth because now we are starting to talk about causation rather than correlation.
So, what was the point of all this? Basically this: the mainstream holds a bunch of ‘laws’ that are supposed to be timeless. But even a cursory glance at the statistics shows that we cannot even establish correlation in many cases, let alone causation. The point of this exercise is more to show just how impoverished mainstream economics has become and how rotten their internal debate has become. They don’t deal with empirics. And when they do it is typically a type of empirics that is extremely dodgy.
Thanks for the quick reply. I agree wholeheartedly with all of your points. All the best.
I would take a look at this link.
Jesus! See what I’m talking about?
And I’ll bet the guy who wrote this considers himself something of a statistician!
This is statistics-blender economics. It’s unbelievable. The worst part is they hide behind a crude version of the “I just took a large sample” defence.
You know that this is how Reinhart and Rogoff cooked the books on austerity too, right? The mistakes in the spreadsheet were small beer in comparison to their spurious aggregation.
As an economics noob, I would also like to point out the following in the above linked post: “indeed the inflation-money-growth combinations for some periods for some countries had to be eliminated from the scatter plot because the magnitudes of inflation and money growth were so large that they would dominate the plot—the scale would be such that the inflation-money-growth combinations for most of the other countries and periods would appear as a small dark mass of overlapping data points in the bottom left corner of the figure. And many countries had to be excluded from the sample because their inflation rates and money growth rates were so high that useful data were not recorded. The necessity for such omissions, of course, strengthens the conclusion that money growth and inflation were related—in all cases, high inflation was associated with high money growth.”
Is it just me, or does this, along with the ‘four ten-year period’ aggregate, also smell fishy? Anyone?
I thought it would be interesting to share. I appreciate you taking a look at the link!
Boom! Pow! OUCH! That’s what this blogpost sounds like!
Thanks very much for your efforts in drawing back the veil on this, Mr P. The main thing I want to know is whether the difference between the two periods reflects the varying abilities of the 1% and the 99% (simplified) to garner the rewards, as Mr Piketty has argued?
Yes, I think so. The rise in money supply in the 1990s is probably a reflection of higher levels of financial activity.
As an economics noob, I didn’t quite understand how you linked the two together. Can you please explain? 🙂
Ah, never mind. I think I roughly got it