In the recent issue of the Real World Economics Review there was a rather interesting, if somewhat dense, article by Judea Pearl and Bryant Chen entitled *Regression and Causation: A Critical Examination of Six Econometrics Textbooks*. Lars Syll, who was earlier sent the paper, has weighed in here. It is a heavy and technical paper, but I think that the underlying results are of great interest to anyone concerned with applying econometrics to economics — or, conversely, to anyone who is, as I am, skeptical of such applications.

The paper appears to turn on a single dichotomy. The authors point out that there is a substantial difference between what they refer to as the “conditional-based expectation” and the “interventionist-based expectation”. The first is given the notation:

$$E[Y \mid X]$$

While the second is given the notation:

$$E[Y \mid do(X)]$$

The difference between these two relationships is enormous. The first notation, the “conditional-based expectation”, basically means that the value *Y* is statistically dependent on the value *X*. So, given a mass of past data points for the values *Y* and *X*, we can then make a purely statistical prediction about the relationship between the two.

The reasoning runs something like this: “Since we know from past data that when the variable *X* changes by a given amount we then see a change in *Y* by a given amount, we can then assign a certain probability that such a relationship will carry into the future.”

The second notation — that is, the “interventionist-based expectation” — refers to something else entirely. It means that the value *Y* is causally dependent on the value *X*. This means that if we undertook an experiment in which we altered the value of *X* by some amount we would then see a fixed change in the value of *Y*.

All that may seem somewhat dense and confusing, so let us consider the example that the authors lay out (p3, footnote 3). They ask us to consider the case in which an employee’s earnings, let’s say *X*, are related to their expected performance, let’s say *Y*. Now, if we simply go out and take a statistical measure of earnings and expected performance we will find a certain relationship — this will be the conditional-based expectation and it will be purely a statistical relationship.

If, however, we take a group of employees and raise their earnings, *X*, by a given amount, will we see the same increase in performance, *Y*, as we would expect from a study of the past statistics? Obviously not. This second case, of course, is the interventionist-based expectation and is indicative of a causal relationship between the variables.
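To make the gap between the two expectations concrete, here is a rough simulation of the earnings example. It is only a sketch: the hidden "ability" variable, the coefficients and the function names are all my own assumptions for illustration and do not come from Pearl and Chen's paper. The idea is that ability drives both earnings and performance, so the slope we read off passively observed data differs from the slope we get when we set earnings ourselves.

```python
import random

def ols_slope(pairs):
    """Slope from a simple least-squares regression of y on x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

def performance(ability, earnings, rng):
    # Assumed structural equation: the true causal effect of earnings is 0.5.
    return 3.0 * ability + 0.5 * earnings + rng.gauss(0, 1)

def observe(n, rng):
    """Passive observation: hidden ability drives both earnings and performance."""
    data = []
    for _ in range(n):
        ability = rng.gauss(0, 1)                    # hypothetical confounder
        earnings = 2.0 * ability + rng.gauss(0, 1)   # X depends on ability
        data.append((earnings, performance(ability, earnings, rng)))
    return data

def intervene(n, rng):
    """Experiment: we set earnings ourselves, independently of ability."""
    data = []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        earnings = rng.gauss(0, 5 ** 0.5)            # same spread as observed X
        data.append((earnings, performance(ability, earnings, rng)))
    return data

rng = random.Random(0)
conditional_slope = ols_slope(observe(20000, rng))
interventional_slope = ols_slope(intervene(20000, rng))
print(conditional_slope, interventional_slope)
```

Under these assumed coefficients the statistical slope in the population is 8.5/5 = 1.7, while the slope under intervention is the structural 0.5. Someone who read the first number as the payoff from a pay rise would overstate the effect more than threefold.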

Now, what the authors of the paper find is that, when they survey six popular econometrics textbooks, these differences are not adequately outlined at all. Indeed, the authors of the textbooks usually don’t even distinguish these two vastly different relationships by using different mathematical notation. The effect is that students mistake statistical relationships for causal relationships.

Obviously, this is deeply problematic from the point-of-view of applied economics. In economics we are mainly interested in causal rather than statistical relationships. If we want to estimate, for example, the multiplier, it is from a causal rather than a statistical point-of-view. Yet the training that many students receive leads to confusion in this regard. Indeed, we may go one further and ask whether such a confusion also sits in the mind of the textbook writers themselves.

This confusion between statistical relationships and causal ones has long been a problem in econometrics. Keynes, for example, in his seminal criticism of the econometric method, *Professor Tinbergen’s Method*, noted that Tinbergen had made precisely this error.

In his book Tinbergen is trying to account for the fluctuations in investment using econometric techniques. But, as Keynes notes, such an approach does not account for causality at all — i.e. it cannot tell us the causal relations between the variables, but only the past statistical relations. This can clearly be seen in the fact that in the period of Tinbergen’s study the rate of interest varied little and this leads to the rather murky conclusion that the rate of interest is not having much of an impact on the rate of investment. Keynes writes,

For, owing to the wide margin of error, only those factors which have in fact shown wide fluctuations come into the picture in a reliable way. If a factor, the fluctuations of which are potentially important, has in fact varied very little, there may be no clue to what its influence would be if it were to change more sharply. There is a passage in which Prof. Tinbergen points out (p. 65), after arriving at a very small regression coefficient for the rate of interest as an influence on investment, that this may be explained by the fact that during the period in question the rate of interest varied very little. (p567)

Of course, this does not mean that the rate of interest had no potential causal relationship with fluctuations of investment. Rather it means that, due to the lack of substantial fluctuations in the interest rate in the period of observation, whatever effects it may or may not have had were simply not realised. Thus, from the given evidence we simply do not know.
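Keynes's point can be illustrated with a small simulation. Again this is a sketch under my own assumptions: the true effect of -2.0, the noise level, the sample size of 30 and the function names are all hypothetical and have nothing to do with Tinbergen's actual data. The same causal effect is present in both scenarios; the only thing that changes is how much the interest rate moved during the period of observation.

```python
import random
import statistics

TRUE_EFFECT = -2.0  # assumed causal effect of the interest rate on investment

def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def slope_spread(rate_sd, n_obs=30, n_samples=500, seed=1):
    """Standard deviation of the estimated slope across repeated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        # Interest rates fluctuate around 5 with the given standard deviation.
        rates = [5.0 + rng.gauss(0, rate_sd) for _ in range(n_obs)]
        invest = [TRUE_EFFECT * r + rng.gauss(0, 1) for r in rates]
        estimates.append(slope(rates, invest))
    return statistics.stdev(estimates)

spread_quiet = slope_spread(rate_sd=0.1)     # rate barely moved
spread_volatile = slope_spread(rate_sd=2.0)  # rate moved a lot
print(spread_quiet, spread_volatile)
```

When the rate barely moves, the estimated coefficient swings widely from sample to sample even though the underlying effect is identical, so a single study from such a period offers little clue as to what the influence would be if the rate changed more sharply.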

Many would interpret Tinbergen’s results to say something like: “The rate of interest has very little effect on fluctuations in investment”. From a statistical point-of-view that statement is perfectly true for the period given. But from a causal point-of-view — which is what we as economists are generally interested in — it is completely hollow.

The question then arises: why, after over 70 years, are econometrics textbooks engaged in the same oversights and vaguenesses as some of the pioneering studies in the field? I think there is a simple explanation for this. Namely, that if econometricians were to be clear about the distinction between statistical and causal relations it would become obvious rather quickly that the discipline holds far less worth for economists than it is currently thought to possess.

Let me be clear about this: I am not saying that econometricians are engaged in some sort of conspiracy. I am not saying that they get together in smoke-filled rooms and conspire to bamboozle students into confusing statistical relationships with causal relationships. Rather I think that they succumb to this confusion themselves because they are trying to walk the line between being a statistician and being an economist.

The two disciplines are generally interested in entirely different forms of relationships and trying to locate the relationships that are of interest to an economist in the techniques of the statistician — which is what econometricians try to do — is probably a far more fruitless endeavor than popular opinion would think. So it is no surprise that in trying to mix oil and water — that is, statistical and causal inference — econometricians often come out with a terrible mess on their hands.

Indeed, it seems to me that such a mess actually provides the foundations on which the discipline rests. If it were ever cleared up sufficiently well, many would question the use of econometrics in economics altogether. Perhaps, for the econometrician, it is better to be confused than to be potentially unemployed.

It is true that econometric textbooks could be clearer about defining and explaining causality, and from this perspective Pearl & Chen’s criticisms are valuable. But even in their current state, textbooks explain perfectly well that interpreting regression coefficients causally is unjustified without a further assumption of exogeneity, and that alternative methods of obtaining causal estimates (instrumental variables) are valid only if their identifying assumptions are true – and these assumptions must be established outside of the model, based on theory or a plausible source of exogenous variation.

That correlation is not always the same as causality is drilled into every econometrics student pretty much from day one. In fact, if somebody presented a regression of performance on earnings and interpreted the result as a causal effect, without acknowledging all the issues above, he would be laughed out of the room. Which is why current applied research spends a lot of time looking for natural experiments, dealing with selection bias or running randomized trials – but of course, that’s not something one can learn by rehashing 70-year-old Keynes papers.

That’s the same defence that is always trotted out. It simply does not stand up. Econometrics studies are always extremely slippery on this point and it is, I think, for the reasons I highlighted above: it is built into the structure of the discourse.

Econometrics is a degenerate discipline that allows economists to engage in nonsense arguments that they bury in so much statistical garbage that they become extremely difficult, if not impossible, to criticise. That is its function.

The “innovations” that you cite (natural experiments etc.) are not innovations at all. They are distractions used to placate econometricians who have become dimly aware that what they are doing is largely redundant and pointless.

Hey, this is a really important topic! I think the blurring between causation and correlation exists in all rough sciences. Imagine if a textbook said, “By the way, all the science here is based on correlation, so ultimately we don’t really know what is causing what.” No one would really care to learn more except people who love economics. I think it’s done simply so students will accept the information in textbooks as fact rather than tentative speculation.

Yes, I somewhat agree. However, I don’t think that you need to throw economics away because of it. Although I do think that econometrics should be basically scrapped — and the same goes for its applications in other social sciences. Of course, the trend now is to INCREASE the amount of this research being produced which, to me, says a lot about the direction in which social science is headed: that is, into another Dark Age.

I also think econometrics has the appeal of seeming a more real, impartial science, rather than a value-driven science (which most social sciences are), making it easier in a PC climate. For example, if I were to explain how to create a system to maximize personal wealth, I’ve already begun with an assumption that wealth is the main value a society should achieve, which probably sets you up for a lot of mud slinging etc.

Yes, you’re absolutely right.

But I’d go further when it comes to economics. Most of mainstream economics is completely unrealistic nonsense with no grounding in reality. By using econometric techniques you can prove just about anything. This allows the nonsense to continue pouring out of the profession. If people were to look at the empirics directly most of mainstream economics would collapse.

So, there’s an added incentive to use it in economics.

I too think most economics isn’t very practical, and I’m pretty sure the average person thinks so too, especially since during the financial crisis every economist’s interview sounded more like a crazy man’s rambling than a solution to the problem. Economists seem to have brought little to the table during the last few years. Indeed, economists like Krugman, whose ideas are as political as economic, have taken center stage, but to his credit his communication is clear, poignant and seems to lead to something.

Haha! Don’t mention the K word on this blog! Krugman has gone back on years of theory that he was espousing and started touting ideas that Post-Keynesians have been developing for over half a century. No accreditation, of course. Very frustrating!

This paper is great.

I also found this passage useful for understanding the confusion between statistical and causal inference:

“The notion of ceteris paribus is sometimes used by economists and is closely tied to direct causation. If we hold all other variables fixed then any measured relationship between X and Y must be causal.” p.4

The ceteris paribus clause and the use of partial derivatives to interpret β coefficients seem to force causality, even when there is none. And the interpretation that “when X increases by one unit, then Y increases by β units” does not help.

You touch upon an extremely important issue in your comments on Keynes/Tinbergen, Philip. I think it is exactly what Stanley Lieberson tries to say in his seminal “Making it Count”:

http://larspsyll.wordpress.com/2012/12/04/econometrics-and-the-difficult-art-of-making-it-count/

Mind much if I spread your blog around, Philip?

Skippy… nice to hear your thoughts again imo.

Be my guest. Cheers.
