In the recent issue of the Real World Economics Review there was a rather interesting, if somewhat dense, article by Judea Pearl and Bryant Chen entitled Regression and Causation: A Critical Examination of Six Econometrics Textbooks. Lars Syll, who was earlier sent the paper, has weighed in here. It is a heavy and technical paper, but I think that the underlying results are of great interest to anyone concerned with applying econometrics to economics — or, conversely, to anyone who is, as I am, skeptical of such applications.
The paper appears to turn on a single dichotomy. The authors point out that there is a substantial difference between what they refer to as the “conditional-based expectation” and the “interventionist-based expectation”. The first is given the notation:
E[Y | X]
While the second is given the notation:
E[Y | do(X)]
The difference between these two relationships is enormous. The first notation — that is, the “conditional-based expectation” — basically means that the value Y is statistically dependent on the value X. So, given a mass of past data points for the value Y and X we can then make a purely statistical prediction about the relationship between the two.
The reasoning runs something like this: “Since we know from past data that when the variable X changes by a given amount we then see a change in Y by a given amount, we can then assign a certain probability that such a relationship will carry into the future.”
The second notation — that is, the “interventionist-based expectation” — refers to something else entirely. It means that the value Y is causally dependent on the value X. This means that if we undertook an experiment in which we altered the value of X by some amount we would then see a fixed change in the value of Y.
All that may seem somewhat dense and confusing, so let us consider the example that the authors lay out (p3, footnote 3). They ask us to consider the case in which an employee’s earnings, let’s say X, are related to their expected performance, let’s say Y. Now, if we simply go out and take a statistical measure of earnings and expected performance we will find a certain relationship — this will be the conditional-based expectation and it will be purely a statistical relationship.
If, however, we take a group of employees and raise their earnings, X, by a given amount, will we see the same increase in performance, Y, as we would expect from a study of the past statistics? Obviously not. This second case is the interventionist-based expectation, and it is the one that tells us about the causal relationship between the variables.
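The gap between the two expectations can be made concrete with a toy simulation. What follows is only a sketch under invented assumptions, not anything taken from the Pearl and Chen paper: I assume an unobserved "ability" that pushes up both earnings and performance, and a true causal effect of earnings on performance of 0.5. The names TRUE_EFFECT, ability and simulate are hypothetical and exist only for this illustration.

```python
import random

TRUE_EFFECT = 0.5  # assumed causal effect of earnings (X) on performance (Y)


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def simulate(n, intervene, rng):
    """Generate n (earnings, performance) pairs.

    intervene=False: earnings depend on unobserved ability (past statistics).
    intervene=True:  earnings are set by the experimenter, ignoring ability.
    """
    xs, ys = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)  # hidden common cause, an assumption
        if intervene:
            x = rng.gauss(0, 1.4)
        else:
            x = ability + rng.gauss(0, 1)
        y = TRUE_EFFECT * x + 2.0 * ability + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


rng = random.Random(1)
observational_slope = ols_slope(*simulate(20000, False, rng))
interventional_slope = ols_slope(*simulate(20000, True, rng))

print("slope from past statistics, E[Y | X]:    ", round(observational_slope, 2))
print("slope under intervention, E[Y | do(X)]:", round(interventional_slope, 2))
```

In this made-up world the regression on past data reports a slope roughly three times larger than the effect actually obtained by raising earnings, because the statistical relationship also picks up the influence of ability. The size of the gap depends entirely on the assumed numbers; the point is only that the two quantities need not coincide.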
Now, what the authors of the paper find is that, when they survey six popular econometrics textbooks, these differences are not adequately outlined at all. Indeed, the authors of the textbooks usually don’t even distinguish these two vastly different relationships by using different mathematical notation. The effect is that students mistake statistical relationships for causal relationships.
Obviously, this is deeply problematic from the point-of-view of applied economics. In economics we are mainly interested in causal rather than statistical relationships. If we want to estimate, for example, the multiplier, it is from a causal rather than a statistical point-of-view. Yet the training that many students receive leads to confusion in this regard. Indeed, we may go one further and ask whether such a confusion also sits in the mind of the textbook writers themselves.
This confusion between statistical relationships and causal ones has long been a problem in econometrics. Keynes, for example, in his seminal paper Professor Tinbergen’s Method, a criticism of the econometric method, noted that Tinbergen had made precisely this error.
In his book Tinbergen is trying to account for the fluctuations in investment using econometric techniques. But, as Keynes notes, such an approach does not account for causality at all — i.e. it cannot tell us the causal relations between the variables, but only the past statistical relations. This can clearly be seen in the fact that in the period of Tinbergen’s study the rate of interest varied little and this leads to the rather murky conclusion that the rate of interest is not having much of an impact on the rate of investment. Keynes writes,
For, owing to the wide margin of error, only those factors which have in fact shown wide fluctuations come into the picture in a reliable way. If a factor, the fluctuations of which are potentially important, has in fact varied very little, there may be no clue to what its influence would be if it were to change more sharply. There is a passage in which Prof. Tinbergen points out (p. 65), after arriving at a very small regression coefficient for the rate of interest as an influence on investment, that this may be explained by the fact that during the period in question the rate of interest varied very little. (p567)
Of course, this does not mean that the rate of interest had no potential causal relationship with fluctuations of investment. Rather it means that, due to the lack of substantial fluctuations in the interest rate in the period of observation, whatever effects it may or may not have had were simply not realised. Thus, from the given evidence we simply do not know.
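Keynes's point about a factor that "has in fact varied very little" can also be illustrated with a small simulation. Again this is a sketch under invented assumptions and has nothing to do with Tinbergen's actual data: I assume a true effect of the interest rate on investment of -2.0, and then estimate it repeatedly from short samples in which the rate is either nearly constant or fluctuates widely. The names TRUE_EFFECT, rate_sd and slope_estimates are hypothetical.

```python
import random
import statistics

TRUE_EFFECT = -2.0  # assumed causal effect of the interest rate on investment


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x


def slope_estimates(rate_sd, n_obs=30, n_samples=500, seed=0):
    """Estimate the slope in many simulated sample periods.

    rate_sd controls how much the interest rate moves within a period.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_samples):
        rates = [5.0 + rng.gauss(0, rate_sd) for _ in range(n_obs)]
        investment = [TRUE_EFFECT * r + rng.gauss(0, 1) for r in rates]
        estimates.append(ols_slope(rates, investment))
    return estimates


stable = slope_estimates(rate_sd=0.05)   # rate barely moves
volatile = slope_estimates(rate_sd=1.0)  # rate fluctuates widely

spread_stable = statistics.pstdev(stable)
spread_volatile = statistics.pstdev(volatile)

print("spread of estimates, rate barely moves:", round(spread_stable, 2))
print("spread of estimates, rate moves widely:", round(spread_volatile, 2))
```

The causal effect is identical in both cases by construction. When the rate barely moves, however, the estimated coefficients scatter so widely from sample to sample that any single regression tells us very little about it, which is the sense in which the evidence leaves us not knowing.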
Many would interpret Tinbergen’s results to say something like: “The rate of interest has very little effect on fluctuations in investment”. From a statistical point-of-view that statement is perfectly true for the period given. But from a causal point-of-view — which is what we as economists are generally interested in — it is completely hollow.
The question then arises: why, after over 70 years, are econometrics textbooks engaged in the same oversights and vaguenesses as some of the pioneering studies in the field? I think there is a simple explanation for this. Namely, that if econometricians were to be clear about the distinction between statistical and causal relations it would become obvious rather quickly that the discipline holds far less worth for economists than it is currently thought to possess.
Let me be clear about this: I am not saying that econometricians are engaged in some sort of conspiracy. I am not saying that they get together in smoke-filled rooms and conspire to bamboozle students into confusing statistical relationships with causal relationships. Rather I think that they succumb to this confusion themselves because they are trying to walk the line between being a statistician and being an economist.
The two disciplines are generally interested in entirely different forms of relationships. Trying to locate the relationships that are of interest to an economist in the techniques of the statistician, which is what econometricians try to do, is probably a far more fruitless endeavor than popular opinion would think. So it is no surprise that in trying to mix oil and water (that is, statistical and causal inference) econometricians often come out with a terrible mess on their hands.
Indeed, it seems to me that such a mess actually provides the foundations on which the discipline rests. If it were ever cleared up sufficiently well, many would question the use of econometrics in economics altogether. Perhaps, for the econometrician, it is better to be confused than to be potentially unemployed.