Lars Syll ran an interesting piece today on the “confounder” problem in econometrics. This is basically the problem of how we tell, when there is a relationship between, say, A and B, that it is not in fact being caused by an entirely separate variable, C.
The problems raised by this are innumerable. Syll’s main point is that in order to test for this we must know all possible C’s: C1, C2, C3 … Cn. That means that we must control for every variable that we want to prove is not causing the correlation. Assuming, then, that we actually manage to isolate every possible “confounder”, we can say that there is a causal relationship between A and B.
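To make the setup concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration and none of it comes from Syll’s piece: the coefficients, the sample size and the choice of a simple linear relationship are all invented. C drives both A and B, while A has no effect on B at all. The raw correlation between A and B is nonetheless strong, and it largely disappears once C is controlled for by regressing it out of both variables.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def residuals(y, x):
    """What is left of y after a least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


# Hypothetical data-generating process: C causes A and C causes B.
# A never appears in the equation for B, so A does not cause B.
rng = random.Random(0)
n = 5000
C = [rng.gauss(0, 1) for _ in range(n)]
A = [2.0 * c + rng.gauss(0, 1) for c in C]
B = [-1.5 * c + rng.gauss(0, 1) for c in C]

raw_corr = pearson(A, B)
# "Controlling for C": correlate what remains of A and B once C is removed.
partial_corr = pearson(residuals(A, C), residuals(B, C))

print(f"correlation of A and B, ignoring C:   {raw_corr:.3f}")
print(f"correlation of A and B, C controlled: {partial_corr:.3f}")
```

The catch is that the second number can only be computed for a C that we have already thought of and measured. A confounder that never made it into the data set leaves us looking at the first number alone.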
As Syll points out, the main problem is that the argument is somewhat circular: if we convince ourselves that we know every possible confounder (that is, every possible C), we have, in a strong sense, already convinced ourselves of the truth of our model.
Think about this: if I say that I know every possible confounder, that means that I have already established the true causal relation between A and B deductively. Why? Because if I am wrong and I have missed a potential confounder, then the correlation may well be spurious and the test tells me nothing.
So, what I am saying is really the following:
“I am certain that I have excluded every potential C because I have isolated every possible C and tested them. Thus if there is a relationship between A and B then it must be causal.”
The critic would then say:
“Well, how do you know that you have isolated every possible C?”
To which I would have to reply:
“Because I know that my model is correct and thus by process of negation I know of all the possible C’s which are incorrect.”
But then the critic would say:
“Why then are you undertaking the test if you already know your model is correct?”
To which I would have to respond:
“Because I want to test the correlation between A and B and assure myself that it is causal.”
And the critic would then point out:
“But that’s silly because by claiming that you know every possible C you are already claiming that your theory is correct by the negation of every other possible theory.”
Finally, I would throw my hands up and say:
“Okay, fine. Maybe I don’t know every possible C. But at least I tried all the ones I do know, so I can be sure that they are not causing the correlation. Given the evidence, then, I think that the most probable explanation is a causal relation between A and B.”
What this proves, beyond a shadow of a doubt, is that, as Syll says, econometrics cannot work in isolation. It must apply deductive theories to the data it handles. This is important because some people who do econometric work don’t, in fact, use deductive models at all.
It also proves something else that, while Syll mentioned it with reference to Keynes’ critique of econometrics toward the end, I don’t think he emphasised enough: it implies that all relevant C’s must be data points that we can use — i.e. all relevant confounders must be quantifiable data points.
This, I think, is an absolutely key problem with using econometrics for economic analysis. Anyone who follows the markets and the economy on a day-by-day basis knows how much non-quantifiable factors influence their trajectory. With reference to recent events, for example, how do we measure Bernanke’s move to taper the QE program back in June? And, in turn, how do we measure his flip-flop on this back in September? These events are all that people in the markets have been talking about over the past few months, yet they cannot be represented as data points and thus cannot be tested as confounding variables (or as causal variables, for that matter).
Another problem that Syll raises briefly, but I think does not emphasise enough, is the supposed solution to confounding variables of selecting different ‘populations’, that is, different ‘control groups’. The problem here is the assumption that what economists are looking for are some sort of iron laws that apply homogeneously across different groups. But this is not, of course, true; good economists do not believe in such iron laws.
Take the example of the multiplier. Imagine that we are trying to estimate a multiplier across, say, five different control groups. Should it bother us if we get entirely different readings? Of course not. We would just conclude that the different groups have different multipliers and that we could aggregate them to get the average.
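As a small illustration of the arithmetic, suppose (purely hypothetically) that the five groups yield the multiplier estimates below. The figures are made up for the example, and so are the weights: weighting each group by its share of total spending is one assumption among several possible schemes.

```python
# Hypothetical multiplier estimates for five groups (invented numbers).
multipliers = [0.6, 0.9, 1.2, 1.5, 1.8]

# Hypothetical share of total spending for each group; shares sum to 1.
shares = [0.10, 0.20, 0.30, 0.25, 0.15]
assert abs(sum(shares) - 1.0) < 1e-9

# Unweighted average: every group counts equally.
simple_average = sum(multipliers) / len(multipliers)

# Weighted average: each group counts in proportion to its spending share.
weighted_average = sum(m * s for m, s in zip(multipliers, shares))

# The gap between the highest and lowest group estimate.
spread = max(multipliers) - min(multipliers)

print(f"simple average:   {simple_average:.3f}")
print(f"weighted average: {weighted_average:.3f}")
print(f"spread:           {spread:.3f}")
```

The spread between the groups is reported alongside the average as a finding in its own right, not as evidence that the estimate has failed.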
This is how good economics is done. It’s not about establishing iron laws, but rather trends at any given moment in time. This is why the application of catch-all models — which is what Syll is implicitly dealing with — is just the wrong approach. It’s just a waste of time; a search for a Holy Grail that simply doesn’t exist.
This is where the critique of econometrics overlaps with the critique of economic modelling more generally: this is simply not what we should be doing as economists. Trying to build a Holy Grail, one that gives its owner some sort of immortal knowledge of The Economy, and then continuously testing it against ever-new data sets (which are potentially as infinite as time itself) is a complete and utter waste of time. As Keynes said in a similar context (I’m paraphrasing slightly):
The labour involved is enormous, but it is a nightmare to live with.
It is the tendency of economists to try to build their Holy Grail models that leads them to defend an indefensible discipline, econometrics, and only when the structure of the profession has changed enough to move away from the former will the defence of the latter cease. And with that I leave the reader with a positive rather than a negative quote from Keynes.
The object of our analysis is, not to provide a machine, or method of blind manipulation, which will furnish an infallible answer, but to provide ourselves with an organised and orderly method of thinking out particular problems.