Did you know that if you are male and eat beans every Tuesday morning at exactly 8.30am you are more likely to marry a supermodel? No. That’s not true. I just made that up. But I hear of statistical studies in the media that sound only slightly less ridiculous all the time. Often these have to do with diet, sexual psychology or… economics.
All three of these spheres are, of course, the sorts of things you find dealt with in the religious and mythological texts of old. This is because they are key psychological aspects of how we as humans form our identities. The manner in which we eat, what would today be called our sexual orientation/preferences (it should be noted that this was treated very differently prior to the 19th century…) and how we organise our societies are things that constitute key components of our personal identities.
These are slippery aspects of existence. Because they are effectively moral issues we as humans need to feel that they are constant throughout time and space. But anyone with any historical or cultural understanding knows that these shift this way and that over time. Diet fads fluctuate rapidly, while cuisines of various types go in and out of fashion. Sexual norms change from decade-to-decade (homosexuality was considered a mental disorder in the West until 1973!). And if you need to be told that fads in economic policies are historically contingent and reflective of the politics of the day then you probably shouldn’t be reading this blog.
Science dreams of reducing all of this to Reason. It has since at least the 19th century when religion fell by the wayside and science tried to fill the void. In every era there is some hocus pocus thrown up wearing the clothes of the scientist and handing down Moral Truths: about how we should eat, how we should conduct ourselves sexually and how we should run our societies. In the past 40 or so years these questions have increasingly fallen to social science disciplines (and dieticians) who use statistical techniques.
The problem is that the nature of the material that they are dealing with is not suited to the techniques they are using. The nature of the material is that it changes and evolves through time. We cannot anticipate these changes to any large extent either. Doing so would be like trying to predict what style of dress will be popular in 2080. This leads to the statistical literature generally being a mess. Indeed, the literature itself seems to evolve through time together with the data and the ideological fads that emerge and die off. I increasingly think that the statistical literature is coming to mirror the trends themselves but with a lag.
The latest attempt to impose some order on this chaos is the practice of so-called ‘meta-regression’. The idea is to take all of the studies showing all of the contradictory results, aggregate them and run regressions on them. In sciences where the material is suited to statistical study (that is, in sciences where causality does not change and evolve through time) this is quite sensible. But where the material doesn’t accommodate this, such analysis likely only amplifies the underlying problems.
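To make the mechanics concrete, here is a minimal sketch of the aggregation step in Python. It uses inverse-variance pooling, which is the weighted average that a meta-regression with nothing but an intercept reduces to. The study figures, the era labels and the `pooled_effect` helper are all invented for illustration and come from no real literature.

```python
# Hypothetical study results: (era, estimated effect, standard error).
# Every number here is made up purely to illustrate the arithmetic.
studies = [
    ("early", 0.80, 0.10),
    ("early", 0.70, 0.20),
    ("early", 0.90, 0.20),
    ("late", 0.10, 0.10),
    ("late", 0.20, 0.20),
    ("late", 0.00, 0.20),
]


def pooled_effect(rows):
    """Inverse-variance weighted average of the study estimates."""
    weights = [1.0 / se ** 2 for _, _, se in rows]
    weighted_sum = sum(w * est for w, (_, est, _) in zip(weights, rows))
    return weighted_sum / sum(weights)


overall = pooled_effect(studies)
early = pooled_effect([row for row in studies if row[0] == "early"])
late = pooled_effect([row for row in studies if row[0] == "late"])

print(f"overall: {overall:.2f}, early: {early:.2f}, late: {late:.2f}")
```

With these made-up numbers the pooled estimate comes out at about 0.45, while the early studies alone give about 0.8 and the late ones about 0.1. If the underlying causality really did shift between the two eras, the aggregate describes neither of them.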
Take, for example, the following paper ‘Wheat From Chaff: Meta-Analysis As Quantitative Literature Review’ by T.D. Stanley. In the paper Stanley says that we should use meta-regressions to do our literature reviews. The problem is that this assumes that the regressions on which we run the regressions have some underlying validity in the first place: that is, they can give us information about certain causal laws that will hold into the future.
Some of the examples that Stanley gives where meta-analyses have been applied in the past seem reasonable, others do not.
There are many examples where meta-analyses clarified a controversial area of research. For example, meta-analysis has been used to establish a connection between exposure to TV violence and aggressive behavior (Paik and Comstock, 1994), the efficacy of coronary bypass surgery (Held, Yusuf and Furberg, 1989), the risk of secondhand smoke (He et al., 1999), and the effectiveness of spending more money on schools (Hedges, Laine and Greenwald, 1994; Krueger, 1999). (p133)
The efficacy of coronary bypass surgery seems very reasonable. We know the mechanism through which this is supposed to work. But there still arises the question of environment. I should hope, for example, that the meta-analysis is being run on people in countries with similar diets and weather who come from similar income groups. This raises an issue that we shall encounter more critically in a moment.
The risk of second-hand smoke is slightly more dubious. This, as is well-known, is not something that is particularly easy to prove. I do not know how they do these studies but I would assume that they would look for instances of lung and heart disease in non-smoking people who co-habit with smokers. Something along these lines will be a reasonable approach. Again, this is because we know the mechanism through which smoking causes these diseases and we know that this has relative constancy through time and space.
Spending money on schools is far more difficult. First of all, Stanley doesn’t say what spending more money on schools is effective for. We can only assume that it has to do with educational outcomes. Personally I believe that spending more money on schools is generally effective in this regard simply due to intuition and personal experience. But it is not quite clear that we can meaningfully test it in statistical terms, nor is it clear that we should ever make such claims except in a very general sense. The causal mechanism is not clear here. There are many ways in which this money can be spent. It is also not clear that spending money will fix problems in all schools. Some schools may have issues related to funding. But some may have issues that have little to do with this: the class background of the children who attend or the structure of the testing regime come to mind as issues that may not be related to funding. Here we are beginning to see that the causes and effects become murky. While every smoker suffers from basically the same cause and effect mechanisms, this seems less likely in the case of schools.
The study linking TV violence and aggression sounds the alarm for me. That sounds like garbage. The causal link here seems highly abstract and based on some crude mechanistic stimulus-response view of human psychology. The methodological issues also seem problematic: is this a lab experiment or is it based on survey results? Both suffer from serious problems. I also see no way to establish causation: do people with violent tendencies watch violent TV programs or vice versa? If we cannot establish causation any information we do glean from the study — even if we believe in the study itself — will be largely useless.
I could look at all the studies individually, but I, like you, dear reader, have limited time. We all need some sort of filtering system to sort sense from nonsense, and what I just demonstrated above is how I tend to think about these issues, both in economics and when I’m reading the newspaper. And I think it is pretty functional.
Anyway, back to meta-regressions. The problem with these is that they aggregate even more than the studies themselves. This is fine when we are dealing with material that is homogenous through time — that is, material where the causality is fairly stable — but it will not work where the causality is slippery. In the above examples again I would highlight the studies linking TV violence to aggression.
I have dealt with this question on here before. But let me give a practical example: that of the multiplier. Let’s say that I need to give a politician a number for the fiscal multiplier in their country. Now, many economists — assuming that causality is constant through time — would get as much time-series data as possible for the country in question and run regressions. But let’s say that some extreme event had happened in the past five years like, oh I don’t know, a financial crisis. I would think that the multiplier would likely have changed from before this crisis. Thus the question is raised whether we should estimate the multiplier using the whole time-series or using the data from after the crisis. My gut would say that we should probably use the data after the crisis but there are probably some ways to look into this in more depth.
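One hedged way to look into it is to estimate the multiplier separately on each side of the crisis and compare the results with the whole-sample figure. The sketch below is a toy in Python with invented data, in which I simply assume a true multiplier of 0.5 before the crisis and 1.5 after it. The variable names and the `ols_slope` helper are mine and do not come from any real dataset. On real data, a formal structural-break test such as the Chow test would be the natural next step.

```python
def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Invented changes in government spending, identical in both periods.
spending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
# Small fixed "noise" so the fit is not perfect (made up, not random).
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1]

# Assumed true multipliers: 0.5 before the crisis, 1.5 after it.
output_pre = [0.5 * g + e for g, e in zip(spending, noise)]
output_post = [1.5 * g + e for g, e in zip(spending, noise)]

pre = ols_slope(spending, output_pre)
post = ols_slope(spending, output_post)
whole = ols_slope(spending + spending, output_pre + output_post)

print(f"pre-crisis: {pre:.2f}, post-crisis: {post:.2f}, whole sample: {whole:.2f}")
```

In this toy the whole-sample regression reports a multiplier of roughly 1.0, a number that was never true in either period, while the split estimates recover roughly 0.5 and 1.5. That is the risk of aggregating across a break without asking the question first.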
The point is that we at least need to raise the question. But economists often do not. They aggregate, aggregate, aggregate. They choose datasets willy-nilly. They assume constant, homogenous causes. Why? Because, I think, they are more often than not already sure of what they are going to say and they use the empirical techniques to dress this up. There is a risk then that using meta-analyses will only give us a reflection of the average opinion of the economics community at any given moment in time. But these opinions are extremely prone to fads because the economics community is insular, pretentious, consensus-driven and ultimately insecure. Today NAIRU, yesterday monetarism. Tomorrow? God knows. Beans and supermodels probably wouldn’t be far off.
It never ceases to amaze me how often one sees statistical techniques whose validity relies on satisfaction of IID applied to problems where IID is implausible or impossible.
It’s almost as if people routinely use statistics without understanding it at all.
Speaking of statistical techniques, I just came across something hilarious in the usually very dry Stata documentation. I was reading about nested logit models. Newer versions of Stata apparently use a parameterization consistent with random utility maximization (RUM). The author of the documentation states, “We recommend using the RUM-consistent version of the model for new projects because it is based on a sound measure of consumer behavior.” Most people interested in a nested logit model would probably come across this quote and take it to be an actual justification for using the technique, which is pretty sad.
In medicine, and particularly in the area of diet, you can run into exactly the same problem as in economics. You *think* you know what the causal mechanism is, but you can’t actually be sure because the experiment to establish the truth would involve killing some people and saving others. Probably lots of people.
There are an awful lot of epidemiological studies in the area of heart disease in particular that border on the religious, and you suffer the same issue there. The prevailing view continues and any objections are suppressed viciously. That’s why washing hands during surgery took so long to get established.
And medicine is running into the same problem as economics. Research relies on a gross simplification – that human beings are the same. Yet in reality we are all substantially different.
Now that all the easy wins from population medicine have been discovered, it is becoming a limiting factor. And of course it is very difficult to show the efficacy of a drug if you can’t establish the statistics to show it works. But if all the people in your sample group show different responses you can’t establish anything at all.
The commonalities are very clear. Over-abstraction and working purely on the abstraction regardless of its relationship to reality, and extreme suppression of any competing ideas – with no real way of constructing an experiment to resolve the situation.
I agree. Medicine has certainly left its age of innocence behind. I had to deal with neurologists a while back based on a family matter and I was struck by one thing: even though they had access to advanced imaging technology, in a lot of ways their field had not moved on much from the advances it made in the late 19th century. What I mean by that is that they are still largely categorising symptoms and establishing patterns. They were also trying to apply statistics but this struck me as a mess. Although the neurologists were all nice, realistic people (far less hocus pocus than the economists!) they nevertheless were clearly somewhat uncomfortable when I really probed them on certain issues.
I’m coming more and more to think that Science itself has left behind it an age of innocence. There are some fields that are still very promising, like computer technology. But even here there are limits that will be reached in the near future — especially surrounding the hopes many have for AI. But I’m not sure that Science is going to perform in the future as it has over the past few hundred years. We might be entering a very weird place as a species. I keep being reminded of Nietzsche’s idea of an age of nihilism. It was staved off for a while by the advances in the sciences — and the hopes for an Explanation of Everything that they brought — but it might be coming. Anyway, perhaps a little much for a comment on an economics blog.
Despite all this, however, I would say that scientific medicine has at least made spectacular progress in many other areas since the late 19th century in ways that, say, neoclassical economics has not.
Well, they are two different fields. Marginalism is pseudoscience or secular theology or something similar.
But I think that medicine’s Golden Age ended in the 1960s or 1970s or somewhere around there. Although there is some promise that nanotechnology or genetics might provide new ground to break. But I still think they’re suffering from diminishing returns on finding causes in diseases in which they are not sure of causes. It seems to have fallen into the statistical case study bin. Which is not a good place to be.