Bayesianism and Non-Ergodicity in Economics


The atomic hypothesis which has worked so splendidly in Physics breaks down in Psychics. We are faced at every turn with the problems of Organic Unity, of Discreteness, of Discontinuity – the whole is not equal to the sum of the parts, comparisons of quantity fail us, small changes produce large effects, the assumptions of a uniform and homogeneous continuum are not satisfied. Thus the results of Mathematical Psychics turn out to be derivative, not fundamental, indexes, not measurements, first approximations at the best; and fallible indexes, dubious approximations at that, with much doubt added as to what, if anything, they are indexes or approximations of.

— John Maynard Keynes

Bayesianism is rather irritating because it allows adherents to try to avoid the Post-Keynesian criticisms regarding the heterogeneous nature of historical data, which leads to its non-ergodic nature and the consequent problems with fundamental uncertainty. Because the Post-Keynesian critiques are usually aimed at frequentist interpretations of probability, they often appear to be superficially overcome when arguing with a Bayesian. This, however, is categorically not the case.

For the past few days I’ve been trying to find a rather “clean” simple critique of Bayesianism that could be applied from a Post-Keynesian perspective. Now I think that I have found such a critique.

A Bayesian named Andrew Gelman has written up a summary of the criticisms thrown at Bayesians by their detractors and asked his colleagues to respond. The most important criticism that Gelman raises, from a Post-Keynesian perspective, involves the selection of “priors”. In Bayesian statistics, “priors” are the probability distributions assigned before any data is taken into account.

Sticking to an example I’ve used before, let’s say that I am interested in the probability, P, that a woman will call me in the morning between the hours of 9am and 11am. Now, since I am only beginning my experiment, I have literally no idea what the probability that a woman will call me tomorrow is, as I have no experimental data. The somewhat arbitrary probability that I then cook up will be called my “prior”. Gelman, adopting the voice of a critic (i.e. me), puts this as such:

Where do prior distributions come from, anyway? I don’t trust them and I see no reason to recommend that other people do, just so that I can have the warm feeling of philosophical coherence. To put it another way, why should I believe your subjective prior? If I really believed it, then I could just feed you some data and ask you for your subjective posterior. That would save me a lot of effort! (p447)

Just to clarify, a “posterior” is the probability that is assigned once evidence has been taken into account. Anyway, Gelman’s version of the criticism seems to me a rather weak one and not nearly what I will be saying when I move to a Post-Keynesian criticism of this method as applied in economics. But it elicited a fairly clear response from Joseph Kadane. He wrote:

“Why should I believe your subjective prior?” I don’t think you should. It is my responsibility as an author to explain why I chose the likelihood and prior that I did. If you find my reasons compelling, you may decide that your prior and likelihood would be sufficiently close to mine that it is worth your while to read my papers. If not, perhaps not. (p455)

So what Kadane is saying is that, to go back to my example, when we assign the first prior probability as to whether a woman will call me tomorrow morning we should make an argument for it, and if someone else doesn’t like this argument they should throw my paper in the bin. The prior assigned, however, will always be in some sense arbitrary in that it will not be formed, as a posterior would, on the basis of data.
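
In symbols, the prior and posterior discussed above are linked by Bayes’ rule. The notation here is introduced only for illustration and appears in neither Gelman’s nor Kadane’s text: H stands for the hypothesis (a woman calls between 9am and 11am) and E for whatever evidence is observed.

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)}
```

Here P(H) is the prior and P(H | E) is the posterior. The formula only says how to move from one to the other; it is silent on where P(H) comes from in the first place, which is the point at issue.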

Now, here’s where the Post-Keynesian critique comes in. In economics we deal with heterogeneous historical data that is non-ergodic. Another way of putting this is that such data is composed of complex and unique events. An interest rate hike in 1928 is very different from an interest rate hike in 1979. The future, you see, does not mirror the past when we are talking about historical time.

Let’s go back to our example. What a Post-Keynesian economist is interested in is whether a woman will call tomorrow morning, not the probability that a woman will call on any given morning. But, of course, all we can then do is posit an argument to form a prior and you can, as Kadane says, accept or reject it. Great. That is what Post-Keynesians do. They lay out an argument. And everything stands or falls on that alone.

For Post-Keynesians there is no interest in positing a prior and then waiting for data to update the argument because the argument, by design, only works once. Post-Keynesian arguments are, in a sense, disposable. They are thrown out as historical time unfolds and new ones are constructed. The only manner in which to do this is through induction and the application of a skill-set that one acquires through one’s career. This is also, by the way, how historians and others like lawyers work.

The idea that you can find one True model that you then update with posteriors over and over again is wrong simply because the nature of the data is non-ergodic. To exaggerate slightly, but not much, there is a new argument for every new dawn. It is by wrestling with the changing nature of the economy that we come to understand it. Any other method is doomed to failure.

About pilkingtonphil

Philip Pilkington is a macroeconomist and investment professional. Writing about all things macro and investment. Views my own. You can follow him on Twitter at @philippilk.
This entry was posted in Statistics and Probability.

5 Responses to Bayesianism and Non-Ergodicity in Economics

  1. Nice blogs on probability and economics Phil. Have you ever read any Popper on the propensity interpretation? Here’s a recent introduction with a lot of references to the Popper-relevant literature by Popper’s student and later co-author David Miller.

  2. Matheus Grasselli says:

    In your “will a woman call me tomorrow between 9am and 11am” example, you seem to think that after a Bayesian comes up with a prior for this event, the only thing left to do is sit tight and wait until 11:01am to see whether the call takes place or not. In other words, in your view, the only relevant evidence is the occurrence (or not) of the event our poor Bayesian is trying to predict.

    But the thing is, this is not at all what Bayesian statistics is about. The point is to keep revising the prior with OTHER evidence, not necessarily the occurrence of what you are trying to predict.

    For example, not knowing anything about you, I might say that the odds of a woman calling tomorrow are 1/10 (admittedly low, but these days people don’t get many actual calls anymore). Instead of doing nothing and just waiting until tomorrow, as a good Bayesian, what I should do is go about looking for more evidence. For example, as you suggested yourself (and this is what jolted me into thinking this is a teachable moment), I might ask you if you are married. If you say yes, I immediately revise my prior upwards by quite a bit. If you say no, I might leave it as is, because that’s not very informative after all, or maybe revise it down a bit. The revised probability is now a “posterior”, even though the actual event is still in the future, which becomes the new prior until some further evidence shows up.

    Next I might ask if you have any appointments with a woman, and revise my new prior accordingly. After that I can ask to borrow your address book and make a phone call to each female name on it. To each woman I ask whether they plan to call you tomorrow or not, and every time the answer is yes I revise the prior upwards by a lot, and each time the answer is no I revise it downwards by a bit (notice there’s no assumption of symmetry here either). The probabilities after asking each question are the posteriors for that round and the prior for the next round.

    After I’m done asking all my questions I might decide that I have enough evidence and I’m ready to place a bet and wait until the event unfolds. The odds that I use for the bet are a combination of my now distant initial prior and all the evidence I collected by arduous questioning. Often they are more accurate than when I began, but not always, because you might have instructed all the women you know to lie to me. But most of the time this is a pretty good way to form an understanding of the world around you. Namely, form an opinion (belief, prior, probability, same thing) about something and go around asking questions (reading books and newspapers, writing down models, looking at data).

    Oh, and in the entire process there was no need for anything to be ergodic, or for the events to be repeatable (we are still talking about one phone call tomorrow).
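
The round-by-round revision described in this comment can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the 1/10 figure is read as a probability of 0.1, the likelihoods attached to each “yes” or “no” answer are invented for the example, and the answers are treated as independent of one another. None of these numbers come from the comment itself, and the function and variable names are hypothetical.

```python
def update(prior, p_answer_if_call, p_answer_if_no_call):
    """Apply Bayes' rule once for the binary hypothesis 'a woman calls tomorrow'."""
    numerator = p_answer_if_call * prior
    evidence = numerator + p_answer_if_no_call * (1.0 - prior)
    return numerator / evidence


# ASSUMED likelihoods, invented for illustration only:
# answer -> (P(answer | a call happens), P(answer | no call happens)).
# A "yes" is ten times likelier if a call is coming, so it raises the belief a lot.
# A "no" is only slightly likelier if no call is coming, so it lowers it a bit.
LIKELIHOODS = {
    "yes": (0.30, 0.03),
    "no": (0.70, 0.97),
}


def revise(prior, answers):
    """Return the belief after each answer; each posterior is the next round's prior."""
    beliefs = [prior]
    for answer in answers:
        p_if_call, p_if_no_call = LIKELIHOODS[answer]
        beliefs.append(update(beliefs[-1], p_if_call, p_if_no_call))
    return beliefs


initial_prior = 0.1                  # the commenter's 1/10, read as a probability
answers = ["no", "no", "yes", "no"]  # hypothetical replies from the address book
beliefs = revise(initial_prior, answers)

for step, belief in enumerate(beliefs):
    print(f"after {step} answers: {belief:.3f}")
```

With these made-up inputs a “yes” moves the belief up sharply and a “no” moves it down slightly, matching the asymmetry the comment describes. Change the entries in LIKELIHOODS and the final figure changes with them, so the sketch shows the mechanics of updating and nothing about whether the chosen numbers are justified.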

    • I understand that. It was obvious to me when you made effectively the same comment regarding the election of Rand Paul. The problem remains the same: the numerical estimates you give me are still totally arbitrary. This is what I was getting at in the previous post:

      The jump from what Clark Glymour calls non-numerical “grades of belief” to what he calls numerical “degrees of belief” strikes me as totally arbitrary and, ultimately, pretentious. What we are doing is converting a relative non-numerical judgement into a numerical judgement and this gives an illusion of precision. The precision, however, is fake.

      We are giving people the impression that we have engaged in some statistical calculation but we have done no such thing. As I wrote in the above linked post:


      “Okay, so how do we come up with this numerical estimate that transforms what Glymour calls a non-numerical “grade of belief” into a properly numerical “degree of belief” between 0 and 1? Simple. We imagine that we are given the opportunity to bet on the outcome. Being given such an opportunity will then force us to “show our cards” as it were and assign a properly numerical degree of belief to some potential event.

      Let’s take a real example that I used in comments of my last post: what are the chances that a woman will call me tomorrow morning between 9am and 11am and can I assign a numerical probability to this? I would say that I cannot. “Ah,” the Bayesian will say, “but you can. We will just offer you a series of bets and eventually you will take one and from there we will be able to figure out your numerical degree of belief in probabilistic terms!”

      This is a similar process to a game teenage boys play. They come up with a disgusting or dangerous act and then ask their friend how much money they would want to do it. Through a sort of bargaining process they arrive at the amount that the person in question would undertake this act for. They then discuss amongst themselves the relative price put by each on said act.

      I think this is a silly load of old nonsense. The assumption here is that locked inside my head somewhere — in my unconscious mind, presumably — is a numerical degree of belief that I assign to the probability of an event happening. Now, I am not consciously aware of it, but the process of considering possible wagers brings it out into the open.

      Why do I think that this is nonsense? Because I do not believe there is such a fixed degree of belief with a numerical value sealed into my skull. Rather I think that the wager I eventually accept will be largely arbitrary and subject to any number of different variables; from my mood, to the manner in which the wagers are posed, to the way the person looks proposing the wager (casinos don’t hire attractive women for nothing…).

      Back to my example: what are the chances that a woman will call me tomorrow morning between 9am and 11am? Well, not insignificant because I am supposed to be meeting a woman tomorrow morning at 11.30am. Can I give this a numerical estimate? Well, it certainly would not be 0.95. Nor would it be 0.0095. But to ask me to be any more accurate would be, in my opinion, an absurd undertaking. And if you convinced me to gamble on it the wager I would be willing to accept would be extraordinarily arbitrary.”

  3. Matheus Grasselli says:

    The first thing is to clarify what is NOT a reason to use numbers in general and probabilities in particular. Namely the betting/Dutch book example you gave. When I (or any Bayesian) say that the meaning of the assigned probabilities is the odds that one would place on a bet, that’s NOT a reason to assign those probabilities in the first place. It’s just the operational meaning, to distinguish it from that assigned by frequentists, namely the limit of the frequencies of a given event when you repeat an experiment infinitely many times. So one can joke about Bayesians placing bets on everything they see, but that’s definitely not WHY they come up with the probabilities to begin with (even though it’s an accurate operational definition). Likewise, it’s not the fear of being systematically robbed by the world that compels them to use those probabilities in practice, because the actual instances of real bets are limited to a small number of gamblers, and the circumstances of the Dutch book example are so extreme that they are almost never encountered in practice.

    The real reason to use numbers in general (and probabilities in particular) is that they help to make sense of the world around us. And this is because reality is a hybrid mix of phenomena, some exhibiting statistical regularity (e.g. ergodicity, but there are many other types of regularity), some not. It might even be the case that MOST phenomena do not exhibit any regularity at all, but the fact is that some do, and very often (I’d even say almost all the time) we need to consider both types of phenomena together.

    The problem with frequentism is that it can ONLY deal with phenomena that are highly regular. The problem with rejecting numbers altogether is that it doesn’t incorporate the (undeniable) existence of regularity. The beauty of Bayesianism is that it allows you to incorporate both.

    The woman-calling example can be entirely laid out without numbers, exclusively with “degrees of belief” that get weaker or stronger as evidence comes forward. Think of a colour gradation, or the position of a dial, or don’t think about any metaphor at all, just stick with the difference between weak and strong belief and the sense of it getting weaker or stronger. You might choose to assign numbers and they’ll be arbitrary as long as all the evidence you are gathering is also qualitative (e.g. yes-or-no answers) and does not have any statistical regularity.

    The purpose of assigning numbers is to deal with evidence that is quantitative in nature and comes from phenomena with some degree of regularity. It is only when you have to combine both that you need to use numbers, and in this case the numbers will be anchored by the bits of data that are quantitative.

    Take Nate Silver’s election-prediction model. It incorporates both quantitative data with a lot of regularity (polls, etc.) and tons of qualitative stuff, for which he has to come up with numbers just so that they can be incorporated in the model. Even though the priors that he assigns to these bits of the model are not based on statistical regularities, they get meaningfully integrated with other bits of data that are, and in the end produce a probabilistic estimate for the outcome of the election that is far from being pretentious, arbitrary, or meaningless.
