Systematic Detection of Housing Bubbles – Part I: Z-Scoring and Its Problems

In a recent piece in Newsweek that got some attention, I made the case that the United States is currently experiencing a housing bubble. The next logical question is obvious: are other countries? After all, the 2008 meltdown was a global crisis; the US was not alone in its housing bubble.

To detect housing bubbles, ideally we would like some sort of systematic framework that we can deploy. The problems with using such a framework when it comes to housing bubbles, however, are not widely appreciated. In this post I am going to highlight some of the core problems, and in the next post I will propose a remedy.

The most obvious approach to building a systematic housing bubble detection framework is to get access to a widely published time series for multiple countries and then subject it to standard-scoring (Z-scoring). Standard-scoring works by telling us how unusual a datapoint is relative to the other datapoints in the time series: we subtract the mean of the series from the datapoint and divide by the standard deviation of the series.
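As a rough illustration of the mechanics, here is a minimal Python sketch of standard-scoring a single series against its own full-sample mean and standard deviation. The function name `z_scores` and the index values are made up for illustration and are not OECD data; the choice of the population standard deviation is also an assumption, since the post does not say which variant was used.

```python
from statistics import mean, pstdev


def z_scores(series):
    """Standard-score every point against the full-sample mean and std dev."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("series has zero variance, z-scores are undefined")
    return [(x - mu) / sigma for x in series]


# Hypothetical real house price index, invented for illustration only
index = [100, 101, 99, 102, 100, 103, 101, 104, 120]

# The reading a bubble screen would look at is the z-score of the latest point
latest_z = z_scores(index)[-1]
print(f"latest z-score: {latest_z:.2f}")
```

The reading for each country is simply the last element of this list computed on that country's own history, which is why the result depends so heavily on how volatile that history was.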

So long as we are comfortable with manipulating large datasets, this should be fairly straightforward to do. The OECD publishes solid databases on a variety of housing price metrics for most developed countries. This data mostly goes back to the 1990s, so the sample should – in theory – be large enough to subject to standard-scoring. We will use real house prices, as they are built from the two most internationally comparable datasets (house prices and CPI).
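For readers who want to see what "real house prices" means in practice, the following is a minimal sketch that assumes a real index is simply the nominal index divided by CPI and rebased to 100. The helper `real_index` and the numbers are hypothetical; the published OECD series may use a different deflator or rebasing convention.

```python
def real_index(nominal, cpi, base=0):
    """Deflate a nominal house price index by CPI and rebase to 100 at `base`."""
    if len(nominal) != len(cpi):
        raise ValueError("nominal and cpi must have the same length")
    deflated = [n / c for n, c in zip(nominal, cpi)]
    return [100 * d / deflated[base] for d in deflated]


# Hypothetical figures: nominal prices rise 10% a year, CPI rises 2% a year
nominal = [100.0, 110.0, 121.0]
cpi = [100.0, 102.0, 104.04]

real = real_index(nominal, cpi)
print([round(r, 1) for r in real])
```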

Here is what the housing markets for major developed countries look like when we apply this technique.

At first glance this looks good: if my argument in the Newsweek article is correct, then we know that the US is experiencing a housing bubble and that seems to show up here as a 2.5 standard deviation event. But to take this at face value would be enormously misleading. We can see why if we take some of these time series raw and plot them.

Let us look briefly at three separate countries with very different housing market dynamics. We will take Germany, which has not experienced a bubble and a crash since our data begins; Ireland, which experienced possibly the largest housing bubble in history in 2006-08; and the US, which had a large housing bubble in 2006-08 and which we suspect to be in a bubble again.

Let us comment country-by-country and compare these more intuitive readings with the more abstract standard-scoring readings above.

Germany’s housing market looks somewhat overvalued relative to history, but it would be a stretch to call it a full-on bubble. Yet if we turn back to the standard-scores, Germany looks like the worst bubble in our entire dataset – the country’s housing market is experiencing a 3-standard deviation event (something that would have roughly a 0.13% chance of happening if the data were normally distributed!). The reason that this is happening should be obvious to those who understand the mathematics of standard-scoring: since Germany’s housing market has been so ‘boring’ for the past 30 years, its standard deviation is small, so even a secular uptrend in prices will stand out as a highly unusual event.
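The probability attached to a 3-standard deviation reading can be checked with a couple of lines of Python. This sketch assumes a one-sided tail of a standard normal distribution, which trending house price data will not necessarily follow, so the figure is best read as a nominal benchmark rather than a true probability. The function name `upper_tail_probability` is just illustrative.

```python
import math


def upper_tail_probability(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


p_three_sigma = upper_tail_probability(3.0)
print(f"one-sided 3-sigma tail: {p_three_sigma:.4%}")
```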

Ireland’s housing market went nuts in the last cycle. We can see the ‘biggest bubble in history’ clearly: real house prices increased around 300% from their 1990s levels! If we look at where Ireland is today, it appears to be in a bubble once again – albeit one not quite as bad as last time. Yet if we turn back to our standard-scores, Ireland is near the bottom of the list with only a 1.2 standard deviation event. Once again, if we understand the mathematics we will readily know what is going on: because Ireland has had such a wild ride in the past, its standard deviation is large, so datapoints that should register as highly unusual do not.

The United States is somewhere in the middle. We see the housing bubble clearly in the data. It is, as I wrote in the Newsweek piece, as bad as the last bubble. And lo and behold, this registers perfectly well on our standard-scoring framework as a 2.5 standard deviation event.

These examples allow us to draw a general inference about using standard-scoring for housing bubble detection: standard-scoring only really works when the volatility – i.e. the standard deviation – of the datasets under examination is relatively similar.
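The volatility point can be reproduced with two toy series. The sketch below uses invented numbers, not the German or Irish data: a ‘quiet’ market that drifts up about 20% late in the sample, and a ‘boom-bust’ market that has already been through a huge cycle and is now well above its starting level again. Both are scored against their own history using the population standard deviation, which is an assumption about method.

```python
from statistics import mean, pstdev


def latest_z(series):
    """Z-score of the last observation relative to the whole series."""
    return (series[-1] - mean(series)) / pstdev(series)


# Hypothetical 'boring' market: flat for years, then a modest secular uptrend
quiet = [100, 101, 100, 99, 100, 101, 100, 99, 100, 101,
         103, 106, 110, 115, 120]

# Hypothetical 'wild' market: tripled, crashed, and is climbing steeply again
boom_bust = [100, 120, 150, 200, 260, 300, 280, 220, 170, 150,
             160, 180, 200, 220, 240]

z_quiet = latest_z(quiet)
z_boom = latest_z(boom_bust)
print(f"quiet market:     {z_quiet:.2f}")
print(f"boom-bust market: {z_boom:.2f}")
```

In this toy setup the quiet market scores well above 2 while the boom-bust market scores below 1, even though the second series ends 60% above its post-crash trough. The ranking reflects each market's past volatility more than how stretched prices are today.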

But, to paraphrase Tolstoy, since all housing markets are bubbly in their own specific way, the standard-scoring framework – applied naively – fails completely when trying to systematically detect bubbles in various housing markets. There are lessons to be learned here about deploying standard-scoring generally, I should think!


About pilkingtonphil

Philip Pilkington is a macroeconomist and investment professional. Writing about all things macro and investment. Views my own. You can follow him on Twitter at @philippilk.