on 28 November 2012
Mr Silver clearly knows what he is talking about, but I'm less sure he knows how to talk about it. I assume he set out to write a chatty, non-challenging book, but the result is light on substance and structure.
The Nobel prize-winning physicist Niels Bohr famously said, 'Prediction is very difficult, especially if it's about the future'. This pretty much sums up the first half of the book. Yes, the detail about the financial crisis, weather forecasting, earthquakes etc. is mildly interesting, but in relation to prediction, you will be wading through a lot of noise to extract the signal ('human nature makes us over-confident predictors', 'without either good theory or good empirical data, you may as well just guess', 'the most confident pundits are usually the worst' etc.).
The substance of the book comes in twenty pages in the middle, where Silver introduces Bayesian logic (which I learnt in maths classes at school when I was fourteen, so it wasn't new to me, and it doesn't need 200 pages of build-up). The best section is where Silver contrasts Bayesian logic with Fisherian logic. Fisher created the maths that is used almost universally in medical and social science research to prove the efficacy of a treatment or theory. Silver explains how flawed this maths is - which is presumably why two thirds of the positive findings claimed in medical journals cannot be replicated. This is pretty heady stuff.
Silver claims that the second half of the book is about how to make predictions better. It is mostly more examples of failure, this time in chess, investment, climate and terrorism, with a few asides that might be considered signals ('testing is good', 'groups/markets tend to make better predictions than individuals'). The exception is the section on poker, which delivers the strongest message in the book: good gamblers think in probabilities (rather than dead certs) - when these probabilities diverge from the odds on offer by a suitable margin, they may place a bet. Bad poker players lose a lot more than good poker players make. The best is the enemy of the good...
Of course, the point of the book is that there is no silver bullet - good prediction requires detail, nuance, hard work, honesty and humility. It would be wrong to expect a check list for success at the end, and naturally, there isn't one. Even so, you are left with a craving for clarity.
'The Signal and the Noise' is a pleasant enough read, but it is mostly anecdote. Rather ironically, you are left to sort out the signal from noise yourself.
on 8 December 2012
This is a book about prediction and the use of statistics to forecast future events such as earthquakes and the outcome of elections. When it's good, it's a lucid and enjoyable read which makes some important points about the art of prediction, with the chapters on political punditry and economic forecasting standing out as especially good. Unfortunately it is let down by a number of problems. These include the interminable and really quite tedious chapters on poker, baseball and chess (I really don't know why the chess one is in the book at all), and the inclusion of a number of serious errors and misconceptions in the chapter on epidemiology. This last is a subject that I think I have some knowledge of, and it's disturbing to see straightforward and important factual errors. The definition of the basic reproductive rate used is badly wrong, for example (if anyone's interested, the correct definition is that it is the number of new infections produced by a single infectious host *in a population of completely susceptible hosts*), and the interpretation is also wrong (it's not correct that any disease with a basic reproductive rate >1 will go on to infect all susceptible hosts in the population). These are not nit-picking little errors - it's the equivalent of getting the definition of interest rates seriously wrong in a discussion of economics. These are fundamental concepts, and the errors tell us that the author did not properly understand the subject he's writing about.
The use of mathematical models in epidemiology is also portrayed in a misleading way: no-one would ever expect a simple SIR model to produce useful predictions about disease spread in a real-world population, rather these simple models are used in an exploratory way to help us understand the theoretically possible behaviours of such diseases (something which is mentioned at the end of the chapter as a bit of an afterthought). It would be better if this chapter looked at some examples where modelling has been useful in the management of disease, such as the fairly dramatic story of how models of foot-and-mouth disease spread in the UK were used to persuade the government to change policy and bring the army in to help deal with the disease during the 2001 outbreak.
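The reviewer's point about the basic reproductive rate can be checked with the simple SIR model mentioned above. This is a minimal sketch of my own, not anything from the book; the parameter values (beta, gamma) are illustrative assumptions chosen so that R0 = beta/gamma = 3.

```python
# Minimal SIR (Susceptible-Infected-Recovered) sketch, integrated with
# simple Euler steps. beta (transmission rate) and gamma (recovery rate)
# are illustrative values; R0 = beta / gamma.

def sir(s, i, r, beta, gamma, dt=0.1, steps=1000):
    """Integrate the SIR equations forward and return the final state."""
    for _ in range(steps):
        n = s + i + r
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
    return s, i, r

# A population of 1,000 with one initial case and R0 = 3:
s, i, r = sir(999.0, 1.0, 0.0, beta=0.3, gamma=0.1)
# The epidemic takes off because R0 > 1, yet a fraction of susceptible
# hosts escapes infection entirely - illustrating the reviewer's point
# that R0 > 1 does not mean every susceptible host gets infected.
```

Even in this toy run, the final susceptible count stays well above zero, which is exactly the interpretation error the reviewer flags.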
A final comment is that the portrayal of Bayesian statistics as the only way to analyse data and the use of straw-man arguments to ridicule "frequentist" statistics and bash Fisher is getting really tired. I use both Bayesian and "frequentist" stats in my research and I am able to understand that both are useful under different circumstances and both have advantages and disadvantages. I suspect that if Silver had some experience with working in experimental science, for example, he would have a better appreciation for when conventional statistical analysis is a useful tool.
on 25 November 2012
Silver has some good ideas, and he is to be commended for scrupulously footnoting his references, but there are some mistakes (the "cows would rate this" quote came from an S&P analyst, not Moody's) and he utilises heuristics he criticises elsewhere (lazily claiming the industrial revolution happened, just like that, in 1775, with the excuse that "it is a nice round number").
My two main criticisms for non-American readers are that it is quite US-centric (I don't care about baseball, and the general moneyball story is impossible to avoid) and that the main philosophical stuff (which was most useful and interesting to me) makes up a small portion of the book, the majority being various examples where he makes the same arguments through interviews with different people that are somewhat unquestioning.
He gives some useful examples throughout the book, covering meteorology, earthquakes and the transmission of viruses, but it still feels as if it could have been cut. The stuff on Bayes is interesting but really skates over the issue of how you come up with a Bayesian prior when you can't iteratively improve it because you do not have many data points. Given the time he spends looking at the financial crisis, this is a flaw, as it reduces the "wow, Bayes is really useful" impact when it cannot offer that much resolution to the problem of predicting economic and financial crises, the key predictive failure he cites.
Even so, as a way of getting people to think a bit more deeply about what it is to make a prediction and how to know if it was well constructed, and how to integrate concepts of epistemology, it is a useful introductory book.
on 6 July 2014
If you are really into measuring signals in noise this isn't for you. If you want to be told, page after page, how brilliant the author is at, for example, football (US) statistics then you will find something of interest. Don't expect to find any useful information on regression; Bayes; predictor-corrector; Kalman; entropy; ... or just about anything to do with prediction.
on 15 December 2012
This is a fascinating exploration into statistical modelling. Okay, that may not be the most enticing reason to read a book you have ever been given, but here's the deal. The author takes an approachable, narrative and witty approach to examining the successes and, more often, failures of predictions based on the sort of statistics that get bandied about on the news channels 24/7. He offers insight into the causes of the financial crisis and shows why we sleepwalked into an avoidable catastrophe. He explains how far you can trust a weather forecast (about five days) and what to take into consideration when using it. He analyses subjects as diverse as baseball scouting, pandemic scares, earthquake prediction and why Deep Blue beat Garry Kasparov at chess. More importantly, he presents the subject with a minimum of maths, with all you need to know explained in simple terms. You won't walk away from this book with the ability to do stats, but you'll be better equipped to know how to treat them.
I am going to hazard my own prediction on this book. If you buy it, will you like it? That depends on how much you know about statistics - if you know a lot, then this book is unlikely to tell you anything new. And if you already know about Bayes' Theorem then the likelihood is that you will not find anything new in this book.
If however you have never heard of Bayes' Theorem and you don't have a background or training in statistics (like me), then the likelihood of you enjoying the book is greater. However, though I liked it, in the sense that overall I found it interesting, I only gave it three stars. Why?
First of all, the book is quite uneven on the topics it discusses. The chapters on the failure of ratings agencies to predict the sub-prime crash, on why it is so difficult statistically to predict earthquakes, or why weather forecasting has improved (and incidentally why private weather channels deliberately forecast a higher probability of rain than is actually merited) are good chapters. Others are only so-so, and feel like they have been padded out, like the chapter on economic forecasting, which does not end up saying anything that you don't already know (economists are rubbish at forecasting). And there are chapters that are just plain dull - like the ones on baseball and poker, for instance. Some chapters rate five stars for interest, others three stars and some just the one star. It's a very variable reading experience.
Second, the underlying range of ideas, despite the eclecticism of the topics discussed, seems to be quite narrow. The book seems to be saying that the difficulty of predicting any given event, from earthquakes to the outcome of elections, depends on the availability of data. With earthquakes, the big ones, the ones we really need to worry about, are hard to predict because we have observed or recorded so few of them. The problem is too few data. But baseball matches produce lots of data, and their outcomes are therefore easier to predict.
The book is valuable in that it seeks - rightly so - to counteract the tub-thumping certainties of media pundits. It is an important insight to realize that all predictions are about degrees of wrongness. But somehow these insights do not cohere into a sustained thesis. Instead the book reads like a collection of loosely organised chatty discussions.
I wanted to like this book because I admired the author's recent triumph with his successful prediction of the outcome of the 2012 US presidential election. But this book might have been improved with the chapters on baseball and poker pulled out and the rest of the book edited down. Then it would have served as a good introduction to the perils and pitfalls of forecasting. Then I would probably have given it five stars. But as it currently stands, I feel I can only give it three stars.
on 3 August 2014
Just because we’ve got unprecedented amounts of data available to us, it doesn’t mean we can always make good predictions. Weather forecasting and hurricane predictions have improved over the last ten years (the science and the modelling are understood; it was the sheer amount of number-crunching that was the problem). Other successes have taken place, in areas like spotting future baseball stars, chess programming (defined rule-sets) and political polling. But earthquake prediction is still impossible, and economic and financial analysis are still hopeless (too ill-defined, and they change as you observe them), despite the unreasonable and unreliable confidence with which they’re made. The book covers all these topics, along with climate change, terrorism and poker. It tells you why most TV pundits prefer controversy to accuracy (it gets them more notice), so are usually best ignored. Why weather forecasts longer than 6 days ahead are totally worthless. And why no-one predicted the crash (it wasn’t only that the mortgage-default probability model was faulty, but also that no-one was prepared to say so). In a manner familiar from celeb-presented TV documentaries, the author goes to visit various eminences, and notes their homes and hairstyles, while they tell him elementary things that he could’ve found out from their books or websites. (Such ‘interviews’ are mostly, I guess, about the author indicating his own importance, in that the celebs are willing to give time to a person of equal fame.) There’s the obligatory paean to the free market, in the midst of documenting its faults; and a plea to think more widely about terrorism probabilities (WTC7?). Overall, it’s an absorbing read – rigorous without being too technical, clear about its wide-ranging topics and usefully illustrated with graphs.
A few months ago, driving south on the A1, there was a sign informing me that the A1 was closed south of the A607. I had no way of knowing where that was, and, having already spent an hour heading due East from the M6, as a precaution turned back west toward the M1, to take a detour which cost me a further hour. I subsequently found out that the A607, at the time I had seen the sign, was so far south as to render the information about as useful as one telling me the Trans-Canada Highway was closed. This is the difference between information, which can tell you a lot without being particularly useful, and knowledge, which can help you make a decision.
Now more than ever we are bombarded with information, and somehow we have to make sense of it in order to make sense of life and make decisions based on what we predict the future will bring.
As Nate Silver, statistics wunderkind, prince of predictions, tells us in his enlightening book, in order to do so we need to be able to differentiate between signal and noise, and act on what the signal is telling us. "We think we want information," he says, "when we really want knowledge."
If only the Highways Agency could get that into their heads, I'd be a lot better served on the A1 in future.
Following a brief preamble taking in Caesar's failure to heed the right warnings, the wars emanating from Luther's ninety-five theses and the lack of a theory which would have helped avert 9/11, Silver's first target is the wide-ranging failure to predict the financial meltdown of 2007-8. In particular, the failure of the ratings agencies to understand the weakness of their models, based as they were on data formed almost exclusively from a boom period. Charles Wheelan, in Naked Statistics, makes a similar point about other models in operation at the time, singling out JP Morgan in particular.
He then takes in a number of other fields in which predictions have too often been based on noise rather than signal. In discussing television pundits he invokes Philip Tetlock's characterisation of people as either foxes, who know many things, or hedgehogs, who know one big thing; the foxes, believing in a plethora of little ideas and taking multiple approaches to problems, transpire to be the better forecasters. In discussing weather forecasting he expresses amazement that, given the complexity of the system with which forecasters are dealing, in which chaos theory rules, forecasters are now able to get so much right. And in exposing the failures of earthquake forecasting he discusses the problems of overfitting.
Having discussed the problems he introduces the principles of Bayesian theory and how they can help with filtering out noise. One of his illustrations of the principles is through an account of the "Poker bubble". My own personal takeaway from that chapter was "Don't play poker: you'll lose". The game, as with playing the stock markets, he later reveals, is based upon the presence of plenty of suckers who don't know what they're doing. Without them, there's no money to be made, and sooner or later, chances are you'll end up a sucker.
Throughout, Silver provides very clear explanations of his subject matter: Efficient Market Hypothesis, heuristics, agent-based models and many others. He is very good on the subject of climate change, where he emphasises the significance of the data, points out that many so-called "sceptics" don't actually question the fact of climate change, just what the true consequences will be, and cites William Nordhaus's argument that it is precisely the uncertainty of climate forecasting that is the reason for action. He also makes a far better fist of differentiating accuracy and precision than did the aforementioned Charles Wheelan.
Overall, then, a useful, instructive and entertaining read. Unfortunately, in a book about forecasting, the word "forecasted" crops up a lot, a construct I'd accept from children but for grown-ups "forecast" suffices as past tense and past participle. If baseball isn't your thing, you have to cling very hard to academic detachment in chapter 3 in order to stay engaged. And contrary to what Silver says, Jack Bauer did not prevent a nuke from being detonated in LA: one of the five he was chasing in Series 6 destroyed Valencia.
The Signal and the Noise is a book about statistics, designed to be read by those of us who are not statisticians. It was written by Nate Silver, something of a minor celebrity in US political circles after he used the methodology of statistics to correctly predict the outcome of two US presidential elections in a row (and do so to a startling degree of detail). In this book, he attempts to lay out some knowledge of statistics for the layman, and does so by wrapping it inside discussions of what statistics tell us about various interesting topics - weather prediction, earthquake monitoring, poker, and the global finance crisis all make an appearance.
Each of the chapters is well presented, and impeccably referenced, but also comes across in an accessible, breezy style. The text has clearly been organised and written with the casual, non-mathematics-centric reader in mind. The discussions of using statistics and probability in real world examples brings to mind the similarly punchy Freakonomics. The examples are interesting, draw the reader in, and say intriguing things about the world that we inhabit - and are occasionally quirky (my favourite, the discovery that The Weather Channel deliberately overpredict the chance of rain on any given day, to avoid angering punters when a sunny day turns to rain instead).
Drawbacks: It's a little US-centric, with some time devoted to baseball statistics, and the mechanics of betting on basketball, which have less impact outside a country where those are central sports. That said, it's not too bad, and most of the chapters are easily readable outside a US-centric context.
The text is also not an in-depth statistics book. If you're already interested in the field, you probably know all of this already; it's a solid primer for those of us with an interest in probability, levels of risk and formal logic, but I don't imagine it offers serious students of the topic anything they wouldn't find in an introductory textbook. I suspect, though, that the target audience here is precisely those in need of an accessible primer.
It's a clever, intellectually rigorous text, with a wide breadth (if perhaps less depth), and a good read, too.
on 25 July 2013
Nate Silver has shot to fame as the oracular figure who decoded political polling data into plain English and successfully predicted the US election. His debut book brings him back down to earth, using familiar examples as diverse as moneyball and warfare to demonstrate the sore lack of and need for better prediction in our lives, and the path to improvement through critical thinking and Bayesian reasoning.
Each chapter uses a particular area of prediction to teach broader lessons. The book opens with great momentum, using the financial crisis as a set of unambiguous examples of how not to make predictions before drilling into the all-too-human reasons that political commentators make poor election forecasts. There are good lessons here about how the need to feel confident and a single-minded focus on a few issues can lead one astray; he turns back to the financial crisis to emphasise the same failures there.
It's not all about the human factors, though, and Silver then turns to "moneyball" - statistics-based sports recruitment - to provide an overview of the more technical aspects of the art of prognostication. The idea of a predictive model is well articulated and applied to common-sense issues with surprising complications. With the reader warmed up, he spends several chapters digging into the fundamental reasons why level-headed and critically thinking scientists are unable to predict earthquakes. Some things - weather, disease, tectonic plates - are inherently challenging to forecast for interesting reasons, and he is equally quick to emphasise the technical traps that researchers can fall into in building their models.
The heart of the book, however, is Bayesian reasoning: the idea that we should take new predictions as adjustments to whatever our existing prediction said, as a sort of rolling improvement to our models. As a simple illustration, a test result indicating that one may have a rare disease should be combined with the low probability that one had the disease before the test results were in. Even if the test is 95% accurate, if the disease only affects one in a million people then the odds are far, far lower than 95% that one actually has the condition.
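The arithmetic behind that disease-test illustration can be checked directly with Bayes' rule. This is my own sketch, not code from the book, and it assumes the 95% figure applies as both sensitivity and specificity of the test:

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' rule."""
    true_pos = sensitivity * prior            # sick and correctly flagged
    false_pos = (1 - specificity) * (1 - prior)  # healthy but flagged anyway
    return true_pos / (true_pos + false_pos)

# One-in-a-million disease, "95% accurate" test:
p = posterior(prior=1e-6, sensitivity=0.95, specificity=0.95)
# p comes out around 0.002% - a positive result barely moves the needle
# when the disease is this rare, which is exactly Silver's point.
```

The false positives from the vast healthy population swamp the handful of true positives, which is why the posterior lands so far below 95%.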
This is the tool Silver uses in the latter half of the book to show the way to better predictions, while still taking the time to illuminate other forecasting challenges. Whether it's poker or chess, the stockmarket or the battlefield, making a good model and refining it with new data is the key to victory. He lays out how the problems arise in these fields, be it a new raft of human frailties or the hefty challenge of trying to beat the "wisdom of the crowds", sets out how these failures in prediction can be capitalised on by good agents or bad, and suggests Bayesian solutions.
A chapter on climate change in a book aimed at those in big business has a huge potential to be a train wreck, but Silver manages to weave a fairly acceptable course through the problem. This chapter acts to draw the book together, forcing together issues of complex models, noisy new data, and incentives to mislead, with Bayesian reasoning as the knight in shining armour. The overall theme is that climate models are difficult to make for fundamental reasons, and the warming consensus that has come out from those models has stood up to new results - despite the claims of think tanks who wish it otherwise.
This section has annoyed commentators on both sides of the issue. Silver manages to make good points without falling into the many huge rhetorical traps that the denialist movement has laid in any writer's path, but he's never particularly strong on the issue either. I liked the unspoken conclusion that less-confident predictions - 95% confidence rather than 99%, say - are more resilient to contradictory data in a Bayesian world, and Silver does not make false equivalencies and is unambiguous in supporting global warming. However this is not a strong introduction into climate science, or a real challenge to many of the incorrect claims made by denialists.
Truth be told, this is a deliberate stylistic choice and a potential issue throughout the book. Silver avoids bringing in controversies in the fundamental results that feed forecasts, except where it is directly relevant to a chapter's lesson. In the section on the financial crisis, human incentives are raised as a source of bias, but the humans responsible are hardly taken to task. If you want to find out about the failures of reasoning that permitted the 9/11 attacks, you'll have to read elsewhere. (Donald Rumsfeld appears but only as a lead into the "unknown unknowns" idea.) The implications of Scott Armstrong's work with the notoriously vociferous anti-climate-change Heartland Institute are left for the reader to find out about on their own.
This will variously come across as refreshingly expedient, frustratingly wishy-washy, focussed or cowardly depending on your reading preferences and ideological views. Consider yourself forewarned and take the book on its own terms.
The Signal and the Noise is certainly cleanly written and well-structured. Silver's introduction sets the book up as a toolbox, first outlining the failures of prediction and their causes before moving onto the successes and the processes that enable them, but in truth he allows the book to digress around the broader themes raised in each chapter, be it the problems and benefits of the "wisdom of the crowds" or the failure to properly, quantitatively account for the uncertainty in the prediction. These digressions are brief and enlightening, and echo back and forth between the chapters to make a more cohesive whole.
With the aforementioned caveat this is a superb route into the whole issue of modeling and forecasting. It's accessible, clearly written, technically sound and meticulously reasoned. It's recommended as reading on a difficult subject, although it's probably not going to prove to be the definitive work.
(If you want a primer on thinking about statistics before you dig into this, I strongly recommend Darrell Huff's "How to Lie with Statistics". It's inexpensive, funny, brief, and makes a good companion piece.)