Paper on experts, non-experts, and forecasting accuracy

This paper is super cool. I have not read it in full yet — just came across it:

When it comes to forecasting future research results, who knows what? We have attempted to provide systematic evidence within one particular setting, taking advantage of forecasts by a large sample of experts and of non-experts regarding 15 different experimental treatments. Within this context, forecasts carry a surprising amount of information, especially if the forecasts are aggregated to form a wisdom-of-crowds forecast. This information, however, does not reside with traditional experts. Forecasters with higher vertical, horizontal, or contextual expertise do not make more accurate forecasts. Furthermore, forecasts by academic experts are more informative than forecasts by non-experts only if a measure of accuracy in ‘levels’ is used. If forecasts are used just to rank treatments, non-experts, including even an easy-to-recruit online sample, do just as well as experts. Thus, the answer to the who part of the question above is intertwined with the answer to the what part. Even if one restricts oneself to the accuracy in ‘levels’ (absolute error and squared error), one can select non-experts with accuracy meeting, or exceeding, that of the experts. Therefore, the information about future experimental results is more widely distributed than one may have thought. We presented also a simple model to organize the evidence on expertise. The current results, while just a first step, already draw out a number of implications for increasing accuracy of research forecasts. Clearly, asking for multiple opinions has high returns. Further, traditional experts may not necessarily offer a more precise forecast than a well-motivated audience, and the latter is easier to reach. One can then screen the non-experts based on measures of effort, confidence, and accuracy on a trial question.

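“Levels” basically means predicting how large each treatment’s effect actually is (scored by absolute or squared error), whereas “order” just means ranking the treatments from best to worst. If all you need to know is what’s better and what’s worse, experts do no better than non-experts. But if you need to know how much better, experts outperform non-experts on average, although the paper notes that well-screened non-experts can match them.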
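To make the levels-versus-order distinction concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are not data from the paper. The function names are my own, and Spearman rank correlation is just one reasonable way to score "order"; I have not checked which rank measure the paper uses.

```python
from statistics import mean

def mean_absolute_error(forecast, actual):
    """'Levels' accuracy: average distance between forecast and outcome."""
    return mean(abs(f - a) for f, a in zip(forecast, actual))

def ranks(values):
    """Rank positions (1 = smallest). Assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def rank_correlation(forecast, actual):
    """'Order' accuracy: Spearman rank correlation (no-ties formula)."""
    n = len(actual)
    d2 = sum((rf - ra) ** 2 for rf, ra in zip(ranks(forecast), ranks(actual)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def crowd_forecast(forecasts):
    """Wisdom-of-crowds forecast: average the forecasts treatment by treatment."""
    return [mean(column) for column in zip(*forecasts)]

# Made-up numbers for illustration only, not data from the paper.
actual = [10, 20, 30, 40]
right_order_wrong_levels = [30, 40, 50, 60]  # perfect ranking, off by 20 everywhere
close_levels_wrong_order = [12, 18, 41, 39]  # small errors, but swaps the top two

order_a = rank_correlation(right_order_wrong_levels, actual)
levels_a = mean_absolute_error(right_order_wrong_levels, actual)
order_b = rank_correlation(close_levels_wrong_order, actual)
levels_b = mean_absolute_error(close_levels_wrong_order, actual)

# Two noisy forecasters (contrived so their errors cancel when averaged).
individuals = [[0, 30, 20, 50], [20, 10, 40, 30]]
crowd = crowd_forecast(individuals)
individual_errors = [mean_absolute_error(f, actual) for f in individuals]
crowd_error = mean_absolute_error(crowd, actual)
```

The first forecaster is perfect on order and bad on levels, the second is the reverse, so which one counts as "more accurate" depends on the measure. The last part shows why aggregation can help: individual errors that point in different directions partly cancel in the average.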

What else predicts accuracy?

The revealed-ability variable plays an important role: the prediction without it does not achieve the same accuracy. Thus, especially if it is possible to observe the track record, even with a very short history (in this case we use just one forecast), it is possible to identify subsamples of non-expert forecasters with accuracy that matches or surpasses the accuracy of expert samples.
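A rough sketch of what screening on a single trial forecast could look like, again with made-up numbers. The `screen_by_trial` helper, the `keep_fraction` parameter, and the data layout are all hypothetical; the paper builds a prediction model using several variables, which this does not attempt to reproduce.

```python
from statistics import mean

def screen_by_trial(forecasters, trial_outcome, keep_fraction=0.5):
    """Keep the forecasters whose trial forecast was closest to the outcome."""
    ranked = sorted(forecasters, key=lambda f: abs(f["trial"] - trial_outcome))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]

# Made-up forecasters: one trial forecast plus forecasts for two later treatments.
forecasters = [
    {"name": "a", "trial": 48, "later": [21, 33]},
    {"name": "b", "trial": 90, "later": [60, 5]},
    {"name": "c", "trial": 52, "later": [19, 27]},
    {"name": "d", "trial": 10, "later": [2, 70]},
]

selected = screen_by_trial(forecasters, trial_outcome=50)
selected_names = [f["name"] for f in selected]

# Pool the selected forecasters' later forecasts, treatment by treatment.
pooled = [mean(column) for column in zip(*(f["later"] for f in selected))]
```

Here the screen keeps forecasters "a" and "c", whose trial forecasts were closest to the outcome, and then averages only their later forecasts.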

And who do experts think will be good at forecasting?

Figure 13 plots the beliefs of the 208 experts compared with the actual accuracy for the specified group of forecasters. The first cell indicates that the experts are on average accurate about themselves, expecting to get about 6 forecasts ‘correct’, in line with the realization. Furthermore, as the second cell shows, the experts expect other academics to do on average somewhat better than them, at 6.7 correct forecasts. Thus, this sample of experts does not display evidence of overconfidence (Healy and Moore, 2008), possibly because the experts were being particularly cautious not to fall into such a trap. The key cells are the next ones, on the expected accuracy for other groups. The experts expect the 15 most-cited experts to be somewhat more accurate when the opposite is true. They also expect experts with a psychology PhD to be more accurate where, once again, the data points if anything in the other direction. They also expect that PhD students would be significantly less accurate, whereas the PhD students match the experts in accuracy. The experts also expect that the PhD students with expertise in behavioral economics would do better, which we do not find. The experts correctly anticipate that MBA students and MTurk workers would do worse. However, they think that having experienced the task among the MTurkers would raise noticeably the accuracy, counterfactually.

I see this as broadly consistent with Tetlock: some people do better than others when it comes to forecasting (empirical judgment). Expertise does seem to help somewhat — but with caveats. And the people who do best are not always the ones you’d expect.
