Seven Guidelines for Better Forecasting

Nice summary by longtime colleague and arch argument mapper Tim van Gelder. “The pivotal element here obviously is Track, i.e. measure predictive accuracy using a proper scoring rule.” If “ACERA” sounds familiar, it’s because they were part of our team when we were DAGGRE: they ran several experiments on and in parallel to the site.
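(If “proper scoring rule” is new to you: it is a rule under which your expected score is best when you report your honest probability. Here is a minimal sketch of one common example, the Brier score; the Python snippet is our illustration, not anything from Tim’s post.)

```python
def brier_score(forecast_prob, outcome):
    """Brier score for a binary event: the squared gap between the
    forecast probability and the 0/1 outcome. Lower is better, and
    expected score is optimized by reporting your true belief."""
    return (forecast_prob - outcome) ** 2

# A 70% forecast scores well if the event happens, poorly if not:
print(brier_score(0.7, 1))  # ~0.09
print(brier_score(0.7, 0))  # ~0.49
```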

Tim van Gelder

“I come not to praise forecasters but to bury them.”  With these unsubtle words, Barry Ritholtz opens an entertaining piece in the Washington Post, expressing a widely held view about forecasting in difficult domains such as geopolitics or financial markets.  The view is that nobody is any good at it, or if anyone is, they can’t be reliably identified.  This hard-line skepticism has seemed warranted by the persistent failure of active fund managers to statistically outperform dart-throwing monkeys, or the research by Philip Tetlock showing that geopolitical experts do scarcely better than random, and worse than the simplest statistical methods.

More recent research on a range of fronts – notably, by the Good Judgment Project, but also by less well-known groups such as SciCast and ACERA/CEBRA here at Melbourne University – has suggested that a better view is what might be termed “tempered optimism” about expert judgement forecasting. This new attitude acknowledges that forecasting challenges will always fall on…

View original post 443 more words

Accuracy Contest: First round questions

The first round of questions has been selected for the new accuracy contest. Forecasts made on these questions from November 7, 2014, through December 6, 2014, will have their market scores calculated and added to each forecaster’s “portfolio.” The best portfolios, evaluated shortly after March 7, 2015, will win big prizes.

Continue reading

Candy Guessing

Our school had a candy guessing contest for Hallowe’en.  There were three Jars of Unusual Shape, in various sizes.

Jar 1: 108 Candies
Jar 2: 141 Candies
Jar 3: 259 Candies

The spirit of Francis Galton demanded that I look at the data.  Candy guessing, like measuring temperature, is a classic case where averaging multiple readings from different sensors is expected to do very well.  Was the crowd wise?  Yes.

  • The unweighted average beat:
    • 67% of guessers on Jar 1
    • 78% of guessers on Jar 2
    • 97% of guessers on Jar 3, and
    • 97% of guessers overall
  • The median beat:
    • 89% of guessers on Jar 1
    • 83% of guessers on Jar 2
    • 78% of guessers on Jar 3, and
    • 97% of guessers overall

Of the 36 people who guessed all three jars, only one beat the unweighted average: the top guesser had an overall error of 9%, versus 11% for the average, and two others came close at 12%. (One anonymous blogger guessed only one jar and was excluded from the analysis.) The worst guessers had overall errors greater than 100%, the worst being 193% too high.

The unweighted average was never the best on any single jar, though on Jar 3 it was off by only 1. (The best guesser on Jar 3 was exactly correct.)
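To make the comparison concrete, here is a minimal sketch of the kind of analysis involved. The individual guesses were not published here, so randomly generated guesses stand in for the 36 real ones; the percentages this prints are illustrative, not the results above.

```python
import random
from statistics import mean, median

TRUTHS = [108, 141, 259]  # true counts for Jars 1-3

def overall_error(guesses):
    """Average Absolute %Error across the three jars."""
    return mean(abs(g - t) / t for g, t in zip(guesses, TRUTHS))

# Hypothetical stand-ins: 36 guessers, each noisy around the true counts
random.seed(0)
crowd = [[max(1, round(random.gauss(t, 0.3 * t))) for t in TRUTHS]
         for _ in range(36)]

# Crowd aggregates, computed per jar
avg_guess = [mean(g[i] for g in crowd) for i in range(3)]
med_guess = [median(g[i] for g in crowd) for i in range(3)]

# Fraction of individual guessers each aggregate beats overall
for name, agg in [("unweighted average", avg_guess), ("median", med_guess)]:
    beaten = sum(overall_error(g) > overall_error(agg) for g in crowd)
    print(f"{name} beats {beaten / len(crowd):.0%} of guessers")
```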

The measure I used was the overall Average Absolute %Error. The individual rankings change slightly if we instead use the Absolute Average %Error, but the main result holds.
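Assuming Average Absolute %Error means the mean of the per-jar absolute percent errors, while Absolute Average %Error means the absolute value of the mean signed percent error (so that over- and under-guesses can cancel), the difference looks like this:

```python
TRUTHS = [108, 141, 259]  # true counts for Jars 1-3

def average_absolute_pct_error(guesses):
    """Mean of the per-jar absolute percent errors |g - t| / t."""
    return sum(abs(g - t) / t for g, t in zip(guesses, TRUTHS)) / len(TRUTHS)

def absolute_average_pct_error(guesses):
    """Absolute value of the mean signed percent error (g - t) / t."""
    return abs(sum((g - t) / t for g, t in zip(guesses, TRUTHS)) / len(TRUTHS))

# A guesser who runs about 10% high on Jar 1 and 10% low on Jar 2:
guesses = [119, 127, 259]
print(average_absolute_pct_error(guesses))   # ~0.067: both misses count
print(absolute_average_pct_error(guesses))   # ~0.001: the misses cancel
```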

US Flu Forecast: Exploring links between national and regional level seasonal characteristics

For the flu forecasting challenge (https://scicast.org/flu), participants are asked to predict several flu season characteristics at the national level and at the regional level (10 HHS regions). For some of the required quantities, such as peak percent influenza-like illness (ILI) and total seasonal ILI count, one may argue that national-level values have some relationship with the regional-level ones. In other words, participants may be led to believe that national-level statistics can be obtained from regional-level ones.
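As one illustration: for a linear statistic, the national value can plausibly be treated as a population-weighted average of the ten regional values (CDC’s weighted ILI is computed along these lines). The weights and readings below are hypothetical:

```python
# Hypothetical population shares for HHS Regions 1-10 (they sum to 1)
WEIGHTS = [0.05, 0.09, 0.10, 0.20, 0.17, 0.13, 0.04, 0.04, 0.15, 0.03]

def national_ili(regional_ili, weights=WEIGHTS):
    """Population-weighted average of the ten regional %ILI values."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(w * x for w, x in zip(weights, regional_ili))

# Hypothetical regional %ILI readings for one week
regional = [2.1, 3.4, 2.8, 4.0, 3.1, 5.2, 2.0, 1.8, 3.6, 2.5]
print(f"Implied national %ILI: {national_ili(regional):.2f}")
```

Note, though, that this only works for statistics that combine linearly: a quantity like peak %ILI does not, since regional peaks can fall in different weeks.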

Continue reading

Boo! Shadow forecasts on SciCast

In time for Hallowe’en, we’ve added Shadow Forecasts and other features to help show the awesome power of combo.

  1. Shadowy: when linked forecasts affect a question, we show “Shadow Forecasts” in the history and in the trend-graph legend.
  2. Pointy: the trend graph now shows one point per forecast instead of just the nightly snapshot.
  3. Chatty: Comment while you forecast.  (But not while you drive.)

Read on for news about upcoming prizes.

Continue reading

Q&A with SciCaster Julie J.C.H. Ryan

SciCasters represent a variety of communities – academics, professionals, enthusiasts, even students. Find out how one professor built SciCast into her curriculum – and led students by example.

Julie JCH Ryan

Meet SciCaster Julie J.C.H. Ryan, Associate Professor, Engineering Management and Systems Engineering, George Washington University.

Q: Why SciCast in the classroom?

I was intrigued by the potential and explored several alternatives with the George Mason folks.  I decided to use SciCast as a practical learning exercise for a tech forecasting course that I was teaching in the spring.  I provide opportunities for students to learn through guided experiences.  I integrate a lot of exercises in my classes so that students are engaged in active learning through incremental explorations of the material.

Continue reading