Feb 11, 2023

Market potential

One of the many challenges of teaching critical thinking is that you need to prepare learners for novel problems, with solutions that cannot be obtained by following fixed procedures. Or, as Daniel Willingham (2008) put it:

[Critical thinking has] three key features: effectiveness, novelty, and self-direction. Critical thinking is effective in that it avoids common pitfalls, such as seeing only one side of an issue, discounting new evidence that disconfirms your ideas, reasoning from passion rather than logic, failing to support statements with evidence, and so on. Critical thinking is novel in that you don’t simply remember a solution or a situation that is similar enough to guide you. For example, solving a complex but familiar physics problem by applying a multi-step algorithm isn’t critical thinking because you are really drawing on memory to solve the problem. But devising a new algorithm is critical thinking. Critical thinking is self-directed in that the thinker must be calling the shots: We wouldn’t give a student much credit for critical thinking if the teacher were prompting each step he took.

So what do you teach? How do you (transparently) assess performance on such an ill-defined task? And how do you know your teaching method is actually effective?

One of my personal interests in this regard is forecasting. Forecasting has been used to investigate the quality of reasoning in experimental settings, as in a study on the importance of informational diversity in group reasoning (Pescetelli, Rutherford & Rahwan, 2021). It offers a way out of the paradox that questions should be open-ended but answers should still be verifiable – you just check at a later stage whether the forecast came true. Yet there's a problem there, too – a critical thinker might be justified in stating that something is unlikely to happen, yet that doesn't mean it won't happen! To check reasoning quality, it would be better to have a quantified estimate of how likely an event is.

To this end, I have been exploring the prediction markets listed on Manifold Markets. This website is like a stock exchange for predictions – users can spend play money to buy shares in specific predictions and make gains if their predictions come true. What distinguishes it from comparable initiatives, and from estimate aggregators like Good Judgment Open, is that you can create a market on anything you'd like. Besides all sorts of self-referential and in-crowd prediction markets, this has also led to many markets that make predictions about current events. For example, one market tracks overall sentiment on whether the UK will re-join the European Union before 2032.

Prediction markets are predicated on the idea that there's wisdom in the crowd. If shares in a specific market are overvalued, you can gain play money by betting against them, and vice versa. Besides markets that track the odds of specific events, there are also markets that estimate a specific number or a choice from multiple options.

Now if prediction markets work well, you could use them as benchmarks against which you could compare reasoner performance. For example, at the time of this writing, the users of Manifold estimate that there's a 15% chance of the UK re-joining the EU before 2032. Taken at face value, that means that any individual reasoner who thinks about this particular topic and arrives at a 15% chance has done really well.

But do the markets at Manifold work well? I am just not sure. The user base is not huge and I do not have the impression it is informationally diverse at this time. In fact, I am not convinced people are necessarily leveraging informational diversity even if it's there. Intuitive estimates are nice, but as far as I know forecasts improve with reasoning and deliberation, neither of which is happening much on Manifold. The incentive is to 'beat the market', which is not directly an accuracy motive. Yet the platform is pretty self-confident about its accuracy (operationalized as Brier Score) or otherwise pessimistic about this specific other forecasting initiative:

I will be keeping an eye on the performance of Manifold – if it does well, it would be cool to have students in a critical thinking class place online bets after deliberation, or to test whether using specific reasoning exercises leads to more or less convergence with Manifold markets.
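For readers unfamiliar with the Brier score mentioned above: it is the mean squared difference between forecast probabilities and what actually happened (1 if the event occurred, 0 if not), so lower is better and always answering 50% scores 0.25. Below is a minimal sketch of how one might score a student against a market and measure how far the two diverge. The function names, the extra questions, the student estimates and the outcomes are all hypothetical, made up for illustration; only the 15% figure comes from the UK market discussed above, and nothing here uses Manifold's actual API or data.

```python
from statistics import mean


def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities (0..1) and outcomes (0 or 1)."""
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have the same length")
    return mean((p - o) ** 2 for p, o in zip(forecasts, outcomes))


def mean_gap(reasoner, market):
    """Average absolute difference between a reasoner's and the market's probabilities."""
    if len(reasoner) != len(market):
        raise ValueError("both forecast lists must have the same length")
    return mean(abs(r - m) for r, m in zip(reasoner, market))


# Hypothetical data: three yes/no questions.
# The first market value (0.15) is the UK/EU estimate from the text;
# everything else is invented for the example.
market = [0.15, 0.70, 0.40]
student = [0.25, 0.60, 0.50]
outcomes = [0, 1, 0]  # assumed resolutions, not real results

print("Market Brier score: ", round(brier_score(market, outcomes), 3))
print("Student Brier score:", round(brier_score(student, outcomes), 3))
print("Mean student-market gap:", round(mean_gap(student, market), 3))
```

Note that the gap only says how close a student is to the market, not how accurate either of them is. That is why the benchmark idea depends on the market itself scoring well once questions resolve.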

References

Pescetelli, N., Rutherford, A., & Rahwan, I. (2021). Modularity and composite diversity affect the collective gathering of information online. Nature Communications, 12(1), 1-10.

Willingham, D. T. (2008). Critical thinking: Why is it so hard to teach? Arts Education Policy Review, 109(4), 21-32.
