Diversity in group reasoning

One popular concern about the information age has been the notion of filter bubbles. Put forward in a 2011 TED talk, the idea of the filter bubble was that personalized news feeds would cause people to live in distinct realities, something that could ultimately fragment society and throw a wrench in the works of democracy. A recent study on collective forecasting claims that like-minded individuals do indeed arrive at more similar judgments on the basis of their online research than members of more diverse groups do.

First things first: the evidence for filter bubbles affecting news consumption is not strong, to say the least (Zuiderveen Borgesius et al., 2016; Bruns, 2019). Real-world news diets remain diverse, and while some people selectively attend to partisan news sources, they very rarely avoid (or succeed in avoiding) other sources. Although these results offer no guarantees for the future, the filter bubble problem has thus far remained either a theoretical or a marginal phenomenon.

Yet the effects of different informational environments on collective decision-making may be more subtle. While people may generally consume more or less the same news, the way they discuss that news with others can lead to belief-biased processing, especially if such discussion takes place in an environment with low viewpoint diversity. It is certainly conceivable that small biases in personalized search and news are amplified in so-called echo chambers. To get a good idea of this, one would need to track the whole process of information gathering, reasoning, social deliberation and decision-making.

This is exactly what a team of Berlin-based media scientists set out to do, using an interesting experimental design (Pescetelli et al., 2021). In brief, they gathered a large pool of subjects and surveyed all sorts of personal information, such as sexual preference, geographical location, political position, etc. On the basis of this data, they clustered the different participants, creating distinct categories of like-minded individuals. This, in turn, allowed them to create subgroups for their experiment of varying size and diversity.

Each participant was given eight real-world binary forecasting problems, to which they could give a scaled response. The problems were about thorny subjects, suitable for critical analysis, that were taken from the Hybrid Forecasting Competition. Importantly, the problems did not have a definitive answer at the time of the study, but the accuracy of the predictions could be ascertained later. For example, participants were given the following question:

Before 8 September 2018, will Poland, Estonia, Latvia, or Lithuania accuse Russia of intervening militarily in its territory without permission?

Participants went through three stages: first they had 30 seconds to decide on an answer off the top of their head, then they had 90 seconds to search for additional information online and give a revised forecast, and finally they had 270 seconds to discuss the matter with their group members, using chat. At the end of this third phase, each participant gave a private forecast and a group consensus forecast, which had to be the same for each group member.

The researchers were interested in how group size and diversity would affect forecasting performance. They found that forecasts improved after group deliberation and that this was especially strong in diverse groups. Aggregating different groups further improved accuracy: having multiple small groups of about five people deliberate and then averaging their responses gave better results than having a single large group of about 25 people deliberate.
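The aggregation step can be made concrete with a minimal sketch. It assumes that each small group's consensus forecast is a probability between 0 and 1, that groups are combined with a plain unweighted mean, and that accuracy is measured with the Brier score (squared error against the 0/1 outcome). The scoring rule is an assumption made for illustration, and all numbers are hypothetical rather than data from the study.

```python
from statistics import mean


def brier_score(forecast: float, outcome: int) -> float:
    """Squared error between a probability forecast and a 0/1 outcome.

    Lower is better; 0.0 is a perfect forecast. Using this rule here is an
    assumption for illustration, not a claim about the study's method.
    """
    return (forecast - outcome) ** 2


def aggregate(forecasts: list[float]) -> float:
    """Combine independent forecasts by taking their unweighted mean."""
    return mean(forecasts)


# Hypothetical consensus forecasts from five small groups that
# deliberated independently of each other.
small_group_consensus = [0.10, 0.30, 0.20, 0.40, 0.25]

# Hypothetical consensus forecast from one large deliberating group.
large_group_consensus = 0.45

# Hypothetical resolution of the question: the event did not occur.
outcome = 0

pooled = aggregate(small_group_consensus)

print(f"pooled small-group forecast: {pooled:.2f}")
print(f"Brier score, pooled small groups: {brier_score(pooled, outcome):.4f}")
print(f"Brier score, single large group:  {brier_score(large_group_consensus, outcome):.4f}")
```

The made-up numbers only show how the comparison is computed. Whether pooled small groups actually beat a single large group is an empirical question that the study, not this sketch, addresses.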

Interestingly, the authors also found that participants did not have correlated beliefs about the problems when going into the study, not even if they were like-minded according to the survey. Initial predictions were varied and only started to become similar after the online search phase, just before the social deliberation.

In itself, this is not surprising. The forecasting problems referred to Hungary’s KDNP, the European Union’s Article 7.1 and the Pashtun Loya Jirga. Making predictions about these things without access to a search engine is pretty hard. However, the increase in belief correlation was stronger for like-minded individuals, meaning there is also something about their search strategy, their go-to news sources or the ways in which they draw inferences that led them to a particular conclusion. Note that 90 seconds is a really short time to research something: participants were not able to critically examine, compare or explore different sources, which may have reinforced their habitual way of seeking and evaluating information.

The like-minded participants then took these correlated beliefs into group deliberation, where they found agreement more quickly than diverse groups did if the groups were small, but took longer than diverse groups did when the groups were large. On the basis of the study results, this disparity is not easy to explain. However, in both cases the diverse teams came up with more accurate predictions.

All of this is still no strong evidence for a filter bubble, as it’s not necessarily personalized news or search that caused the initial convergence of beliefs for like-minded people. In fact, as there was no control group that researched the questions without searching the web, it’s difficult to say anything about the role of the online behaviour. However, it does show that seemingly independent (shallow) research is not always that independent after all, and this may come at the price of lower accuracy when making predictions, even in subsequent group reasoning.

When applied to critical thinking as critical talking, this study indicates that ideally, groups of reasoners would be diverse. However, in practice this may be unfeasible or even undesirable: if you want to improve the critical thinking abilities of a group of professionals, replacing half of them will often not be the preferred route. Moreover, the type of ‘trait diversity’ used in this study does not always map onto ‘informational diversity’ (De Oliveira & Nisbett, 2018). This is where the other findings come in: creating smaller ‘reasoning squads’ that engage in social deliberation independently from each other and subsequently aggregating their conclusions may offset some of the limitations of low overall diversity and can be a good way to boost critical reasoning in larger organizations.


Bruns, A. (2019). Filter bubble. Internet Policy Review, 8(4).

De Oliveira, S., & Nisbett, R. E. (2018). Demographically diverse crowds are typically not much wiser than homogeneous crowds. Proceedings of the National Academy of Sciences, 115(9), 2066-2071.

Pescetelli, N., Rutherford, A., & Rahwan, I. (2021). Modularity and composite diversity affect the collective gathering of information online. Nature Communications, 12(1), 1-10.

Zuiderveen Borgesius, F., Trilling, D., Möller, J., Bodó, B., De Vreese, C. H., & Helberger, N. (2016). Should we worry about filter bubbles? Internet Policy Review, 5(1).