
Frequently Asked Questions (FAQs) about using crowdsourced forecasting to “augment collective intelligence”

We might ask: “Will the 2022 Olympics be canceled?” One individual may think there’s an 80% chance, another a 65% chance, and another a 90% chance. The consensus would be an average of these forecasts. An individual can update their forecast at any time based on new information, and the consensus is constantly recalculated to reflect the latest forecasts. Individuals are also typically asked to provide a rationale for their forecast, giving decision-makers additional context for why a consensus forecast may be changing over time.
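To make the arithmetic concrete, here is a minimal sketch (in Python, not tied to any particular platform) of a consensus computed as the mean of each forecaster’s latest probability, recalculated whenever anyone updates:

    # Minimal sketch of a running consensus: the latest probability from each
    # forecaster is averaged whenever anyone submits or revises a forecast.
    # Names and numbers are illustrative only.

    latest = {}  # forecaster -> most recent probability (0.0 to 1.0)

    def update(forecaster, probability):
        """Record a new forecast and return the recalculated consensus."""
        latest[forecaster] = probability
        return sum(latest.values()) / len(latest)

    print(update("A", 0.80))  # 0.80
    print(update("B", 0.65))  # 0.725
    print(update("C", 0.90))  # ~0.78
    print(update("B", 0.70))  # B revises upward; consensus moves to 0.80

Other aggregation rules (a median, or a weighted mean) drop in the same way; the mean is used here only because the answer above describes the consensus as an average.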

In business and government, crowdsourced forecasting has proven effective at addressing problems in assessing uncertainty where traditional prediction methods cannot be deployed effectively. Sometimes historical data is lacking or unstructured, making predictive models difficult to construct. Experts hired to offer forward-looking analyses may introduce bias because of particular worldviews or misaligned incentives. And diversity of input is usually difficult to obtain because of inter- or intra-organizational politics, restrictions, and logistics.

Research funded by the U.S. Intelligence Community (IARPA), conducted over the past 10 years, has shown that 1) people can become more accurate forecasters over time; 2) various mathematical combinations of people’s forecasts -- weighted by factors such as prior performance -- can increase consensus forecast accuracy by as much as 30%; and 3) actively encouraging collaboration among the best forecasters makes them even more accurate.
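That research tested many aggregation methods; the exact formulas aren’t reproduced here, but a simple, hypothetical illustration of the idea is a weighted mean in which forecasters with better past accuracy (lower historical Brier scores) count for more:

    # Hedged sketch of a "mathematical combination" of forecasts: a weighted
    # mean in which better past performance (a lower historical Brier score)
    # earns a larger weight. The weighting scheme is illustrative, not the one
    # used in the IARPA research.

    def weighted_consensus(forecasts, past_brier):
        """forecasts: {name: probability}; past_brier: {name: historical Brier score}."""
        # Lower Brier score = more accurate, so invert it to get a weight.
        weights = {name: 1.0 / (0.01 + past_brier[name]) for name in forecasts}
        total = sum(weights.values())
        return sum(forecasts[n] * weights[n] for n in forecasts) / total

    forecasts = {"A": 0.80, "B": 0.65, "C": 0.90}
    history = {"A": 0.10, "B": 0.40, "C": 0.15}  # hypothetical past Brier scores
    print(weighted_consensus(forecasts, history))  # ~0.81 -- A and C outweigh B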

Organizations are using crowdsourced forecasting to better guide major strategic and operational decisions they are facing. For example, a corporate client might use crowdsourced forecasting to predict demand for a product in an emerging market where it lacks dependable historical data, and to decide how much of the product to produce. A government department focused on risk assessment might use it to quantify the likelihood of various operational risks by drawing on a group with diverse skill sets and expertise, and use the forecasts to inform resource-allocation decisions.

Various intelligence agencies around the world have deployed crowdsourced forecasting among their analyst communities; the UK Government in particular was recently featured in The Economist for its efforts. Companies across a wide array of industries, such as UBS (finance), AbbVie (pharmaceuticals), and Procter & Gamble (consumer products), have active forecasting efforts, along with university think tanks and non-profits.

A survey is meant to gauge opinions at a single point in time; the results represent a snapshot of people’s sentiments. Surveys are also typically designed to capture what people “like” or would “like to happen,” whereas crowdsourced forecasting asks people what they think will happen, whether they want it to or not.

Unless the same survey is run weekly or more often, it also cannot capture the real-time sentiment of a team the way crowdsourced forecasting can. For example, if news breaks that Russia’s ICBM program has had funding cuts, the forecast that the Sarmat missile will be finished this year may drop to 20%, and policymakers can adjust their short- and medium-term decisions accordingly. In contrast, a biannual survey run two months ago about Russia’s nuclear program would already be woefully out of date.

You shouldn’t; you should listen to both. The crowd consensus is a “check” against expert opinion alone. Research on crowd forecasting has--time and time again--shown that crowds produce more reliable forecasts. This is not to say they’re always accurate, just that they’re accurate more often. Similar research has shown that individual experts tend to be less reliable in their forecasting.

That said, expert judgment and crowdsourced forecasting do not have to be at odds. Experts are needed to identify the forecast questions themselves, for example. Experts may also add vital perspective to a consensus forecast and should be included as part of a larger crowd invited to forecast.

Decision makers have identified several benefits to their decision making:

  • Trends in the consensus forecast give insight into whether the status quo of a situation is changing;
  • The qualitative rationales typically provided alongside a quantitative forecast help explain why ground truth may be changing;
  • Crowdsourced forecasts may be more accurate, especially where modeling is unreliable because of a lack of a robust dataset or the variability of a situation;
  • Asking everyone to quantify their judgments can surface disagreement between cohorts of individuals. For example, you may learn that operational personnel expect one outcome while senior decision makers expect another; and
  • Constant availability of a proven, reliable signal reduces the need to fall back on informed assumptions or “going with your gut instinct.”

Crowdsourced forecasting is a general term for aggregating individual forecasts into a consensus forecast, but there are several methods for doing so. Prediction markets are one such method; probability elicitation is another. Today, organizations predominantly use probability elicitation as their primary method of forecasting, whereas prediction markets tend to be used in real-money settings such as cryptocurrency-based markets and government-regulated markets.
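As a rough illustration of the difference: probability elicitation aggregates stated probabilities directly, while a prediction market infers a probability from trading. The sketch below assumes a logarithmic market scoring rule (LMSR), one common automated market maker; the text above does not name a specific mechanism, so treat it as one example among many.

    # Probability elicitation vs. a prediction market, side by side.
    # The LMSR market maker below is an assumed mechanism for illustration.
    import math

    def elicited_consensus(probabilities):
        """Probability elicitation: aggregate stated probabilities directly."""
        return sum(probabilities) / len(probabilities)

    def lmsr_yes_price(shares_yes, shares_no, b=100.0):
        """LMSR instantaneous price of the YES share, readable as a probability."""
        e_yes = math.exp(shares_yes / b)
        e_no = math.exp(shares_no / b)
        return e_yes / (e_yes + e_no)

    print(elicited_consensus([0.80, 0.65, 0.90]))        # ~0.78, from stated beliefs
    print(lmsr_yes_price(shares_yes=120, shares_no=40))  # ~0.69, implied by trading

Both approaches end in a number that can be read as the crowd’s probability; they differ mainly in how beliefs are expressed (stated directly vs. revealed through trades).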

Crowdsourced forecasting has its origins in the work of Sir Francis Galton, a British statistician who pioneered many foundational statistical concepts. In 1906, Galton found that the median of weight guesses collected from a large group of people attending a fair came remarkably close to the true weight of an ox -- much closer than any individual’s guess (including the veterinarian’s and the farmer’s!). In the 1980s, the Iowa Electronic Markets were started at the University of Iowa to crowdsource predictions about presidential elections, and in 2011 the U.S. Intelligence Community’s IARPA funded a multi-million-dollar, four-year study to test the effectiveness of crowdsourced forecasting and identify the conditions under which it is most accurate.

The success of a poll usually depends on capturing a sample that accurately represents the voting public, the purchasing public, etc. Capturing that representative sample is often extremely difficult, if not impossible -- hence the uneven performance of polls. Crowdsourced forecasting does not derive its value from trying to represent the views of an entire population; instead, it leverages the insights of an informed group of people who are forecasting what will happen.

The best forecasters typically have these attributes:

  • Being actively open-minded and humble;
  • A willingness to dispassionately consider new ideas and information;
  • A collaborative spirit to leverage the sources and insights from others vs. a “go it alone” competitive mentality;
  • Dedication to regularly updating their assumptions and mental map of why they’re making the forecast they are;
  • An acknowledgement of their biases and an active effort to de-bias themselves as part of their forecasting process;
  • An ability to understand and process historical data or evidence to establish a base rate as a starting point for their views (a brief sketch of this follows the list).
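To illustrate that last point, a base rate is simply the historical frequency of comparable events, used as the starting probability before adjusting for case-specific evidence. The counts below are hypothetical:

    # Illustrative base-rate calculation: start from the historical frequency
    # of comparable events, then adjust for case-specific evidence.
    # The counts are hypothetical.

    similar_past_cases = 40       # e.g., comparable product launches observed
    cases_where_it_happened = 12  # how many had the outcome in question

    base_rate = cases_where_it_happened / similar_past_cases
    print(f"Base rate: {base_rate:.0%}")  # 30% -- a starting point, not the final forecast

    # A forecaster then nudges this up or down based on current evidence,
    # rather than anchoring on a gut number with no reference class.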

 

Many recipients of crowdsourced forecasts tell us that one of the most useful aspects, alongside the consensus forecasts, is the qualitative analysis that accompanies them. Why are judgments changing about an outcome? What sources are being cited for those changes? What arguments are being employed to justify forecasts? In addition, we can filter that qualitative output by cohort: What are the most accurate forecasters on topic X saying about why they’re changing their forecasts? How does one team’s justification for a forecast vary from another’s? At a minimum, this kind of analysis will challenge or confirm one’s own assumptions, which ultimately drives the quality of any analytic output.

A pundit making a prediction that “there will be a cold war between the US and China” is not supplying very decision-relevant information. Even if one can look past the vagueness of this forecast and determine whether the pundit was right or wrong, they are probably never going to have to answer for being wrong. Even senior decision-makers make forecasts without accountability, meaning they feel less pressure to be accurate.

In contrast, crowdsourced forecasting requires people to record a quantitative forecast, and that forecast is scored once the answer is known. Because every forecast is scored, a track record develops for everyone participating. For example, many people are currently interested in the potential for conflict in the South China Sea; one only has to look at social media or news channels to find thousands upon thousands of comments and predictions about the situation. With a crowdsourced forecasting approach, we’d structure a falsifiable question, have people enter their forecasts, and score them after a period of time based on ground truth. Someone with a verifiable track record is clearly far more valuable than the pundit or the anonymous commenter.
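As an illustration of scoring, the sketch below uses the Brier score, a standard accuracy measure for probabilistic forecasts (0 is perfect, higher is worse); the specific scoring rule an organization uses may vary.

    # Scoring a probabilistic forecast once ground truth is known, using the
    # Brier score (0 = perfect, higher = worse).

    def brier_score(forecast_probability, outcome_occurred):
        """Squared error between the forecast and the realized outcome (1 or 0)."""
        outcome = 1.0 if outcome_occurred else 0.0
        return (forecast_probability - outcome) ** 2

    # The event happens: a confident, correct forecast scores better (lower).
    print(brier_score(0.90, True))  # 0.01
    print(brier_score(0.50, True))  # 0.25
    print(brier_score(0.20, True))  # 0.64

    # A track record is simply the average score across many resolved questions.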

When rolling out a crowdsourced forecasting program, it’s important to gauge the culture of the community being asked to participate. Are they competitive? Collaborative? Hierarchical? Understanding the culture then drives the approach for creating trust and engagement. For example, in hierarchical organizations, we often stress the importance of teaming and collaboration, and the desire to contribute to a larger good vs. focusing on competition, where we run the risk of making people “look bad.” On a tactical level, participants are always anonymous to their peers, so the forecasting platform is often a “safe space” where people feel they can express their true beliefs.

Some crowdsourced forecasting efforts do indeed focus on specific metrics or very narrow topic areas. Others, however, are designed to inform a big-picture view of a strategic issue.

For example, let’s say we’re interested in understanding the impact of green trends on the oil and gas industry. Before launching any forecasting questions, we would first go through an “issue decomposition” to identify the pivotal factors that would push the outcome of the issue in one direction or another. We’d then identify signposts, or signals, that tell us which direction each pivotal factor is heading:

Strategic Scenario: Impact of green trends on oil & gas firm

Pivotal Factor: Unfavorable Policy & Regulation

SIGNALS

  • Will the EU announce in 2021 a ban of new sales of petrol and diesel vehicles starting within the next 10 years?
  • Will the Green Party receive the most votes in the 2021 German election?
  • Will the Line 5 pipeline in Michigan successfully be shut down?
  • Will the US announce increased regulations on coal power generation in 2021?

Pivotal Factor: Shifting Energy Production Patterns

SIGNALS

  • What percentage of US energy production will be from renewables in 2021?
  • What percentage of US energy production will be from coal in 2021?
  • Will the US install 2-gigawatts or more of new grid-level power storage batteries in 2021?
  • What is the likelihood that a manufacturer will sell commercial-grade solar panels with at least 24% efficiency in 2022?

Pivotal Factor: Increased Electric Vehicle Uptake

SIGNALS

  • How many total Fast Charge (>22kW) public charging points for electric vehicles will be installed in the European Union by 31 December 2022?
  • How many total Fast Charge (>22kW) public charging points for electric vehicles will be installed in the US by 31 December 2022?
  • Will Ford deliver more than 100K F-150 electric trucks to consumers in 2022?
  • What will be the 2022 industry-wide average cost of Li-ion batteries used in battery-powered electric vehicles?

Once people start forecasting on the individual signals, we can “recompose” the ongoing results to understand the likelihood of each pivotal factor occurring and, ultimately, the likely outcome of the strategic issue. And of course, the cohort-level analysis mentioned earlier would also be possible.
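The recomposition math isn’t prescribed above; one simple, hypothetical approach is a weighted roll-up of the signal forecasts into a factor score, and of the factor scores into an overall reading. All weights and values below are made up for illustration:

    # Hedged sketch of "recomposition": rolling signal forecasts up into a
    # pivotal-factor score, then rolling factor scores up into an overall
    # reading for the strategic issue. Weights, values, and the weighted-mean
    # rule itself are illustrative assumptions.

    def roll_up(items):
        """items: list of (probability_or_normalized_value, weight) pairs."""
        total_weight = sum(w for _, w in items)
        return sum(v * w for v, w in items) / total_weight

    # Consensus forecasts for the "Increased Electric Vehicle Uptake" signals,
    # normalized to a 0-1 scale where needed (all numbers are made up).
    ev_uptake = roll_up([(0.70, 1.0),   # EU fast-charge build-out on track
                         (0.55, 1.0),   # US fast-charge build-out on track
                         (0.40, 0.5),   # Ford >100K F-150 EV deliveries
                         (0.65, 0.5)])  # battery cost decline on track

    policy = 0.60            # rolled up the same way from the policy signals
    production_shift = 0.50  # rolled up from the energy-production signals

    overall = roll_up([(policy, 1.0), (production_shift, 1.0), (ev_uptake, 1.0)])
    print(f"EV uptake factor: {ev_uptake:.2f}; overall pressure on oil & gas: {overall:.2f}")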

The answer again depends largely on the culture of your organization, but in general, incentivizing participation is a blend of strategies:

  • Clearly articulate the goals of the program and why people are being asked to spend time each week making forecasts. Who will be receiving the forecasts? How will they be used as part of the decision-making process?
  • Stress participation in the program as a training opportunity, not just a one-way street;
  • Conduct offline events related to the topics being forecast. We often call these “Learning Labs”: we invite a subject-matter expert (SME) to present and take questions, and then participants “live forecast” together, sharing why they’re forecasting what they are and whether the SME’s presentation changed their views;
  • Offer rewards for positive behaviors, not just for being the most accurate;
  • Leverage (or disable) leaderboards and other ranking mechanisms depending on culture;
  • Create a network of “ambassadors” within the organization to promote the effort at a local level;
  • Encourage the formation of cross-functional forecasting teams so people have accountability and a group to share information and collaborate with; and
  • Have regular communication from senior leadership discussing their reactions to the forecasts and stressing their sponsorship and support of the activity.