Bowl of marshmallows

The marshmallow test, and the crisis of replicability

“Good things come to those who wait,” it has been said. So has “the early bird catches the worm!” Maintain self control, or carpe diem?

Have you heard of ‘the marshmallow test’?

The marshmallow test is a series of studies conducted in the 1960s and 1970s by psychology professor Walter Mischel, then at Stanford. In these studies a child was offered a choice between one small reward (e.g. a marshmallow, cookie or pretzel) immediately, or a larger reward (e.g. two) if they waited while the researcher left the room for around 15 minutes.

From follow-up studies Mischel concluded that children willing to wait longer for a bigger reward tended to have better life outcomes, as measured by educational attainment, body mass index and other life measures. In short, self control and patience correlate with positive life outcomes. It is a logical and plausible conclusion, and one he explored in his book of the same name.

Walter Mischel - The Marshmallow Test


In recent years many psychological studies have been put through replication attempts. Research validity depends heavily on replicability: the ability of a scientific experiment or trial to be repeated and produce a consistent result. ‘The Reproducibility Project’ involved over 270 contributors who earned authorship on the summary report, and a further 86 volunteers, who together conducted 100 replications of experimental and correlational studies published in three psychology journals. The Reproducibility Project: Psychology began in November 2011, finished primary data collection in December 2014, and published a summary of the results in August 2015. The project was coordinated by the Center for Open Science.

The project, led by Brian Nosek from the University of Virginia, was the first big systematic attempt to answer questions that have been vexing psychologists for years, if not decades.

What proportion of results in their field are reliable? (More >).


The results of the project have raised alarm in the psychology community.

“Assessing whether the replication and the original experiment yielded the same result according to several criteria, they find that about one-third to one-half of the original findings were also observed in the replication study.” (More >)


Some have described this as a ‘crisis of replicability’. Others have welcomed it as a sign of an introspective revolution that they suspected was long overdue.

“Crisis of replicability is one term that psychological scientists use for the current introspective phase we are in — I argue instead that we are going through a revolution analogous to a political revolution. Revolution 2.0 is an uprising focused on how we should be doing science now (i.e., in a 2.0 world). The precipitating events of the revolution have already been well-documented: failures to replicate, questionable research practices, fraud, etc. And the fact that none of these events is new to our field has also been well-documented.” 

A Short (Personal) Future History of Revolution 2.0


Let’s take a look at a recent replication of the marshmallow test. The original results were based on studies with fewer than 90 children, all enrolled in a preschool on Stanford University’s campus. A replication published in 2018 by Tyler W. Watts, Greg J. Duncan and Haonan Quan used a much larger sample (N=900) that was more representative of the general population in terms of race, ethnicity, and parents’ education (More >).

Essentially, the results of the repeat study offered limited support for the findings of the original research. Rather, the capacity to wait for a second marshmallow (or other larger reward) had more to do with the child’s socio-economic background, which in turn had a bigger influence on outcomes.

Children unable to delay gratification more often came from backgrounds where such patience was not rewarded, i.e. households in which treats were occasional due to financial instability and other factors, so an opportunity passed up might not come around again for some time. More financially secure households, by contrast, were able to promise and deliver on future treats and other opportunities, so such opportunities were likely less rare. There was less reason to fear delaying gratification.

Other research in recent years has also pointed to the class dimension of the marshmallow test. Harvard economist Sendhil Mullainathan and Princeton behavioral scientist Eldar Shafir wrote a 2013 book, Scarcity: Why Having Too Little Means So Much, that detailed how poverty can lead people to opt for short-term rather than long-term rewards. Children from households headed by parents who are better educated and earn more money typically find it easier to delay gratification.

This all highlights a few things.

  1. Don’t always believe what the ‘expert’ says. Even the most charismatic and reputable psychologist, professor or other ‘expert’ may be spinning a yarn based on a flimsy theory or a biased agenda, even if it all sounds plausible, compelling and engaging. Retain a healthy dose of skepticism. Question the evidence in their report, article or presentation. Have a look at the full report, the methodology, the sample size and its representativeness. Check the questionnaire structure and the order of questions for any bias. What hypothesis were they trying to prove, what might they have ignored, and is there any hidden agenda? Where is their confirmation bias? Is the research replicable and representative?
  2. Seek replicability in the simple surveys and more complex research you do. Organisations, academics and others are too often constrained, whether by methodology, naivety, incompetence or finances, from doing robust and replicable research. If it is important to get the research right, take care, use an expert when necessary, and invest the time to do it correctly and accurately.
  3. Correlation does not equate to causality. Just because one variable (e.g. self-control) is correlated with another (e.g. life outcomes) does not mean that one causes the other. There may well be other factors (e.g. socio-economic) that are the true driving variables. Look deeply at the data analysis.
  4. Eat the marshmallow (or other indulgence) if you want. On this evidence, a willingness to be the early bird, eating the chocolate or seizing the day, has little influence on your life outcomes. Other factors, such as socio-economic background, fear of future uncertainty and psychological forces, likely have a much greater influence on your level of self control and that of your children. Self control is more likely environmental.
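
Point 3 in the list above can be made concrete with a toy simulation. The sketch below is an illustration only: it uses entirely made-up data, not data from Mischel's or Watts's studies, and the names `background`, `wait_time` and `outcome` are assumptions chosen for this example. A hidden "background" variable drives both how long a simulated child waits and their later outcome, while waiting itself has no effect on the outcome at all. The two still end up correlated, and the correlation largely disappears once background is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Return (raw, adjusted) correlations for synthetic children.

    Assumption for illustration: 'background' (think socio-economic
    status) feeds into both wait_time and outcome, and wait_time has
    no direct effect on outcome.
    """
    rng = random.Random(seed)
    background = [rng.gauss(0, 1) for _ in range(n)]
    wait_time = [b + rng.gauss(0, 1) for b in background]
    outcome = [b + rng.gauss(0, 1) for b in background]

    # Correlation a naive analysis would report.
    raw = pearson(wait_time, outcome)

    # Correlation after removing the part explained by background.
    adjusted = pearson(residuals(wait_time, background),
                       residuals(outcome, background))
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate()
    print(f"raw correlation:      {raw:.2f}")
    print(f"adjusted correlation: {adjusted:.2f}")
```

With these made-up settings the raw correlation should land clearly above zero, while the adjusted one should sit close to zero. That is the pattern to look for when a third factor is doing the real work, though real data is rarely this tidy.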
Self control: Dan Ariely at TEDxDuke

