The hidden agent of decision making

How the causal structure of the environment influences learning across development

Learning from positive and negative experiences helps us achieve our goals throughout our lives. What we learn from our experiences might change depending on how much control we think we have in our learning environment. For example, a student who finds herself constantly failing pop quizzes in class might believe that her bad grades are due to her poor study habits, leading her to study harder to improve her grades. Alternatively, the student might believe that her grades are due to a harsh teacher, and that no matter how hard she studies, she will not be able to improve her quiz scores. Believing that her poor grades are outside her control may lead her to stick to her current study habits, rather than altering them in response to negative feedback.

Recent work has shown that adults adjust how they learn from positive and negative outcomes based on how much control they think they have, taking into account how likely it is that their actions causally influenced the outcomes they experienced. But what about kids and teenagers? Can they also use their beliefs about the causal structure of the environment to adjust their learning?

To address this question, we had 90 participants between the ages of 7 and 25 complete a learning task that was developed and previously used by our colleagues in a sample of adults. Participants were told they were miners digging for gold in the Wild West. On every trial, participants were shown two different gold mines and had to select the one where they wanted to dig for gold. One mine had more gold than the other: digging at this ‘good’ mine would be more likely to yield gold, whereas digging at the other mine would be more likely to yield rocks. Based on the outcomes of their choices, participants could learn which mine was more likely to give them gold.

Critically, participants mined for gold in three different territories, which were inhabited by three different hidden agents. In millionaire territory, a nice millionaire would sometimes replace all the rocks in both mines with gold. If the millionaire intervened on a given trial, the participant would always receive gold, regardless of where they chose to dig. In robber territory, a mean robber would sometimes replace all the gold in the mines with rocks, meaning participants would receive rocks regardless of their choice. And in sheriff territory, a sneaky sheriff mixed around the gold and the rocks, meaning participants would have an equal probability of receiving gold or rocks, regardless of their choice of mine. Importantly, while participants were told which territory they were in, they were not told whether a hidden agent intervened on any given trial.
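The trial structure described above can be sketched as a small simulation. This is only an illustration, not the code used in the study: the probabilities `P_GOLD_GOOD`, `P_GOLD_BAD`, and `P_INTERVENE` are hypothetical values chosen for the example (the article does not report them), and the function names are our own.

```python
import random

# Hypothetical parameters: the article does not report these values.
P_GOLD_GOOD = 0.7   # assumed chance of gold at the 'good' mine
P_GOLD_BAD = 0.3    # assumed chance of gold at the other mine
P_INTERVENE = 0.3   # assumed chance the hidden agent intervenes on a trial

TERRITORIES = ("millionaire", "robber", "sheriff")


def dig(territory, chose_good_mine, rng):
    """Simulate one trial and return 1 for gold or 0 for rocks."""
    if territory not in TERRITORIES:
        raise ValueError(f"unknown territory: {territory}")
    if rng.random() < P_INTERVENE:
        # The hidden agent intervenes, so the choice of mine no longer matters.
        if territory == "millionaire":
            return 1                        # always gold
        if territory == "robber":
            return 0                        # always rocks
        return int(rng.random() < 0.5)      # sheriff: gold or rocks equally likely
    # No intervention: the outcome depends on the chosen mine.
    p_gold = P_GOLD_GOOD if chose_good_mine else P_GOLD_BAD
    return int(rng.random() < p_gold)


def gold_rate(territory, chose_good_mine, n_trials=20000, seed=0):
    """Estimate how often a fixed choice yields gold in a territory."""
    rng = random.Random(seed)
    gold = sum(dig(territory, chose_good_mine, rng) for _ in range(n_trials))
    return gold / n_trials


if __name__ == "__main__":
    for territory in TERRITORIES:
        good = gold_rate(territory, True)
        bad = gold_rate(territory, False)
        print(f"{territory}: good mine {good:.2f}, other mine {bad:.2f}")
```

Under these assumed values, the good mine remains the better choice in every territory. What changes is how informative each outcome is: gold in millionaire territory and rocks in robber territory may have come from the hidden agent rather than from the mine.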

The learning task was set in three different territories, each inhabited by a different hidden agent. Elements of this image were designed by Freepik and are licensed for personal use. Originally published in the research article.

Manipulating the structure of each environment enabled us to examine whether participants’ causal beliefs influenced how they learned from positive and negative feedback. Returning to our example, the robber in our task is akin to a mean teacher: if a learner believes the negative feedback they receive is caused by someone else rather than by their own choices, then rationally, they should not use this feedback to learn which choice to make. Instead, a learner in robber territory should learn more from positive outcomes, which can only be attributed to their own actions.

We compared several computational models of learning to examine how learning changed across development. The choices of adults and adolescents were best fit by learning models with a causal inference component that rationally discounted outcomes that could be attributed to a hidden agent. Though kids demonstrated explicit awareness of the causal structure of the task, their choices were best captured by a simpler learning model that did not take into account the structure of the environment.
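To make the idea of rationally discounting outcomes concrete, here is a minimal sketch of a learner that scales each update by the probability that the outcome came from the mine rather than from the hidden agent. It is a simplified illustration and not one of the models fitted in the paper: the intervention rate `P_INTERVENE`, the learning rate, and the function names are all assumptions made for the example.

```python
# Hypothetical parameter: the article does not report the intervention rate.
P_INTERVENE = 0.3


def p_agent_caused(territory, outcome, value):
    """Posterior probability that the hidden agent produced this outcome.

    outcome is 1 for gold and 0 for rocks; value is the learner's current
    estimate of the probability that the chosen mine yields gold.
    """
    # Likelihood of the outcome if the hidden agent intervened.
    if territory == "millionaire":
        like_agent = 1.0 if outcome == 1 else 0.0
    elif territory == "robber":
        like_agent = 1.0 if outcome == 0 else 0.0
    elif territory == "sheriff":
        like_agent = 0.5
    else:
        raise ValueError(f"unknown territory: {territory}")
    # Likelihood of the outcome if the mine itself produced it.
    like_mine = value if outcome == 1 else 1.0 - value
    # Bayes' rule over the two possible causes.
    numerator = P_INTERVENE * like_agent
    denominator = numerator + (1.0 - P_INTERVENE) * like_mine
    return numerator / denominator if denominator > 0 else 0.0


def update(value, outcome, territory, learning_rate=0.2,
           use_causal_inference=True):
    """Return the updated value estimate after one outcome."""
    weight = 1.0
    if use_causal_inference:
        # Learn less from outcomes the hidden agent probably caused.
        weight = 1.0 - p_agent_caused(territory, outcome, value)
    return value + learning_rate * weight * (outcome - value)


if __name__ == "__main__":
    start = 0.5
    print("robber, gold :", update(start, 1, "robber"))
    print("robber, rocks:", update(start, 0, "robber"))
    print("no inference, rocks:",
          update(start, 0, "robber", use_causal_inference=False))
```

In this sketch, gold in robber territory is never attributed to the robber, so it is learned from at the full rate, whereas rocks are partly discounted. Setting `use_causal_inference=False` gives a learner that treats every outcome the same way in every territory, loosely analogous to the simpler model described above.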

Our findings suggested that with increasing age, individuals begin to use their beliefs about the causal structure of the environment to learn how to make good choices. In addition, they revealed a dissociation between children’s understanding of the extent of their control over the environment and their use of that information to guide learning. Further research into how learning mechanisms change with age may inform strategies that can be used to promote the best learning outcomes at different stages of development.

To learn more, read our free and open access article: The rational use of causal inference to guide reinforcement learning strengthens with age, published by npj Science of Learning.

Blog article written by Alexandra Cohen and Kate Nussenbaum.  

Alexandra Cohen

Postdoctoral Researcher, New York University
