How do we decide what to do in any given situation? We need to work out the payoffs and costs of actions – what we will gain and what it might cost us. But what is the exact nature of the mechanism in the brain which weighs up these payoffs and costs?

Rafal Bogacz and Moritz Möller have unified two existing models describing what the basal ganglia learns and how it does it.

We strongly suspect that the mechanism for evaluating payoffs and costs lies in the basal ganglia, a set of structures underneath the cortex involved in the initiation and inhibition of movements. It is thought that the possible payoffs and costs of a particular action are ‘encoded’ here by connections from the rest of the brain to two distinct populations of brain cells (neurons). These neurons are affected by the ‘dopaminergic signal’, which carries information about the motivational state: neurons in the ‘Go’ pathway are excited by dopamine, and neurons in the ‘No-Go’ pathway are inhibited by it. Dopamine thus changes the balance between the pathways and inclines us either to take a particular action or to avoid it.

For example, say the opportunity arises to pick an apple from a tree. Clearly, the payoff would be the nutrients available from eating the apple. The costs would be both the effort of climbing the tree and the risk associated with it. Nutrients are only really deemed valuable, however, if we’re hungry. If we are hungry, the payoff will be weighted more than the costs. If we’re not hungry, the costs will be weighted more than the payoff – which shows how motivation affects our decision-making.
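The apple example can be written out as a toy calculation. The sketch below is purely illustrative and is not taken from the paper: the function name `action_tendency`, the 0-to-1 `motivation` scale standing in for the dopamine level, and the numbers chosen for the payoff and cost are all assumptions made for this example.

```python
def action_tendency(payoff, cost, motivation):
    """Toy weighing of payoff against cost (an illustrative assumption,
    not the paper's equation).

    motivation is a hypothetical 0-to-1 stand-in for the dopamine level:
    high values give more weight to the payoff ('Go'),
    low values give more weight to the cost ('No-Go').
    """
    return motivation * payoff - (1.0 - motivation) * cost


# Hypothetical numbers for the apple: nutrients are worth 1.0, climbing costs 0.6.
hungry = action_tendency(payoff=1.0, cost=0.6, motivation=0.8)
sated = action_tendency(payoff=1.0, cost=0.6, motivation=0.2)

# A positive tendency favours acting; a negative one favours holding back.
print("hungry:", "pick the apple" if hungry > 0 else "leave it")
print("sated:", "pick the apple" if sated > 0 else "leave it")
```

With these made-up numbers, the same apple and the same tree lead to opposite choices depending only on the motivational state.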

But how exactly does the brain work out what the payoffs and costs of particular actions would be? How does it form the two sets of connections (those that form the ‘Go’ and ‘No-Go’ pathways)? This paper suggests that payoffs and costs are learned through reward prediction errors: the difference between what you expected and what you actually got. So, to go back to our example, the first time you see an apple, you may overestimate its value. Next time you see an apple, you will have adjusted your estimate of its value, based on previous experience. Similarly, you tweak your estimate of how much effort it will cost you to get the apple. The surprise you feel when your prediction is not accurate is crucial for triggering learning.
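Learning from prediction errors can be sketched in a few lines. This is a generic error-driven update, assumed here for illustration rather than taken from the paper; the function `update_estimate`, the learning rate and the apple's "true" value of 0.6 are hypothetical.

```python
def update_estimate(estimate, outcome, learning_rate=0.5):
    """Move the estimate part of the way towards the observed outcome.

    The prediction error (the 'surprise') is the gap between what
    happened and what was expected. Generic rule, not the paper's model.
    """
    error = outcome - estimate
    return estimate + learning_rate * error, error


estimate = 1.0    # hypothetical first guess: the apple is overvalued
true_value = 0.6  # hypothetical value actually experienced on every trial
errors = []

for trial in range(10):
    estimate, error = update_estimate(estimate, true_value)
    errors.append(error)
    print(f"trial {trial + 1}: surprise {error:+.3f}, new estimate {estimate:.3f}")
```

Each surprise is smaller than the last, so the estimate settles close to the experienced value and learning fades out once predictions become accurate.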

The weights of the connections to the neurons in the ‘Go’ and ‘No-Go’ pathways are modified differently depending on what the brain has learned from repeated reward prediction errors. Brain plasticity comes into play as the brain learns from what happens in various situations. Without this knowledge, we would have a hard time trying to make good decisions.

This paper also reveals how important it is that the experiences of the payoff and the cost associated with a particular action typically take place at different moments in time. You might find that the effort needed to pick the apple is more than you expected; seconds later, you might discover that the apple tastes even better than you thought it would. The brain receives two distinct pieces of information, and learns about positive and negative consequences by taking advantage of the fact that payoffs and costs seldom happen simultaneously (if they did, they would cancel each other out and there would be no learning).
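The role of timing can be illustrated with a toy simulation. The rule below is a simplified assumption and not the plasticity rule from the paper: positive surprises strengthen a hypothetical `go` weight, negative surprises strengthen a hypothetical `no_go` weight, and both weights slowly decay. The function name, parameters and numbers are all invented for this sketch.

```python
def learn_pathways(events_per_trial, trials=200, learning_rate=0.1, decay=0.05):
    """Toy 'Go'/'No-Go' learner (illustrative assumption, not the paper's rule).

    events_per_trial lists the outcomes felt during one trial, in order.
    Each outcome is compared with the current prediction (go - no_go).
    """
    go, no_go = 0.0, 0.0
    for _ in range(trials):
        for outcome in events_per_trial:
            error = outcome - (go - no_go)
            if error > 0:
                go += learning_rate * error       # better than expected
            else:
                no_go += learning_rate * -error   # worse than expected
            # Both weights slowly fade, so they stay bounded.
            go *= 1.0 - decay
            no_go *= 1.0 - decay
    return go, no_go


payoff, cost = 1.0, 0.6  # hypothetical values for the apple and the climb

# Cost felt first (the climb), payoff felt afterwards (the taste).
separate = learn_pathways([-cost, payoff])
# Cost and payoff felt at the same instant: only their net value arrives.
together = learn_pathways([payoff - cost])

print("separate: go=%.2f, no_go=%.2f" % separate)
print("together: go=%.2f, no_go=%.2f" % together)
```

In this toy version the ‘No-Go’ weight only builds up when the cost is felt on its own. When the cost and payoff arrive together, only their net value is registered and the cost leaves no separate trace.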

Rather than proposing a new model to describe how the basal ganglia ‘learns’, this paper unifies two existing models describing what the basal ganglia learns and how it does it. The first model suggests that the basal ganglia acquires reasons to approach, and reasons to avoid, frequently arising opportunities. The payoffs and costs are weighed up according to motivational state. The second model sets out plasticity rules for the basal ganglia: how the learning takes place, biologically. But how does the brain then make use of what is learned? Our new research puts these two models together, providing a flexible model of learning and decision-making which takes into account the motivational state as well as the learned representations of the payoffs and costs.

This article by Jacqueline Pumphrey is an accessible summary of Möller & Bogacz 2019.
