How do we decide what to do in any given situation? We need to work out the payoffs and costs of actions – what we will gain and what it might cost us. But what is the exact nature of the mechanism in the brain which weighs up these payoffs and costs?

Rafal Bogacz and Moritz Möller have unified two existing models describing what the basal ganglia learns and how it does it.

We strongly suspect that the mechanism for evaluating the payoffs and costs must lie in the basal ganglia, a set of structures underneath the cortex that are involved in the initiation and inhibition of movements. It is thought that the possible payoffs and costs of a particular action are ‘encoded’ here by connections from the rest of the brain to two distinct populations of brain cells (neurons). These neurons are affected by the ‘dopaminergic signal’, which carries information about the motivational state: the neurons in the ‘Go’ pathway are excited by dopamine, and the neurons in the ‘No-Go’ pathway are inhibited by it. It is dopamine which changes the balance between the pathways and inclines us either to take a particular action or to avoid it.

For example, say the opportunity arises to pick an apple from a tree. Clearly, the payoff would be the nutrients available from eating the apple. The costs would be both the effort of climbing the tree and the risk associated with it. Nutrients are only really deemed valuable, however, if we’re hungry. If we are hungry, the payoff will be weighted more than the costs. If we’re not hungry, the costs will be weighted more than the payoff – which shows how motivation affects our decision-making.
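The apple example can be put as a toy calculation. The sketch below is an assumption made for illustration, not the model from the paper: it treats the dopamine level as a number between 0 and 1 that scales up the learned payoff (the ‘Go’ side) and scales down the learned cost (the ‘No-Go’ side). The function names and all the numbers are hypothetical.

```python
def action_drive(payoff, cost, dopamine):
    """Toy net tendency to act (illustrative assumption, not the paper's equation).

    payoff   -- learned payoff of the action ('Go' side)
    cost     -- learned cost of the action ('No-Go' side)
    dopamine -- motivational state, from 0.0 (not hungry) to 1.0 (very hungry)
    """
    if not 0.0 <= dopamine <= 1.0:
        raise ValueError("dopamine must lie between 0 and 1")
    # Dopamine boosts the influence of the payoff and damps the influence of the cost.
    return dopamine * payoff - (1.0 - dopamine) * cost


def should_act(payoff, cost, dopamine):
    """Act only when the weighted payoff outweighs the weighted cost."""
    return action_drive(payoff, cost, dopamine) > 0


# Same apple, same tree: only the motivational state differs.
hungry = should_act(payoff=1.0, cost=0.8, dopamine=0.7)
not_hungry = should_act(payoff=1.0, cost=0.8, dopamine=0.3)
```

With these made-up numbers the hungry case favours picking the apple and the not-hungry case does not, even though the learned payoff and cost are identical in both.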

But how exactly does the brain work out what the payoffs and costs of particular actions would be? How does it form the two sets of connections (those that form the ‘Go’ and ‘No-Go’ pathways)? This paper suggests that payoffs and costs are learned through reward prediction errors. So, to go back to our example, the first time you see an apple, you may overestimate its value. Next time you see an apple, you will have adjusted your estimation of its value, based on previous experience. Similarly, you tweak your estimation of how much effort it will cost you to get the apple. The surprise you feel when your prediction is not accurate is crucial for triggering learning.
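Learning from a prediction error can be sketched with a generic textbook update rule, in which the estimate moves a fraction of the way towards what actually happened. This is a simplified illustration, assuming a fixed learning rate and a constant outcome; it is not the specific plasticity rule described in the paper, and the names and values are hypothetical.

```python
def update_estimate(estimate, outcome, learning_rate=0.5):
    """Move the estimate towards the outcome in proportion to the surprise."""
    prediction_error = outcome - estimate  # negative when we overestimated
    new_estimate = estimate + learning_rate * prediction_error
    return new_estimate, prediction_error


estimate = 2.0  # first guess: the apple is overestimated
errors = []
for _ in range(5):
    # Each encounter with the apple yields the same true value of 1.0.
    estimate, error = update_estimate(estimate, outcome=1.0)
    errors.append(error)
```

In this sketch the size of the prediction error halves with each encounter, so the estimate settles close to the true value and the surprise fades.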

The weights of the neurons in the ‘Go’ and ‘No-Go’ pathways are modified differently depending on what the brain has learned from repeated reward prediction. Brain plasticity comes into play as the brain learns from what happens in various situations. Without this knowledge, we would have a hard time trying to make good decisions.

This paper also reveals how important it is that the experiences of the payoff and cost associated with a particular action typically take place at different moments in time. You might find that the effort needed to pick the apple is more than you expected; seconds later, you might discover that the apple tastes even better than you thought it would. The brain is receiving two distinct pieces of information, and learns about positive and negative consequences by taking advantage of the fact that payoffs and costs seldom happen simultaneously (if they did, they would cancel each other out and there would be no learning). 
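The role of timing can be illustrated with a deliberately simplified sketch. It assumes, purely for illustration, that pleasant surprises update a payoff estimate and unpleasant ones update a cost estimate; the paper's actual learning rules are more detailed. The function name, learning rate and outcome values are all hypothetical.

```python
def learn(outcomes_per_trial, trials=200, learning_rate=0.1):
    """Learn separate payoff and cost estimates from a repeated sequence of outcomes.

    outcomes_per_trial -- outcomes experienced in order within one trial;
                          positive values are payoffs, negative values are costs.
    Returns (payoff_estimate, cost_estimate).
    """
    payoff_estimate = 0.0
    cost_estimate = 0.0
    for _ in range(trials):
        for outcome in outcomes_per_trial:
            if outcome > 0:
                # A positive outcome teaches the payoff estimate.
                payoff_estimate += learning_rate * (outcome - payoff_estimate)
            elif outcome < 0:
                # A negative outcome teaches the cost estimate.
                cost_estimate += learning_rate * (-outcome - cost_estimate)
            # An outcome of exactly zero carries no surprise and teaches nothing.
    return payoff_estimate, cost_estimate


# Effort of climbing (-1.0) is felt first, the taste of the apple (+1.0) later.
separate = learn([-1.0, 1.0])

# The same effort and taste arriving at the same moment sum to zero.
together = learn([-1.0 + 1.0])
```

When the two experiences are separated in time, both estimates approach 1.0. When they arrive together, they cancel and neither estimate moves, which mirrors the point made above.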

Rather than proposing a new model to describe how the basal ganglia ‘learns’, this paper unifies two existing models describing what the basal ganglia learns and how it does it. The first model suggests that the basal ganglia acquires reasons to approach, and reasons to avoid, frequently arising opportunities. The payoffs and costs are weighed up according to motivational state. The second model sets out plasticity rules for the basal ganglia: how the learning takes place, biologically. But how does the brain then make use of what is learned? Our new research puts these two models and theories together, providing a flexible model of learning and decision-making which takes into account the motivational state as well as the learned representations of the payoffs and costs.

This article by Jacqueline Pumphrey is an accessible summary of Möller & Bogacz 2019.
