How do we decide what to do in any given situation? We need to work out the payoffs and costs of actions – what we will gain and what it might cost us. But what is the exact nature of the mechanism in the brain which weighs up these payoffs and costs?

Rafal Bogacz and Moritz Möller have unified two existing models describing what the basal ganglia learns and how it does it.

We strongly suspect that the mechanism for evaluating the payoffs and costs lies in the basal ganglia, a set of structures underneath the cortex involved in the initiation and inhibition of movements. It is thought that the possible payoffs and costs of a particular action are ‘encoded’ here by connections from the rest of the brain to two distinct populations of brain cells (neurons). These neurons are affected by the ‘dopaminergic signal’, which carries information about the motivational state. Neurons in the ‘Go’ pathway are excited by dopamine, while neurons in the ‘No-Go’ pathway are inhibited by it. Dopamine therefore changes the balance between the pathways and inclines us either to take a particular action or to avoid it.

For example, say the opportunity arises to pick an apple from a tree. Clearly, the payoff would be the nutrients available from eating the apple. The costs would be both the effort of climbing the tree and the risk associated with it. Nutrients are only really deemed valuable, however, if we’re hungry. If we are hungry, the payoff will be weighted more than the costs. If we’re not hungry, the costs will be weighted more than the payoff – which shows how motivation affects our decision-making.
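The apple example can be put into numbers with a small sketch. It assumes, purely for illustration, that the dopamine level is a value between 0 and 1 that scales the learned payoff up and the learned cost down; the function names, the weighting formula and the numbers are hypothetical and are not taken from the paper.

```python
def action_tendency(payoff, cost, dopamine):
    """Net drive to act, under an assumed linear weighting.

    dopamine is an illustrative motivation level in [0, 1]:
    high values emphasise the payoff ('Go'), low values the cost ('No-Go').
    """
    if not 0.0 <= dopamine <= 1.0:
        raise ValueError("dopamine must lie between 0 and 1")
    return dopamine * payoff - (1.0 - dopamine) * cost


def should_act(payoff, cost, dopamine):
    """Act only when the weighted payoff outweighs the weighted cost."""
    return action_tendency(payoff, cost, dopamine) > 0


# Hypothetical values: nutrients from the apple vs. effort and risk of climbing.
apple_payoff = 4.0
climb_cost = 3.0

hungry = should_act(apple_payoff, climb_cost, dopamine=0.8)  # motivated
sated = should_act(apple_payoff, climb_cost, dopamine=0.2)   # not motivated
print(hungry, sated)
```

With these made-up numbers the same apple is worth climbing for when motivation is high and not when it is low, even though the learned payoff and cost have not changed.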

But how exactly does the brain work out what the payoffs and costs of particular actions would be? How does it form the two sets of connections (those that form the ‘Go’ and ‘No-Go’ pathways)? This paper suggests that payoffs and costs are learned through reward prediction errors. So, to go back to our example, the first time you see an apple, you may overestimate its value. Next time you see an apple, you will have adjusted your estimate of its value, based on previous experience. Similarly, you tweak your estimate of how much effort it will cost you to get the apple. The surprise you feel when your prediction is not accurate is crucial for triggering learning.
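Learning from a prediction error can be sketched with a generic delta rule, in which the estimate moves a fraction of the way towards each observed outcome. This is a textbook-style illustration rather than the specific rule used in the paper; `update_estimate`, the learning rate and the apple values are assumptions made for the example.

```python
def update_estimate(estimate, outcome, learning_rate=0.5):
    """Return the revised estimate and the prediction error that drove it."""
    prediction_error = outcome - estimate  # the 'surprise'
    return estimate + learning_rate * prediction_error, prediction_error


estimate = 10.0    # first impression: the apple is overvalued
true_value = 4.0   # hypothetical value actually experienced on each encounter

history = []
for _ in range(5):
    estimate, error = update_estimate(estimate, true_value)
    history.append((estimate, error))

for value, error in history:
    print(f"estimate={value:.4f}  prediction_error={error:.4f}")
```

Each encounter shrinks the surprise, so the estimate settles towards the experienced value and learning slows down as predictions become accurate.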

The weights of the connections to the neurons in the ‘Go’ and ‘No-Go’ pathways are modified differently depending on what the brain has learned from repeated reward prediction errors. Brain plasticity comes into play as the brain learns from what happens in various situations. Without this knowledge, we would have a hard time trying to make good decisions.

This paper also reveals how important it is that the experiences of the payoff and cost associated with a particular action typically take place at different moments in time. You might find that the effort needed to pick the apple is more than you expected; seconds later, you might discover that the apple tastes even better than you thought it would. The brain is receiving two distinct pieces of information, and learns about positive and negative consequences by taking advantage of the fact that payoffs and costs seldom happen simultaneously (if they did, they would cancel each other out and there would be no learning). 
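Why timing matters can be shown with a toy learner. It assumes a deliberately simplified rule in which positive outcomes train a ‘Go’ weight and negative outcomes train a ‘No-Go’ weight; this is not the plasticity rule from the paper, and the function name, learning rate and outcome values are hypothetical.

```python
def learn(outcomes_per_trial, trials=200, learning_rate=0.1):
    """Train toy Go / No-Go weights from the outcomes felt during each trial.

    Positive outcomes move the Go weight towards the payoff size;
    negative outcomes move the No-Go weight towards the cost size.
    """
    go, no_go = 0.0, 0.0
    for _ in range(trials):
        for outcome in outcomes_per_trial:
            if outcome > 0:
                go += learning_rate * (outcome - go)
            elif outcome < 0:
                no_go += learning_rate * (-outcome - no_go)
    return go, no_go


# Effort (-3) is felt first, the taste of the apple (+4) seconds later.
go_sep, no_go_sep = learn([-3.0, 4.0])

# If both arrived at the same instant, only their sum (+1) would be felt.
go_sim, no_go_sim = learn([-3.0 + 4.0])

print(round(go_sep, 3), round(no_go_sep, 3))
print(round(go_sim, 3), round(no_go_sim, 3))
```

In this sketch, separated experiences let the learner recover both the payoff and the cost, whereas a simultaneous experience leaves it with only the net outcome; if payoff and cost were equal, nothing would be learned at all.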

Rather than proposing a new model to describe how the basal ganglia ‘learns’, this paper unifies two existing models describing what the basal ganglia learns and how it does it. The first model suggests that the basal ganglia acquires reasons to approach, and reasons to avoid, frequently arising opportunities. The payoffs and costs are weighed up according to motivational state. The second model sets out plasticity rules for the basal ganglia: how the learning takes place, biologically. But how does the brain then make use of what is learned? Our new research puts these two models and theories together, providing a flexible model of learning and decision-making which takes into account the motivational state as well as the learned representations of the payoffs and costs.

This article by Jacqueline Pumphrey is an accessible summary of Möller & Bogacz 2019.
