How do we decide what to do in any given situation? We need to work out the payoffs and costs of actions – what we will gain and what it might cost us. But what is the exact nature of the mechanism in the brain which weighs up these payoffs and costs?

Rafal Bogacz and Moritz Möller have unified two existing models describing what the basal ganglia learns and how it does it.

We strongly suspect that the mechanism for evaluating the payoffs and costs must lie in the basal ganglia, a set of structures underneath the cortex that are involved in the initiation and inhibition of movements. It is thought that the possible payoffs and costs of a particular action are ‘encoded’ here by connections from the rest of the brain to two distinct populations of brain cells (neurons). These neurons are affected by the ‘dopaminergic signal’, which carries information about the motivational state: the neurons in the ‘Go’ pathway are excited by dopamine, and the neurons in the ‘No-Go’ pathway are inhibited by it. Dopamine therefore changes the balance between the pathways and inclines us either to take a particular action or to avoid it.

For example, say the opportunity arises to pick an apple from a tree. Clearly, the payoff would be the nutrients available from eating the apple. The costs would be both the effort of climbing the tree and the risk associated with it. Nutrients are only really deemed valuable, however, if we’re hungry. If we are hungry, the payoff will be weighted more than the costs. If we’re not hungry, the costs will be weighted more than the payoff – which shows how motivation affects our decision-making.
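The apple example can be put into numbers. The following is only a minimal sketch, not the model from the paper: the function name `action_tendency`, the 0-to-1 `motivation` value standing in for the dopaminergic signal, and the payoff and cost units are all assumptions made for illustration.

```python
def action_tendency(payoff, cost, motivation):
    """Return a net tendency to act: positive favours 'Go', negative 'No-Go'.

    motivation is a hypothetical 0-to-1 stand-in for the dopaminergic signal:
    high values amplify the payoff ('Go') and damp the cost ('No-Go').
    """
    go = motivation * payoff            # 'Go' pathway, excited by dopamine
    no_go = (1.0 - motivation) * cost   # 'No-Go' pathway, inhibited by dopamine
    return go - no_go


apple_payoff = 1.0   # nutrients (arbitrary units, assumed)
apple_cost = 1.0     # effort and risk of climbing (arbitrary units, assumed)

hungry = action_tendency(apple_payoff, apple_cost, motivation=0.8)
sated = action_tendency(apple_payoff, apple_cost, motivation=0.2)

print("hungry:", "pick the apple" if hungry > 0 else "leave it")
print("sated: ", "pick the apple" if sated > 0 else "leave it")
```

The payoff and cost of the apple are identical in both cases. Only the motivational state changes, and that alone is enough to flip the decision.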

But how exactly does the brain work out what the payoffs and costs of particular actions would be? How does it form the two sets of connections (those that form the ‘Go’ and ‘No-Go’ pathways)? This paper suggests that payoffs and costs are learned through reward prediction errors, the differences between what we expected and what we actually got. So, to go back to our example, the first time you see an apple, you may overestimate its value. Next time you see an apple, you will have adjusted your estimate of its value, based on previous experience. Similarly, you tweak your estimate of how much effort it will cost you to get the apple. The surprise you feel when your prediction is not accurate is crucial for triggering learning.
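Learning from prediction errors can be illustrated with a generic "delta rule", in which an estimate is nudged towards each new outcome in proportion to the surprise. This is a textbook-style sketch rather than the plasticity rule from the paper; the names `update_estimate` and `learning_rate`, and the numbers used, are assumptions.

```python
def update_estimate(estimate, outcome, learning_rate=0.2):
    """Move the estimate a fraction of the way towards the observed outcome."""
    prediction_error = outcome - estimate   # the 'surprise'
    return estimate + learning_rate * prediction_error


estimate = 1.0     # first impression: the apple is overvalued (assumed number)
true_value = 0.6   # what eating the apple actually delivers (assumed number)

history = [estimate]
for _ in range(50):
    estimate = update_estimate(estimate, true_value)
    history.append(estimate)

print("estimate after 1 apple:  ", round(history[1], 3))
print("estimate after 50 apples:", round(history[-1], 3))
```

Once the estimate matches the outcome, the prediction error is zero and the estimate stops changing: no surprise, no learning.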

The weights of the connections to the neurons in the ‘Go’ and ‘No-Go’ pathways are modified differently depending on the reward prediction errors the brain has experienced. Brain plasticity comes into play as the brain learns from what happens in various situations. Without this knowledge, we would have a hard time trying to make good decisions.

This paper also reveals how important it is that the experiences of the payoff and cost associated with a particular action typically take place at different moments in time. You might find that the effort needed to pick the apple is more than you expected; seconds later, you might discover that the apple tastes even better than you thought it would. The brain is receiving two distinct pieces of information, and learns about positive and negative consequences by taking advantage of the fact that payoffs and costs seldom happen simultaneously (if they did, they would cancel each other out and there would be no learning). 
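Why the timing matters can be shown with a deliberately simplified toy rule, which is an assumption for illustration and not the plasticity rule from the paper: pleasant surprises train a hypothetical `go` weight, unpleasant ones train a hypothetical `no_go` weight. When the cost and the payoff arrive separately, each weight learns its own quantity; when they arrive together, only their sum is seen.

```python
def learn(outcomes_per_trial, n_trials=200, learning_rate=0.1):
    """Learn 'Go' and 'No-Go' weights from the outcomes felt in each trial.

    Positive outcomes update the Go weight, negative outcomes update the
    No-Go weight, and a zero outcome triggers no update (toy assumption).
    """
    go, no_go = 0.0, 0.0
    for _ in range(n_trials):
        for outcome in outcomes_per_trial:
            if outcome > 0:
                go += learning_rate * (outcome - go)
            elif outcome < 0:
                no_go += learning_rate * (-outcome - no_go)
    return go, no_go


payoff, cost = 1.0, 0.6   # arbitrary units, assumed

# Effort is felt first, the taste of the apple seconds later.
separate = learn([-cost, payoff])
# Payoff and cost arrive at the same moment, so only the net outcome is felt.
together = learn([payoff - cost])
# Equal payoff and cost arriving together cancel out completely.
cancelled = learn([1.0 - 1.0])

print("separate: ", separate)
print("together: ", together)
print("cancelled:", cancelled)
```

In the `separate` case the two weights settle near the payoff and the cost respectively. In the `together` case the information about the individual payoff and cost is lost, and in the `cancelled` case nothing is learned at all.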

Rather than proposing a new model to describe how the basal ganglia ‘learns’, this paper unifies two existing models describing what the basal ganglia learns and how it does it. The first model suggests that the basal ganglia acquires reasons to approach, and reasons to avoid, frequently arising opportunities. The payoffs and costs are weighed up according to motivational state. The second model sets out plasticity rules for the basal ganglia: how the learning takes place, biologically. But how does the brain then make use of what is learned? Our new research puts these two models together, providing a flexible model of learning and decision-making which takes into account the motivational state as well as the learned representations of the payoffs and costs.

This article by Jacqueline Pumphrey is an accessible summary of Möller & Bogacz 2019.
