{ "items": [ "\n\n
Training with backpropagation (BP) in standard deep learning consists of two main steps: a forward pass that maps a data point to its prediction, and a backward pass that propagates the error of this prediction back through the network. This process is highly effective when the goal is to minimize a specific objective function. However, it does not allow training on networks with cyclic or backward connections. This is an obstacle to reaching brain-like capabilities, as the highly complex heterarchical structure of the neural connections in the neocortex is potentially fundamental for its effectiveness. In this paper, we show how predictive coding (PC), a theory of information processing in the cortex, can be used to perform inference and learning on arbitrary graph topologies. We experimentally show how this formulation, called PC graphs, can be used to flexibly perform different tasks with the same network by simply stimulating specific neurons. This enables the model to be queried on stimuli with different structures, such as partial images, images with labels, or images without labels. We conclude by investigating how the topology of the graph influences the final performance, and by comparing against simple baselines trained with BP.
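To make the idea concrete, here is a minimal sketch of inference in a predictive coding graph, based on the standard PC formulation; the function name, energy, and update rule are illustrative assumptions, not the paper's exact implementation. Nodes clamped to a stimulus stay fixed while the rest settle by descending the sum of squared prediction errors, and nothing requires the weight matrix to be acyclic:

```python
import numpy as np

def pc_graph_settle(W, x, clamped, lr=0.1, steps=200):
    """Relax a predictive-coding graph toward an energy minimum.

    W[i, j] is the weight of edge j -> i (zero where no edge exists), so
    cyclic and backward connections are allowed. x holds every node's
    value and `clamped` is a boolean mask of nodes fixed to the stimulus.
    Unclamped nodes descend E = 0.5 * sum_i (x_i - mu_i)^2, where
    mu_i = sum_j W[i, j] * tanh(x_j) is node i's prediction.
    """
    for _ in range(steps):
        fx = np.tanh(x)
        eps = x - W @ fx                       # per-node prediction errors
        # dE/dx_i = eps_i - tanh'(x_i) * sum_k W[k, i] * eps_k
        grad = eps - (1.0 - fx ** 2) * (W.T @ eps)
        x = np.where(clamped, x, x - lr * grad)
    return x, eps
```

In this sketch, querying the same network with a partial image, an image plus a label, or a label alone amounts to choosing which entries of `clamped` are true; learning would additionally descend the same energy with respect to W.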
A large number of neural network models of associative memory have been proposed in the literature. These include the classical Hopfield networks (HNs), sparse distributed memories (SDMs), and more recently the modern continuous Hopfield networks (MCHNs), which possess close links with self-attention in machine learning. In this paper, we propose a general framework for understanding the operation of such memory networks as a sequence of three operations: similarity, separation, and projection. We derive all these memory models as instances of our general framework with differing similarity and separation functions. We extend the mathematical framework of Krotov & Hopfield (2020) to express general associative memory models using neural network dynamics with local computation, and derive a general energy function that is a Lyapunov function of the dynamics. Finally, using our framework, we empirically investigate the capacity of these associative memory models under different similarity functions, beyond the dot product similarity measure, and demonstrate empirically that Euclidean or Manhattan distance similarity metrics perform substantially better in practice on many tasks, enabling more robust retrieval and higher memory capacity than existing models.
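As a concrete illustration of the three-operation view, here is a minimal retrieval step; the function names and the inverse-temperature parameter `beta` are assumptions for illustration, not the paper's code. With dot-product similarity and softmax separation the update matches a modern continuous Hopfield network, and swapping in negative Euclidean distance gives one of the alternative similarity functions the paper investigates:

```python
import numpy as np

def retrieve(memories, query, similarity, separation):
    """One retrieval step in the similarity-separation-projection framework."""
    scores = similarity(memories, query)   # similarity: one score per stored pattern
    weights = separation(scores)           # separation: sharpen scores toward a winner
    return weights @ memories              # projection: map back to pattern space

def softmax(s, beta=8.0):
    e = np.exp(beta * (s - s.max()))       # subtract max for numerical stability
    return e / e.sum()

dot_sim = lambda M, q: M @ q                               # MCHN-style similarity
euclid_sim = lambda M, q: -np.linalg.norm(M - q, axis=1)   # alternative metric

M = np.random.randn(16, 64)                # 16 stored patterns of dimension 64
noisy = M[0] + 0.3 * np.random.randn(64)   # corrupted probe of the first pattern
recalled = retrieve(M, noisy, euclid_sim, softmax)
```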
Essential tremor manifests predominantly as a tremor of the upper limbs. One therapy option is high-frequency deep brain stimulation, which continuously delivers electrical stimulation to the ventral intermediate nucleus of the thalamus at about 130 Hz. Investigators have been looking at stimulating less, chiefly to reduce side effects. One strategy, phase-locked deep brain stimulation, consists of stimulating according to the phase of the tremor, once per period. In this study, we aim to reproduce the phase-dependent effects of stimulation seen in patient data with a biologically inspired Wilson-Cowan model. To this end, we first analyse patient data, and conclude that half of the datasets have response curves that are better described by sinusoidal curves than by straight lines, while an effect of phase cannot be consistently identified in the remaining half. Using the Hilbert phase, we derive analytical expressions for phase and amplitude responses to phase-dependent stimulation and study their relationship in the linearisation of a stable focus model, a simplification of the Wilson-Cowan model in the stable focus regime. The analytical results provide a good approximation for the response curves observed in the patients with consistently significant responses. Additionally, we fitted the full non-linear Wilson-Cowan model to these patients, and we show that the model can, in each case, fit both the dynamics of patient tremor and the phase response curve, with the best fits found to be stable foci for each patient (a tied best fit in one instance). The model provides satisfactory predictions of how patient tremor will react to phase-locked stimulation, predicting patient amplitude response curves even though they were not explicitly fitted. This can be partially explained by the relationship between the response curves in the model being compatible with what is found in the data. We also note that the non-linear Wilson-Cowan model is able to describe the response to stimulation more precisely than the linearisation.
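For reference, a standard form of the Wilson-Cowan equations for coupled excitatory (E) and inhibitory (I) populations; the symbols below are generic placeholders rather than the paper's fitted parameter values:

\[ \tau_E \dot{E} = -E + S\!\left(c_{EE}E - c_{EI}I + P_E(t)\right), \qquad \tau_I \dot{I} = -I + S\!\left(c_{IE}E - c_{II}I + P_I(t)\right), \]

where $S$ is a sigmoidal firing-rate function, the $c$ terms are coupling strengths, and the $P$ terms collect external inputs, including stimulation. A stable focus is a fixed point of this system whose linearisation has complex eigenvalues with negative real part, so perturbations decay as damped oscillations, which is why phase and amplitude responses can be studied in the linearised model.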
In many perceptual and cognitive decision-making problems, humans sample multiple noisy information sources serially, and integrate the sampled information to make an overall decision. We derive the optimal decision procedure for two-alternative choice tasks in which the different options are sampled one at a time, sources vary in the quality of the information they provide, and the available time is fixed. To maximize accuracy, the optimal observer allocates time to sampling different information sources in proportion to their noise levels. We tested human observers in a corresponding perceptual decision-making task. Observers compared the direction of two random dot motion patterns that were triggered only when fixated. Observers allocated more time to the noisier pattern, in a manner that correlated with their sensory uncertainty about the direction of the patterns. There were several differences between the optimal observer predictions and human behaviour. These differences point to a number of other factors, beyond the quality of the currently available sources of information, that influence the sampling strategy.
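One way to see where an allocation rule of this form can come from, under the simplifying assumption that sampling source $i$ for time $t_i$ yields an estimate with variance $\sigma_i^2 / t_i$ (this toy derivation is ours, not quoted from the paper):

\[ \min_{t_1 + t_2 = T} \left( \frac{\sigma_1^2}{t_1} + \frac{\sigma_2^2}{t_2} \right) \;\implies\; \frac{\sigma_1^2}{t_1^2} = \frac{\sigma_2^2}{t_2^2} \;\implies\; t_i = T\,\frac{\sigma_i}{\sigma_1 + \sigma_2}, \]

so the time devoted to each source grows linearly with its noise level $\sigma_i$, matching the qualitative prediction that the noisier pattern deserves more viewing time.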
We introduce a new approach to modelling decision confidence, with the aim of enabling computationally cheap predictions while taking into account, and thereby exploiting, trial-by-trial variability in stochastically fluctuating stimuli. Using the framework of the drift diffusion model of decision making, along with time-dependent thresholds and the idea of a Bayesian confidence readout, we derive expressions for the probability distribution over confidence reports. In line with current models of confidence, the derivations allow for the accumulation of “pipeline” evidence that has been received but not processed by the time of response, the effect of drift rate variability, and metacognitive noise. The expressions are valid for stimuli that change over the course of a trial with normally-distributed fluctuations in the evidence they provide. A number of approximations are made to arrive at the final expressions, and we test all approximations via simulation. The derived expressions contain only a small number of standard functions, and need to be evaluated only once per trial, making trial-by-trial modelling of confidence data in stochastically fluctuating stimuli tasks more feasible. We conclude by using the expressions to gain insight into the confidence of optimal observers and into empirically observed patterns.
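For orientation, the standard drift diffusion setup this framework builds on (a generic textbook form; the paper's notation may differ): accumulated evidence $x_t$ evolves as

\[ dx_t = \mu\,dt + \sigma\,dW_t, \]

with a response issued when $x_t$ first crosses a time-dependent threshold $\pm a(t)$, and confidence read out from the posterior probability of being correct given the evidence accumulated so far, including the “pipeline” evidence that arrives during the sensory-motor delay after the threshold crossing.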
Much evidence indicates that the perirhinal cortex is involved in the familiarity discrimination aspect of recognition memory. It has been previously shown under selective conditions that neural networks performing familiarity discrimination can achieve very high storage capacity, being able to deal with many times more stimuli than associative memory networks can in associative recall. The capacity of associative memories for recall has been shown to be highly dependent on the sparseness of coding. However, previous work on the networks of Bogacz et al., Norman and O'Reilly, and Sohal and Hasselmo that model familiarity discrimination in the perirhinal cortex has not investigated the effects of the sparseness of encoding on capacity. This paper explores how sparseness of coding influences the capacity of each of these published models and establishes that sparse coding influences the capacity of the different models in different ways. The capacity of the Bogacz et al. model can be made independent of the sparseness of coding. Capacity increases as coding becomes sparser for a simplified version of the neocortical part of the Norman and O'Reilly model, whereas capacity decreases as coding becomes sparser for a simplified version of the Sohal and Hasselmo model. Thus, in general, and in contrast to associative memory networks, sparse encoding results in little or no advantage for the capacity of familiarity discrimination networks. Hence it may be less important for coding to be sparse in the perirhinal cortex than it is in the hippocampus. Additionally, it is established that the capacities of the networks are strongly dependent on the precise form of the learning rules (synaptic plasticity) used in the network. This finding indicates that the precise characteristics of synaptic plastic changes in the real brain are likely to have major influences on storage capacity.
The backpropagation of error algorithm used to train deep neural networks has been fundamental to the successes of deep learning. However, it requires sequential backward updates and non-local computations, which make it challenging to parallelize at scale and are unlike how learning works in the brain. Neuroscience-inspired learning algorithms that utilize local learning, such as predictive coding, have the potential to overcome these limitations and advance beyond deep learning technologies in the future. While predictive coding originated in theoretical neuroscience as a model of information processing in the cortex, recent work has developed the idea into a general-purpose algorithm able to train deep neural networks using only local computations. In this survey, we review works that have contributed to this perspective and demonstrate the close connection between predictive coding and backpropagation in terms of generalization quality, as well as works that highlight the multiple advantages of using predictive coding over backpropagation-trained neural networks. Specifically, we show the substantially greater flexibility of predictive coding networks, which, unlike standard deep neural networks, can function as classifiers, generators, and associative memories simultaneously, and can be defined on arbitrary graph topologies. Finally, we review direct benchmarks of predictive coding networks on machine learning classification tasks, as well as their close connections to control theory and applications in robotics.
Deep brain stimulation (DBS) is known to be an effective treatment for a variety of neurological disorders, including Parkinson’s disease and essential tremor (ET). At present, it involves administering a train of pulses with constant frequency via electrodes implanted into the brain. New ‘closed-loop’ approaches involve delivering stimulation according to the ongoing symptoms or brain activity and have the potential to provide improvements in terms of efficiency, efficacy and reduction of side effects. The success of closed-loop DBS depends on being able to devise a stimulation strategy that minimizes oscillations in neural activity associated with symptoms of motor disorders. A useful stepping stone towards this is to construct a mathematical model which can describe how the brain oscillations should change when stimulation is applied at a particular state of the system. Our work focuses on the use of coupled oscillators to represent neurons in areas generating pathological oscillations. Using a reduced form of the Kuramoto model, we analyse how a patient should respond to stimulation when neural oscillations have a given phase and amplitude. We predict that, provided certain conditions are satisfied, the best stimulation strategy should be phase specific, but also that stimulation should have a greater effect if applied when the amplitude of brain oscillations is lower. We compare this surprising prediction with data obtained from ET patients. In light of our predictions, we also propose a new hybrid strategy which effectively combines two of the strategies found in the literature, namely phase-locked and adaptive DBS.

Author summary: Deep brain stimulation (DBS) involves delivering electrical impulses to target sites within the brain and is a proven therapy for a variety of neurological disorders. Closed-loop DBS is a promising new approach where stimulation is applied according to the state of a patient. Crucial to the success of this approach is being able to predict how a patient should respond to stimulation. Our work focuses on DBS as applied to patients with essential tremor (ET). On the basis of a theoretical model, which describes neurons as oscillators that respond to stimulation and have a certain tendency to synchronize, we provide predictions for how a patient should respond when stimulation is applied at a particular phase and amplitude of the ongoing tremor oscillations. Previous experimental studies of closed-loop DBS provided stimulation either on the basis of the ongoing phase or the amplitude of pathological oscillations. Our study suggests how both of these measurements can be used to control stimulation. As part of this work, we also look for evidence for our theories in experimental data and find our predictions to be satisfied in one patient. The insights obtained from this work should lead to a better understanding of how to optimise closed-loop DBS strategies.
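For context, the standard Kuramoto model of $N$ coupled phase oscillators (a generic form; the paper works with a reduced version of it):

\[ \dot{\theta}_i = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i), \qquad r e^{i\psi} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j}, \]

where each $\theta_i$ is an oscillator phase, $\omega_i$ its natural frequency, and $K$ the coupling strength. The order parameter magnitude $r$ measures how synchronized the population is, so it plays the role of the amplitude of the pathological oscillation, while $\psi$ plays the role of its phase; phase-locked and adaptive DBS can then be read as controlling stimulation using $\psi$ and $r$, respectively.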
Human decisions can be reflexive or planned, being governed respectively by model-free and model-based learning systems. These two systems might differ in their responsiveness to our needs. Hunger drives us to specifically seek food rewards, but here we ask whether it might have more general effects on these two decision systems. On one hand, the model-based system is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, the model-free system’s primitive reinforcement mechanisms may have closer ties to biological drives. Here, we tested participants on a well-established two-stage sequential decision-making task that dissociates the contributions of model-based and model-free control. Hunger enhanced overall performance by increasing model-free control, without affecting model-based control. These results demonstrate a generalised effect of hunger on decision-making that enhances reliance on primitive reinforcement learning, which in some situations translates into adaptive benefits.

Significance statement: The prevalence of obesity and eating disorders is steadily increasing. To counteract problems related to eating, people need to make rational decisions. However, appetite may switch us to a different decision mode, making it harder to achieve long-term goals. Here we show that planned and reinforcement-driven actions are differentially sensitive to hunger. Hunger specifically affected reinforcement-driven actions, and did not affect the planning of actions. Our data show that people behave differently when they are hungry. We also provide a computational model of how the behavioural changes might arise.
We assess risks differently when they are explicitly described, compared to when we learn directly from experience, suggesting dissociable decision-making systems. Our needs, such as hunger, could globally affect our risk preferences, but do they affect described and learned risks equally? On one hand, explicit decision-making is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, implicit preferences learned through reinforcement might be more strongly coupled to biological drives. To answer this, we asked participants to choose between two options with different risks, where the probabilities of monetary outcomes were either described or learned. In agreement with previous studies, rewarding contexts induced risk-aversion when risks were explicitly described, but risk-seeking when they were learned through experience. Crucially, hunger attenuated these contextual biases, but only for learned risks. The results suggest that our metabolic state determines risk-taking biases when we lack explicit descriptions.
To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, individual rewards should have less influence on the tracking of the average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules to track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches the theoretical limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by the standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
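A minimal sketch of the core idea, uncertainty-scaled prediction-error updates of a running mean and spread; the learning rates and the use of the mean absolute error as the noise scale are simplifying assumptions for illustration, not the paper's exact learning rules:

```python
def scaled_pe_step(r, m, s, alpha=0.1, beta=0.1, s_min=1e-3):
    """Update reward mean m and noise scale s after observing reward r.

    The mean moves by a prediction error divided by s, so when rewards
    are noisy (large s) each observation shifts the estimate less,
    qualitatively what a Kalman filter achieves through its gain.
    """
    delta = r - m                             # reward prediction error
    m = m + alpha * delta / max(s, s_min)     # uncertainty-scaled mean update
    s = s + beta * (abs(delta) - s)           # track typical error size as noise
    return m, s
```

In the abstract's proposed circuit mapping, the scaled quantity `delta / s` would correspond to the dopaminergic signal, with striatal neurons supplying the estimate of `s`.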
OBJECTIVES: The exact mechanisms of deep brain stimulation (DBS) are still an active area of investigation, in spite of its clinical successes. This is due in part to the lack of understanding of the effects of stimulation on neuronal rhythms. Entrainment of brain oscillations has been hypothesised as a potential mechanism of neuromodulation. A better understanding of entrainment might further inform existing methods of continuous DBS, and help refine algorithms for adaptive methods. The purpose of this study is to develop and test a theoretical framework to predict entrainment of cortical rhythms to DBS across a wide range of stimulation parameters. MATERIALS AND METHODS: We fit a model of interacting neural populations to selected features characterising PD patients' off-stimulation finely-tuned gamma rhythm recorded through electrocorticography. Using the fitted models, we predict basal ganglia DBS parameters that would result in 1:2 entrainment, a special case of sub-harmonic entrainment observed in patients and predicted by theory. RESULTS: We show that the neural circuit models fitted to patient data exhibit 1:2 entrainment when stimulation is provided across a range of stimulation parameters. Furthermore, we verify key features of the region of 1:2 entrainment in the stimulation frequency/amplitude space with follow-up recordings from the same patients, such as the loss of 1:2 entrainment above certain stimulation amplitudes. CONCLUSION: Our results reveal that continuous, constant-frequency DBS in patients may lead to nonlinear patterns of neuronal entrainment across stimulation parameters, and that these responses can be predicted by modelling. Should entrainment prove to be an important mechanism of therapeutic stimulation, our modelling framework may reduce the parameter space that clinicians must consider when programming devices for optimal benefit.
To optimally adjust our behavior to changing environments, we need to adjust both the speed of our decisions and the speed of our movements. Yet little is known about the extent to which these processes are controlled by common or separate mechanisms. Furthermore, while previous evidence from computational models and empirical studies suggests that the basal ganglia play an important role during adjustments of decision-making, it remains unclear how this is implemented. Leveraging the opportunity to directly access the subthalamic nucleus of the basal ganglia in humans undergoing deep brain stimulation surgery, we here combine invasive electrophysiological recordings, electrical stimulation and computational modelling of perceptual decision-making. We demonstrate that, while similarities between subthalamic control of decision speed and movement speed exist, the causal contribution of the subthalamic nucleus to these processes can be disentangled. Our results show that the basal ganglia independently control the speed of decisions and movements for each hemisphere during adaptive behavior.
Periodic features of neural time-series data, such as local field potentials (LFPs), are often quantified using power spectra. While the aperiodic exponent of spectra is typically disregarded, it is nevertheless modulated in a physiologically relevant manner and was recently hypothesised to reflect excitation/inhibition (E/I) balance in neuronal populations. Here, we used a cross-species in vivo electrophysiological approach to test the E/I hypothesis in the context of experimental and idiopathic Parkinsonism. We demonstrate in dopamine-depleted rats that aperiodic exponents and power at 30–100 Hz in subthalamic nucleus (STN) LFPs reflect defined changes in basal ganglia network activity; higher aperiodic exponents tally with lower levels of STN neuron firing and a balance tipped towards inhibition. Using STN-LFPs recorded from awake Parkinson’s patients, we show that higher exponents accompany dopaminergic medication and deep brain stimulation (DBS) of STN, consistent with untreated Parkinson’s manifesting as reduced inhibition and hyperactivity of STN. These results suggest that the aperiodic exponent of STN-LFPs in Parkinsonism reflects E/I balance and might be a candidate biomarker for adaptive DBS.
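For reference, the aperiodic exponent $\chi$ comes from modelling the power spectrum as a $1/f$-like background plus periodic peaks (a generic spectral parameterization; the paper's exact fitting procedure may differ):

\[ P(f) \approx \frac{b}{f^{\chi}} + \text{periodic peaks}, \qquad \log_{10} P(f) \approx \log_{10} b - \chi \log_{10} f \ \text{(away from peaks)}, \]

so a steeper spectral slope means a higher exponent $\chi$, which under the E/I hypothesis corresponds to a balance tipped towards inhibition.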