Learning from delayed reward und punishment in a spiking neural network model of basal ganglia with opposing D1/D2 plasticity


Abstract

Extending previous work, we introduce a spiking actor-critic network model of learning from reward and punishment in the basal ganglia. In the model, the striatum is taken to be segregated into populations of medium spiny neurons (MSNs) that express either the D1 or the D2 dopamine receptor type. This segregation allows explicit representation of both positive and negative expected outcomes within the respective populations. In line with recent experiments, we further assume that the D1 and D2 MSN populations have opposing dopamine-modulated bidirectional synaptic plasticity. Experiments were conducted in a grid world, where a moving agent had to reach a remote rewarded goal state. Unlike the previous model, the network learned not only to approach the rewarded goal but also to consistently avoid punishments. The spiking network model explains the functional role of D1/D2 MSN segregation within the striatum, specifically the reversed direction of dopamine-dependent plasticity found at synapses converging on different MSNs. © 2012 Springer-Verlag.
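The opposing plasticity idea can be illustrated with a small rate-based sketch. This is not the authors' spiking model: it is a tabular TD(0) critic on a short chain of states, and every name in it (`td_error`, `opposing_update`, `learn_chain`) as well as the readout V(s) = D1(s) - D2(s), the non-negative weight clipping, and the parameter values are assumptions made for illustration only.

```python
# Hypothetical rate-based sketch, NOT the spiking network from the paper.
# Assumption: each state has one non-negative D1 weight and one non-negative
# D2 weight, and the state value is read out as V(s) = d1[s] - d2[s].

def td_error(reward, v_next, v_now, gamma=0.9):
    """Dopamine-like signal: positive for better-than-expected outcomes."""
    return reward + gamma * v_next - v_now


def opposing_update(w_d1, w_d2, delta, lr=0.2):
    """Apply the same signal with opposite sign to D1 and D2 weights."""
    w_d1 = max(0.0, w_d1 + lr * delta)  # D1: grows when delta > 0
    w_d2 = max(0.0, w_d2 - lr * delta)  # D2: grows when delta < 0
    return w_d1, w_d2


def learn_chain(rewards, episodes=200, gamma=0.9, lr=0.2):
    """Walk states 0..n-1 left to right; rewards[s] is received on leaving s."""
    n = len(rewards)
    d1 = [0.0] * n
    d2 = [0.0] * n
    for _ in range(episodes):
        for s in range(n):
            v_now = d1[s] - d2[s]
            # The state after the last one is terminal and has value 0.
            v_next = d1[s + 1] - d2[s + 1] if s + 1 < n else 0.0
            delta = td_error(rewards[s], v_next, v_now, gamma)
            d1[s], d2[s] = opposing_update(d1[s], d2[s], delta, lr)
    return d1, d2


if __name__ == "__main__":
    print("rewarded chain:", learn_chain([0.0, 0.0, 1.0]))
    print("punished chain:", learn_chain([0.0, 0.0, -1.0]))
```

Under these assumptions, a chain ending in reward builds up only the D1 weights, while a chain ending in punishment builds up only the D2 weights, so positive and negative expected outcomes end up stored in separate populations.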

Citation

Jitsev, J., Abraham, N., Morrison, A., & Tittgemeyer, M. (2012). Learning from delayed reward und punishment in a spiking neural network model of basal ganglia with opposing D1/D2 plasticity. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7552 LNCS, pp. 459–466). https://doi.org/10.1007/978-3-642-33269-2_58
