Closed-Loop Q-Learning Control with Spiking Neuromorphic Network

EasyChair Preprint 15674, 6 pages. Date: January 6, 2025

Abstract

Neuromorphic processors offer a promising, low-power alternative to standard von Neumann architectures. For these processors to be used effectively, it is important to enable end-to-end processing entirely on-chip, avoiding the dominant power consumption of standard computational components. For this reason, control problems in autonomous agents present a compelling domain for applying neuromorphic solutions. In this simulation study, we introduce a closed-loop, spiking implementation of Q-learning. Here, we study a proof-of-principle problem: cartpole balancing. Our approach uses OpenAI Gym for off-chip, in-the-loop simulation of cartpole dynamics. Unlike our previous work, in which the Q-learning matrix was learned entirely off-chip and then transferred on-chip for testing, we now showcase entirely spike-based training implemented in Intel's Lava Software Framework, a software environment for neuromorphic simulation. We show that, after training, the agent balances the cartpole well. Our spiking implementation is a first step towards full, on-chip Q-learning.

Keyphrases: Hebbian learning, reinforcement learning, spiking neural networks, Q-learning control, sparse coding, synfire-gated synfire chains
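For orientation, the closed-loop setup the abstract describes pairs an in-the-loop cartpole simulator with a Q-learner that is updated online. Below is a minimal, conventional (non-spiking) sketch of that loop using tabular Q-learning on Gym's CartPole-v1. The bin edges, hyperparameters, and the pre-0.26 gym API (reset returns the observation; step returns a 4-tuple) are illustrative assumptions, not values from the paper, and the sketch does not reproduce the paper's spike-based, Hebbian training in Lava.

```python
# Minimal tabular Q-learning on CartPole, as a non-spiking reference for the
# closed-loop setup in the abstract. All parameters are illustrative.
import numpy as np
import gym  # assumes the classic gym API (gym < 0.26)

env = gym.make("CartPole-v1")

# Discretize the 4-D continuous observation into a small grid of states.
BINS = [
    np.linspace(-2.4, 2.4, 9),    # cart position
    np.linspace(-3.0, 3.0, 9),    # cart velocity
    np.linspace(-0.21, 0.21, 9),  # pole angle (rad)
    np.linspace(-3.0, 3.0, 9),    # pole angular velocity
]

def discretize(obs):
    return tuple(int(np.digitize(x, edges)) for x, edges in zip(obs, BINS))

q_table = np.zeros([len(b) + 1 for b in BINS] + [env.action_space.n])
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # illustrative hyperparameters

for episode in range(2000):
    state = discretize(env.reset())
    done = False
    while not done:
        # Epsilon-greedy action selection from the Q-table.
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(q_table[state]))
        obs, reward, done, _ = env.step(action)
        next_state = discretize(obs)
        # Standard tabular Q-learning update.
        target = reward + gamma * np.max(q_table[next_state]) * (not done)
        q_table[state + (action,)] += alpha * (target - q_table[state + (action,)])
        state = next_state
```

In the paper's approach, this same training loop is realized entirely with spikes: the Q-matrix and its updates are carried by a spiking network simulated in Lava, while Gym supplies the cartpole dynamics off-chip.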