Combining Correlation-Based and Reward-Based Learning in Neural Control for Policy Improvement

ISSN
1793-6802
0219-5259
Date Issued
2013
Author(s)
Manoonpong, Poramate
Kolodziejski, Christoph
Woergoetter, Florentin
Morimoto, Jun
DOI
10.1142/S021952591350015X
Abstract
Classical conditioning (conventionally modeled as correlation-based learning) and operant conditioning (conventionally modeled as reinforcement learning or reward-based learning) have both been observed in biological systems, and evidence shows that the two mechanisms strongly involve learning about associations. Motivated by these biological findings, we propose a new learning model for achieving successful control policies in artificial systems. The model combines correlation-based learning, implemented as input correlation learning (ICO learning), with reward-based learning, implemented as continuous actor-critic reinforcement learning (RL), thereby working as a dual-learner system. Its performance is evaluated in simulations of a cart-pole system, as a dynamic motion control problem, and of a mobile robot, as a goal-directed behavior control problem. The results show that the model strongly improves the pole-balancing control policy: the controller learns to stabilize the pole over a larger domain of initial conditions than either learning mechanism achieves alone. The model also finds a successful control policy for goal-directed behavior: the robot learns to approach a given goal more effectively than with either of its individual components. The study thus sharpens our understanding of how two different learning mechanisms can be combined and complement each other in solving complex tasks.
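
To make the combination described in the abstract concrete, the following is a minimal, hypothetical sketch in Python, not the authors' code: an ICO learner whose predictive weights follow the input correlation rule, and a linear continuous actor-critic learner trained by the TD error, with their outputs summed into one control signal. The toy one-dimensional plant, the feature vector, the reflex threshold, and all learning rates are illustrative assumptions; the sketch shows the wiring of the two learners, not a reproduction of the paper's cart-pole or mobile-robot experiments.

import numpy as np

rng = np.random.default_rng(0)

class ICOLearner:
    """Correlation-based learner (ICO learning, sketched): predictive
    weights w_i change with the correlation between predictive inputs
    x_i and the temporal derivative of the reflex signal x0."""
    def __init__(self, n_pred, mu=0.05):
        self.w0 = 1.0                     # fixed reflex weight
        self.w = np.zeros(n_pred)         # plastic predictive weights
        self.mu = mu
        self.prev_x0 = 0.0

    def output(self, x0, x_pred):
        dx0 = x0 - self.prev_x0           # discrete-time d(x0)/dt
        self.w += self.mu * x_pred * dx0  # dw_i = mu * x_i * dx0
        self.prev_x0 = x0
        return self.w0 * x0 + self.w @ x_pred

class ActorCritic:
    """Reward-based learner: linear critic V(s) = v . phi(s) trained by
    TD(0); linear actor with Gaussian exploration, its weights nudged
    along the explored direction in proportion to the TD error."""
    def __init__(self, n_feat, alpha_v=0.1, alpha_a=0.02, gamma=0.95, sigma=0.2):
        self.v = np.zeros(n_feat)         # critic weights
        self.a = np.zeros(n_feat)         # actor weights
        self.alpha_v, self.alpha_a = alpha_v, alpha_a
        self.gamma, self.sigma = gamma, sigma
        self.noise = 0.0

    def act(self, phi):
        self.noise = self.sigma * rng.standard_normal()
        return self.a @ phi + self.noise

    def learn(self, phi, r, phi_next):
        delta = r + self.gamma * (self.v @ phi_next) - self.v @ phi
        self.v += self.alpha_v * delta * phi
        self.a += self.alpha_a * delta * self.noise * phi
        return delta

# Dual-learner loop on a toy unstable 1-D plant, x' = x + dt*(0.5*x + u).
# The reflex x0 fires only on large errors, the raw error serves as the
# predictive input, and the reward penalizes the squared error.
ico = ICOLearner(n_pred=1)
rl = ActorCritic(n_feat=1)
dt, x = 0.05, 0.5
for _ in range(2000):
    phi = np.array([x])
    x0 = x if abs(x) > 0.4 else 0.0
    u = -(ico.output(x0, phi) + rl.act(phi))  # summed control signal
    x_next = x + dt * (0.5 * x + u)
    rl.learn(phi, -x_next**2, np.array([x_next]))
    x = x_next
print(f"final |x| = {abs(x):.4f}")  # wiring demo; convergence depends on the toy constants

Summing the two outputs reflects the division of labor the abstract describes: the correlation-based pathway supplies fast, reflex-anchored corrections, while the reward-based pathway shapes the policy through the TD error.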