1- Yazd University
Abstract:
This paper presents a new method that uses data collected from an agent's random movement in its environment to initialize the parameters of a controller with a fuzzy reinforcement learning structure. Slow learning and a high failure rate during training are two major challenges in such structures, and a suitable initial parameterization of the fuzzy system can address both. To initialize these parameters, the discrete value iteration method is extended to continuous state spaces without relying on derivative-based methods. First, data is collected through random interaction with the environment. Because the state space is continuous, the data is clustered and each cluster is treated as a state. Then, by generalizing the standard value iteration method to the continuous case, the transition probability matrix and the expected immediate reward matrix are calculated. The results of this stage are used to initialize the parameters of the fuzzy reinforcement learning structure, and these parameters are subsequently fine-tuned using reinforcement learning. The proposed method, called "Value Iteration based Fuzzy Reinforcement Learning", is applied to the problem of target-following robots. Experimental results on this problem indicate that the proposed method significantly improves performance.
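The pipeline described in the abstract (cluster the sampled states, estimate a transition model and expected rewards over the clusters, then run value iteration) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and the function names (`kmeans`, `estimate_model`, `value_iteration`), the discount factor `gamma`, and the discrete action set are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means (an assumed choice; the paper's clustering may differ).
    X has shape (n_samples, n_features). Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

def estimate_model(s, a, r, s_next, n_states, n_actions):
    """Empirical transition probabilities P[s, a, s'] and mean rewards R[s, a]
    from transitions whose states are already mapped to cluster indices."""
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros((n_states, n_actions))
    np.add.at(counts, (s, a, s_next), 1.0)
    np.add.at(reward_sum, (s, a), r)
    visits = counts.sum(axis=2)
    # Unvisited (state, action) pairs are left at zero rather than guessed.
    P = np.divide(counts, visits[:, :, None],
                  out=np.zeros_like(counts), where=visits[:, :, None] > 0)
    R = np.divide(reward_sum, visits,
                  out=np.zeros_like(reward_sum), where=visits > 0)
    return P, R

def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iter=10000):
    """Standard value iteration on the estimated model. Returns (Q, V)."""
    V = np.zeros(P.shape[0])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)           # shape (n_states, n_actions)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return Q, V
```

One plausible way to use the output, again as an assumption rather than the paper's stated procedure, is to take the cluster centers as the centers of the fuzzy membership functions and the rows of `Q` as initial action values for the corresponding rules, leaving reinforcement learning to fine-tune them afterwards.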
Type of Article: Research paper | Subject: Special
Received: 2023/12/1 | Accepted: 2024/06/16 | ePublished ahead of print: 2024/07/28