Value Iteration based Fuzzy Reinforcement Learning in Target Following Robot

Nadi, Farzaneh; Derhami, Vali; Alamiyan Harandi, Farinaz

Journal of Control, Volume 18, Issue 2 (Summer 1403), pages 1-12

Nadi F, Derhami V, Alamiyan Harandi F. Value Iteration based Fuzzy Reinforcement Learning in Target Following Robot. JoC 2024; 18 (2) :1-12
URL: http://joc.kntu.ac.ir/article-1-1012-fa.html

Farzaneh Nadi¹, Vali Derhami*¹, Farinaz Alamiyan Harandi²

1- Faculty of Computer Engineering, Yazd University, Yazd, Iran
2- Faculty of Electrical and Computer Engineering, Isfahan University of Technology, Isfahan, Iran

Abstract:

This paper presents a new method that uses data collected from an agent's random movement in its environment to initialize the parameters of a controller with a fuzzy reinforcement learning structure. Slow training and a high number of failures during training are two major challenges for such structures, and initializing the parameters of the fuzzy system is a suitable way to address them. In this paper, the fuzzy system parameters are initialized by generalizing discrete value iteration to the continuous case, without relying on derivative-based methods. First, the relevant data are collected through random interaction of the agent with the environment. Because the state space is continuous, the data are suitably clustered and each cluster is treated as a state. Then, by generalizing standard value iteration to the continuous case, the transition probability matrix from each state-action pair to the next state and the expected immediate reward from each state-action pair to the next state are computed. The results of this stage are used to initialize the parameters of the fuzzy reinforcement learning structure. After that, the parameters of this structure are fine-tuned online by reinforcement learning. The proposed method is called "Value Iteration based Fuzzy Reinforcement Learning" and is applied to the target following robot problem. The experimental results indicate a considerable improvement in performance for the proposed method on this problem.
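The offline stage described in the abstract (cluster the randomly collected states, estimate a transition and reward model over the clusters, then run value iteration) can be illustrated with the following minimal sketch. It is not the authors' implementation: the plain k-means clustering, the discrete action set, the discount factor, and all function names (`kmeans`, `estimate_model`, `value_iteration`) are assumptions made for illustration only. How the resulting values are mapped onto the fuzzy system's parameters is not specified in the abstract and is left out.

```python
import numpy as np


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means (an assumed stand-in for the paper's clustering step).

    points: array of shape (n_samples, n_features).
    Returns (centers, labels); each cluster index is used as a discrete state.
    """
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = points[labels == j].mean(axis=0)
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


def estimate_model(s_idx, actions, rewards, s_next_idx, n_states, n_actions):
    """Estimate P[s, a, s'] and the mean reward R[s, a, s'] from samples.

    s_idx, actions, s_next_idx are integer arrays; rewards is a float array.
    State-action pairs that were never visited keep all-zero rows.
    """
    counts = np.zeros((n_states, n_actions, n_states))
    reward_sum = np.zeros_like(counts)
    np.add.at(counts, (s_idx, actions, s_next_idx), 1.0)
    np.add.at(reward_sum, (s_idx, actions, s_next_idx), rewards)
    visits = counts.sum(axis=2, keepdims=True)
    P = np.divide(counts, visits, out=np.zeros_like(counts), where=visits > 0)
    R = np.divide(reward_sum, counts, out=np.zeros_like(counts), where=counts > 0)
    return P, R


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Standard tabular value iteration over the clustered states.

    Returns (Q, V), with Q of shape (n_states, n_actions).
    """
    V = np.zeros(P.shape[0])
    for _ in range(max_iters):
        Q = (P * (R + gamma * V[None, None, :])).sum(axis=2)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = (P * (R + gamma * V[None, None, :])).sum(axis=2)
    return Q, V
```

In this sketch the cluster labels of the sampled states and next states would be passed to `estimate_model` as `s_idx` and `s_next_idx`. The resulting `Q` table gives one value per cluster and action, which is the kind of quantity that could seed a fuzzy controller before online fine-tuning; that mapping is an assumption here, not something the abstract details.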

Keywords: fuzzy controller, reinforcement learning, dynamic programming, clustering, target following robot

Type of study: Research | Subject: Special
Received: 1402/9/10 | Accepted: 1403/3/27 | Published online ahead of final publication: 1403/5/7 | Published: 1403/6/30

Rights and permissions
	This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.
