Volume 14, Issue 4 (Journal of Control, V.14, N.4 Winter 2021) | JoC 2021, 14(4): 55-66

Nikanjam A, Abdoos M, Mahdavi Moghadam M. Collaborative Multi-Agent Reinforcement Learning in Dynamic Environments using Knowledge Transfer for Herding Problem. JoC. 2021; 14(4): 55-66
URL: http://joc.kntu.ac.ir/article-1-642-en.html
1- K. N. Toosi University of Technology
2- Shahid Beheshti University
Abstract:
Collaborative multi-agent systems, in which a group of agents works together to reach a common goal, are now used to solve a wide range of problems. Cooperation between agents brings benefits such as reduced operational cost, high scalability, and significant adaptability. Reinforcement learning is usually employed to obtain an optimal policy for these agents. Learning in collaborative multi-agent dynamic environments with large, stochastic state spaces has become a major challenge in many applications. The difficulties include the effect of state-space size on learning time, ineffective collaboration between agents, and a lack of appropriate coordination between the agents' decisions. Reinforcement learning itself also poses challenges, such as the difficulty of determining an appropriate learning goal or reward and the long convergence time caused by trial-and-error learning. This paper introduces a communication framework for collaborative multi-agent systems that attempts to address some of these challenges in the herding problem. To handle the convergence problem, knowledge transfer is used, which can significantly increase the efficiency of reinforcement learning algorithms. Cooperation between the agents is carried out through a head agent in each group of agents, and coordination through a coordinator agent. The framework has been successfully applied to herding problem instances, and experimental results show a significant improvement in the performance of the agents.
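The abstract does not say how knowledge is transferred, so the following is only a minimal sketch of one common form of transfer in tabular reinforcement learning: warm-starting the Q-table of a new task with the Q-table learned on an earlier one. It is not the authors' framework. The 1-D corridor task, the function names `step` and `q_learning`, and all hyperparameter values are hypothetical and chosen purely for illustration.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 1)  # hypothetical toy task: move left or right along a 1-D corridor


def step(state, action, goal, size):
    """One environment step: reward 1.0 on reaching the goal, otherwise 0.0."""
    nxt = min(max(state + action, 0), size - 1)  # walls at both ends
    done = nxt == goal
    return nxt, (1.0 if done else 0.0), done


def q_learning(size, goal, episodes, q_init=None,
               alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning; q_init (assumed transfer mechanism) warm-starts the table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    if q_init:
        q.update(q_init)  # knowledge transfer: reuse values learned on a source task
    for _ in range(episodes):
        state = 0
        for _ in range(10 * size):  # cap on episode length
            if rng.random() < eps:
                action = rng.choice(ACTIONS)  # explore
            else:
                # Greedy action with random tie-breaking.
                values = [q[(state, a)] for a in ACTIONS]
                best = max(values)
                action = rng.choice(
                    [a for a, v in zip(ACTIONS, values) if v == best])
            nxt, reward, done = step(state, action, goal, size)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            # Standard Q-learning update.
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
    return dict(q)


# Source task: short corridor. Target task: longer corridor, warm-started.
source_q = q_learning(size=5, goal=4, episodes=200)
target_q = q_learning(size=8, goal=7, episodes=50, q_init=source_q)
```

In this sketch the target learner starts with a preference for moving right in the states it shares with the source task, which is the intuition behind using transfer to shorten convergence time. Whether it helps in practice depends on how similar the two tasks are.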
Type of Article: Research paper | Subject: Special
Received: 2019/01/20 | Accepted: 2019/12/26 | ePublished ahead of print: 2020/10/5 | Published: 2021/02/19


Rights and permissions
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2021 CC BY-NC 4.0 | Journal of Control