Journal of Control, Vol. 12, No. 2, Summer 1397 (2018), pp. 13-25
Suboptimal Solution of Nonlinear Differential Graphical Games Using Single-Network Approximate Dynamic Programming
Majid Mazouchi 1, Mohammad Bagher Naghibi Sistani* 1, Seyed Kamal Hosseini Sani 1
1- Ferdowsi University of Mashhad
Abstract:

This paper proposes an online learning algorithm, based on single-network approximate dynamic programming, for approximately solving nonlinear continuous-time differential graphical games with an infinite-horizon cost function and deterministic dynamics. In differential graphical games, the agents aim to track the leader's state optimally, and the error dynamics and performance index of each agent depend on the topology of the game's interaction graph. In the proposed algorithm, each agent uses only one critic neural network to approximate its optimal value function and control policy, and it updates the weights of that critic network online using the proposed weight tuning laws. Local stabilizing switches are introduced into the weight tuning laws; these guarantee stability of the closed-loop system and convergence to the Nash equilibrium policies, so an initial set of stabilizing control policies is no longer required. Lyapunov theory is used to prove the stability of the closed-loop system. Finally, a simulation example demonstrates the effectiveness of the proposed algorithm.
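The dependence of each agent's error on the graph topology can be illustrated with the local neighborhood tracking error commonly used in the graphical-games literature, delta_i = sum_j a_ij (x_i - x_j) + g_i (x_i - x_0). The following is only a minimal sketch under stated assumptions: the abstract does not give the paper's exact formulation, and the function name, the scalar states, and the three-agent graph are hypothetical choices made for illustration.

```python
def local_tracking_errors(adjacency, pinning, states, leader_state):
    """Return delta_i = sum_j a_ij*(x_i - x_j) + g_i*(x_i - x_0) for each agent i.

    adjacency[i][j] is the edge weight a_ij (agent i receives information from j),
    pinning[i] is the gain g_i (nonzero only if agent i observes the leader).
    Scalar states are assumed here purely to keep the sketch short.
    """
    n = len(states)
    errors = []
    for i in range(n):
        neighbor_term = sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        leader_term = pinning[i] * (states[i] - leader_state)
        errors.append(neighbor_term + leader_term)
    return errors


# Hypothetical directed chain: agent 0 observes the leader,
# agent 1 listens to agent 0, and agent 2 listens to agent 1.
adjacency = [[0, 0, 0],
             [1, 0, 0],
             [0, 1, 0]]
pinning = [1, 0, 0]
states = [2.0, 1.0, 4.0]
leader_state = 0.0

errors = local_tracking_errors(adjacency, pinning, states, leader_state)
print(errors)  # [2.0, -1.0, 3.0]
```

Each agent's error uses only its own state and the states of its in-neighbors, which is why both the error dynamics and the per-agent performance index inherit the structure of the interaction graph. When every agent's state equals the leader's state, all local errors are zero.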

Keywords: approximate dynamic programming, neural networks, optimal control, reinforcement learning
Study type: Research | Subject: Special
Received: 1395/3/27 | Accepted: 1396/9/19 | Published: 1397/7/11 (Solar Hijri calendar)
Mazouchi M, Naghibi Sistani M B, Hosseini Sani S K. Suboptimal Solution of Nonlinear Graphical Games Using Single Network Approximate Dynamic Programming. JoC. 2018; 12 (2): 13-25
URL: http://joc.kntu.ac.ir/article-1-382-fa.html
