Journal of Control, Volume 17, Issue 4 (Winter 1402), pages 75-87

Manoochehri Rahbar S N, Pariz N, Ramezani-al M R, Heydari A. An online policy iteration for adaptive optimal control of unknown bilinear systems. JoC 2024; 17 (4) :75-87
URL: http://joc.kntu.ac.ir/article-1-1004-fa.html
Manoochehri Rahbar Seyedeh Nafiseh, Pariz Naser, Ramezani-al Mohammad Reza, Heydari Aghileh. Online adaptive optimal control of continuous-time bilinear systems with unknown dynamics. Journal of Control. 1402; 17 (4) :75-87



1- Department of Mathematics, Payame Noor University, P.O. Box 19395-4697, Tehran, Iran
2- Department of Electrical Engineering, Faculty of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
3- Department of Electrical Engineering, Faculty of Electrical and Computer Engineering, Quchan University of Technology, Quchan, Iran
Abstract:
Designing an optimal controller for continuous-time bilinear systems from Bellman's principle of optimality is computationally expensive even when the system dynamics are known, so the controller is usually designed with approximate methods that rely on knowledge of those dynamics. When the dynamics are unknown, the problem becomes considerably harder. The first remedy that comes to mind is to identify the bilinear system with system identification methods. As is well known, however, identification methods give the designer a linearized model built from the system's input and output data, and the controller is then designed on that model. This paper proposes a new iterative method, based on an online and adaptive procedure, for designing an optimal controller for a bilinear system whose dynamics are unknown. In the proposed method the optimal controller is designed adaptively from online input information and state measurements rather than from knowledge of the bilinear system dynamics. In addition, applying noise as the input to the system over a specific time interval removes the need to re-measure the states in later iterations. Convergence of the adaptive iterative method to the optimal controller is stated as a theorem and proved.
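As a point of reference for the policy-iteration idea the abstract builds on, the sketch below runs Kleinman's classical model-based iteration for a scalar linear system, that is, the special case in which the bilinear term is absent and the dynamics are known. It is not the paper's algorithm, which replaces the model with online input and state data. The scalar model dx/dt = a x + b u, the cost weights q and r, and both function names are assumptions chosen only for illustration.

```python
import math


def kleinman_scalar(a, b, q, r, k0, iterations=10):
    """Model-based policy iteration for dx/dt = a*x + b*u with u = -k*x.

    Illustrative assumption: scalar linear system with known a and b and
    cost = integral of (q*x^2 + r*u^2). Not the paper's data-driven method.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain k0 must stabilise the closed loop")
    k = k0
    history = []
    for _ in range(iterations):
        # Policy evaluation: solve 2*(a - b*k)*p + q + r*k^2 = 0 for p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: new gain k = b*p / r.
        k = b * p / r
        history.append(p)
    return k, history


def riccati_scalar(a, b, q, r):
    """Positive root of 2*a*p - (b^2/r)*p^2 + q = 0, used for comparison."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    k, ps = kleinman_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
    print("iterates of p:", ps[:4])
    print("Riccati value:", riccati_scalar(1.0, 1.0, 1.0, 1.0))
```

With a = b = q = r = 1 and the stabilising initial gain k0 = 2, the first iterate is p = 2.5 and later iterates decrease toward the Riccati value 1 + sqrt(2). Each step only needs the current gain to be stabilising, which is why policy iteration schemes of this kind start from an admissible initial policy.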
Full text [PDF 796 kb]
Type of study: Research | Subject: Special
Received: 1402/7/16 | Accepted: 1402/11/12 | Published electronically ahead of final publication: 1402/11/25 | Published: 1402/12/1



Rights and permissions
This article may be redistributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License.


© 2024 CC BY-NC 4.0 | Journal of Control
