Journal of Control

fa
Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic
General
Research paper

This paper uses a model-free control method to derive a treatment protocol, because model-based methods face difficulties such as guaranteeing stability and design complexity, owing to the highly nonlinear nature of cancer dynamics and the presence of many uncertainties. A closed-loop control method based on reinforcement learning is used to determine and optimize the drug dose, and the Q-learning method is used to obtain the optimal controller. In this learning method, each entry of the Q-table represents the desirability of a chosen action, that is, a chemotherapy drug dose, for a given patient state. The table is updated using information received from the system state, the action, and the reward. To demonstrate the effectiveness of the control method, a mathematical model with four state variables is used: immune cells, cancer cells, healthy cells, and chemotherapy drug concentration in the blood. Three patients (young, old, and pregnant) with different conditions and parameters are considered, and a fuzzy system is used to limit the chemotherapy drug dose based on the patient's age. In the old patient, because of the weak immune system, immunotherapy is used in addition to chemotherapy, which leads to a lasting strengthening of the immune system. Simulation results for the three patients show that the proposed optimal control algorithm is effective in treating cancer and applicable to patients with different conditions. In all patients, the cancer is treated within a finite time, after which drug administration is stopped. It is also shown that immunotherapy is necessary for finite-time treatment in patients with a weak immune system.

In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol, because designing a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data from the state, action, and reward. The action is the chemo-drug dose. The proposed controller is implemented on a four-state mathematical model comprising immune cells, tumor cells, healthy cells, and chemo-drug concentration in the bloodstream. Three different treatment strategies are proposed for three patients (young, old, and pregnant), taking each patient's age into account. Chemotherapy is used in all cases. In the older patient, immunotherapy is also used to modify the dynamics of the cancer by reinforcing the patient's weak immune system. A Mamdani fuzzy inference system is designed to limit the maximum chemo-drug dose according to the patient's age. Simulation results show the effectiveness of the proposed treatment strategy. It is also shown that immunotherapy is necessary for finite-duration cancer treatment in patients with a weak immune system. The main advantage of the proposed strategy is that it is model-free.

Cancer, Chemotherapy, Immunotherapy, Control, Reinforcement learning
Pages 13-23
http://joc.kntu.ac.ir/browse.php?a_code=A-10-1177-1&slc_lang=fa&sid=1

Hoda Mashayekhi (هدی مشایخی), hmashayekhi@shahroodut.ac.ir, 10031947532846008611, No, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
Mostafa Nazari (مصطفی نظری), nazari_mostafa@shahroodut.ac.ir, 10031947532846008612, Yes, Faculty of Mechanical and Mechatronics Engineering, Shahrood University of Technology, Shahrood, Iran
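
The abstract says that each Q-table entry is updated from the state, the action (the chemo-drug dose), and the reward, and that the maximum dose is capped. The following is a minimal sketch of a standard tabular Q-learning step with a capped action set, not the authors' implementation. The state discretisation, dose levels, learning rate `alpha`, discount `gamma`, exploration rate `epsilon`, and the names `q_update` and `choose_dose` are all assumptions made for illustration. The age-based Mamdani fuzzy system is not reproduced; its output is represented only by a hypothetical `max_dose` argument.

```python
import random


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q_table[next_state])      # greedy value of the next state
    td_target = reward + gamma * best_next    # bootstrapped target
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


def choose_dose(q_table, state, doses, max_dose, epsilon=0.1, rng=random):
    """Epsilon-greedy choice restricted to dose levels not exceeding max_dose.

    max_dose stands in for the output of a dose limiter (assumed here to be
    given); at least one dose level must satisfy dose <= max_dose.
    """
    allowed = [i for i, dose in enumerate(doses) if dose <= max_dose]
    if rng.random() < epsilon:
        return rng.choice(allowed)                          # explore
    return max(allowed, key=lambda i: q_table[state][i])    # exploit


# Hypothetical discretisation: 4 patient states, 5 normalised dose levels.
doses = [0.0, 0.25, 0.5, 0.75, 1.0]
n_states = 4
q_table = [[0.0] * len(doses) for _ in range(n_states)]

# One illustrative update: in state 2, dose index 3 earned reward 1.0
# and led to state 1.
new_q = q_update(q_table, state=2, action=3, reward=1.0, next_state=1)
```

Restricting the candidate actions before the greedy step, rather than clipping the chosen dose afterwards, keeps the learned values aligned with the doses that can actually be administered. This is one possible design choice, and the record does not state which one the paper uses.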