Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic

Mashayekhi, Hoda; Nazari, Mostafa

doi:10.52547/joc.15.4.13

Volume 15, Issue 4 (Journal of Control, V.15, N.4 Winter 2022) JoC 2022, 15(4): 13-23

DOR: 20.1001.1.20088345.1400.15.4.2.3

Mashayekhi H, Nazari M. Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic. JoC 2022; 15(4): 13-23
URL: http://joc.kntu.ac.ir/article-1-760-en.html

Hoda Mashayekhi¹, Mostafa Nazari*¹

1- Shahrood University of Technology

Abstract:

In this paper, a model-free, reinforcement learning-based controller is designed to extract a treatment protocol, because the highly nonlinear dynamics of cancer make the design of a model-based controller complex. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In Q-learning, each entry of the Q-table is updated using the observed states, the action, and the reward; here the action is the chemo-drug dose. The proposed controller is implemented on a four-state mathematical model comprising immune cells, tumor cells, healthy cells, and the chemo-drug concentration in the bloodstream. Three treatment strategies are proposed for a young, an old, and a pregnant patient, each taking the patient's age into account. Chemotherapy is used in all cases. For the older patient, immunotherapy is also used to modify the dynamics of the cancer by reinforcing the patient's weak immune system. A Mamdani fuzzy inference system is designed to limit the maximum chemo-drug dose according to the patient's age. Simulation results show the effectiveness of the proposed treatment strategy. They also show that immunotherapy is necessary for finite-duration cancer treatment in patients with a weak immune system. The main advantage of the strategy is that it is model-free.
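As an illustration of the Q-table update mentioned in the abstract, the following is a minimal sketch of the standard tabular Q-learning rule with the dose level as the action. It is not the authors' implementation: the state discretisation, the reward value, the learning rate `alpha`, the discount factor `gamma`, and the helper names `q_update` and `choose_dose_index` are all assumptions made for the example. The plain integer cap `max_index` only stands in for the paper's fuzzy dose limit, which is not specified on this page.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply the standard Q-learning update to one Q-table entry.

    alpha (learning rate) and gamma (discount factor) are illustrative
    values, not taken from the paper.
    """
    best_next = max(q[next_state])  # greedy value of the next state
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


def choose_dose_index(q, state, max_index, epsilon=0.1, rng=random):
    """Epsilon-greedy choice among dose levels 0..max_index (inclusive).

    max_index is a hypothetical stand-in for an age-dependent dose cap.
    """
    allowed = list(range(max_index + 1))
    if rng.random() < epsilon:
        return rng.choice(allowed)  # explore
    return max(allowed, key=lambda a: q[state][a])  # exploit


# Hypothetical discretisation: 3 tumor-size bins, 4 dose levels.
n_states, n_actions = 3, 4
q_table = [[0.0] * n_actions for _ in range(n_states)]

# One update: in state 2, dose level 1 was applied, a reward of 1.0 was
# received (assumed value), and the system moved to state 1.
new_value = q_update(q_table, state=2, action=1, reward=1.0, next_state=1)
```

With `epsilon=0.0` the selection is purely greedy, so after the single update above `choose_dose_index(q_table, 2, max_index=2, epsilon=0.0)` picks dose level 1, while `max_index=0` forces the lowest dose level regardless of the learned values.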

Keywords: Cancer, Chemotherapy, Control, Reinforcement learning

Type of Article: Research paper | Subject: General
Received: 2020/05/8 | Accepted: 2020/11/30 | ePublished ahead of print: 2021/04/27 | Published: 2021/12/22

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
