Dynamic Control of Stochastic Evolution: A Deep Reinforcement Learning Approach to Adaptively Targeting Emergent Drug Resistance
The challenge in controlling stochastic systems in which random events can set the system on catastrophic trajectories is to develop a robust ability to respond to such events without significantly compromising the optimality of the baseline control policy. Drug resistance can emerge from random and variable mutations in targeted cell populations; in the absence of an appropriate dosing policy, emergent resistant subpopulations can proliferate and lead to treatment failure. Dynamic feedback dosage control holds promise in combating this phenomenon, but cell population evolutionary dynamics are complex, stochastic, and often high-dimensional, posing significant challenges to system control. This paper presents CelluDose, a deep reinforcement learning closed-loop dynamic control prototype for automated precision drug dosing targeting stochastic and heterogeneous cell proliferation. Developing optimal dosing schedules for preventing therapy-induced drug resistance involves a tradeoff between the effective suppression of emergent resistant cell subpopulations and the use of conservative dosages and a preference for first-line drugs. CelluDose is trained on model simulations of cell population evolutionary dynamics that combine a system of stochastic differential equations and the additional occurrence of random perturbing events. Both the single-drug and combination therapy policies obtained in training exhibit a 100% success rate at suppressing simulated heterogeneous harmful cell growth and responding to diverse system fluctuations and perturbations within the allotted time and using conservative dosing. The policies obtained were found to be highly robust to model parameter changes and fluctuations not introduced during training.