30 May 2022 A data-centric reinforcement learning approach for self-updating machine learning models
Continuously Updating Reinforcement Learning (CURL) demonstrates the ability to rapidly maintain deployed machine learning (ML) models, with minimal effect on performance, when the use case changes (for example, a denied target). The traditional ML lifecycle requires models to be retrained and redeployed to maintain the performance of deployed models whose underlying data changes, as in data drift. Data drift covers a wide variety of changes, such as the addition of a new class, operation in an entirely new environment, mislabeled data, or subtle changes in targets over time. CURL deviates from this traditional lifecycle by performing dynamic updates: it uses reinforcement learning (RL) to identify and capture data changes and then automatically retrains the model on the changed data. CURL learns to identify changes in data through an RL policy designed to maximize the reward for identifying them. Specifically, CURL's RL approach includes an environment whose observation space consists of the model's performance and its current prediction confidence, a discrete action space for the agent, and a reward function equal to the model's accuracy minus the labeling cost. In our controlled experiment, the RL policy found the same distribution of denied-target data (3%), and the retrained model exceeded the initial classifier's performance. CURL can be considered a general-purpose technology that could be applied to a wide spectrum of fielded ML systems.
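The environment described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `LabelingEnv`, `Sample`, `SKIP`, `REQUEST_LABEL`, and `threshold_policy` are hypothetical, the two-action space is an assumption (the abstract only says the action space is discrete), and a fixed confidence threshold stands in for a learned RL policy. The sketch only shows how an observation of (model performance, prediction confidence) and a reward of accuracy minus labeling cost might fit together.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical discrete actions; the abstract does not name them.
SKIP, REQUEST_LABEL = 0, 1

Observation = Tuple[float, float]  # (running accuracy, current confidence)


@dataclass
class Sample:
    confidence: float  # classifier's confidence in its prediction
    correct: bool      # whether the prediction matches the true label


class LabelingEnv:
    """Toy environment: reward = running accuracy - labeling cost."""

    def __init__(self, samples: List[Sample], label_cost: float = 0.1):
        self.samples = samples
        self.label_cost = label_cost
        self.reset()

    def reset(self) -> Observation:
        self.index, self.seen, self.hits = 0, 0, 0
        self.flagged: List[int] = []  # samples captured for retraining
        return self._observe()

    def _accuracy(self) -> float:
        return self.hits / self.seen if self.seen else 1.0

    def _observe(self) -> Observation:
        return (self._accuracy(), self.samples[self.index].confidence)

    def step(self, action: int) -> Tuple[Optional[Observation], float, bool]:
        sample = self.samples[self.index]
        cost = 0.0
        if action == REQUEST_LABEL:
            # Assumption: a labeled sample is captured for retraining and
            # counted as handled correctly, at the price of the label cost.
            self.flagged.append(self.index)
            cost, correct = self.label_cost, True
        else:
            correct = sample.correct
        self.seen += 1
        self.hits += int(correct)
        reward = self._accuracy() - cost
        self.index += 1
        done = self.index >= len(self.samples)
        return (None if done else self._observe()), reward, done


def threshold_policy(obs: Observation) -> int:
    """Stand-in for a learned policy: label when confidence is low."""
    return REQUEST_LABEL if obs[1] < 0.5 else SKIP


def run_episode(env: LabelingEnv, policy: Callable[[Observation], int]) -> float:
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done = env.step(policy(obs))
        total += reward
    return total
```

In this toy setup, a policy that requests labels only for low-confidence samples pays the labeling cost on a small share of the data while keeping accuracy high, which mirrors the trade-off the reward function is described as encoding.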
© (2022) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mandy Sack, "A data-centric reinforcement learning approach for self-updating machine learning models", Proc. SPIE 12119, Open Architecture/Open Business Model Net-Centric Systems and Defense Transformation 2022, 121190H (30 May 2022).
Data modeling

Performance modeling

Process modeling

Machine learning

Systems modeling

Image classification

Optimization (mathematics)
