Safely and Quickly Deploying New Features with a Staged Rollout Framework Using Sequential Test and Adaptive Experimental Design

05/25/2019
by Zhenyu Zhao, et al.

During the rapid development cycle for Internet products (websites and mobile apps), new features are developed and rolled out to users constantly. Features with code defects or design flaws can cause outages and significantly degrade the user experience. The traditional method of code review and change management can be time-consuming and error-prone. To make the feature rollout process safe and fast, this paper proposes a methodology for rolling out features in an automated way using an adaptive experimental design. Under this framework, a feature is gradually ramped up from a small proportion of users to a larger population based on real-time evaluation of important metrics. If a regression is detected during a ramp-up step, the ramp-up process stops and the feature developer is alerted. Two main algorithmic components power this framework: 1) a continuous monitoring algorithm, which uses a variant of the sequential probability ratio test (SPRT) to monitor the feature's performance metrics and alert feature developers when a metric degradation is detected, and 2) an automated ramp-up algorithm, which decides when and how to ramp up to the next stage with a larger sample size. This paper presents one monitoring algorithm and three ramp-up algorithms: time-based, power-based, and risk-based (a Bayesian approach) schedules. These algorithms are evaluated and compared on both simulated data and real data. The framework provides three benefits for feature rollout: 1) for defective features, it detects the regression early and reduces its negative effect, 2) for healthy features, it rolls out the feature quickly, and 3) it reduces the need for manual intervention by automating the feature rollout process.
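The interplay between the two components can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses the classic Wald SPRT on Gaussian data with known variance rather than the SPRT variant the paper refers to, and a fixed batch of observations per stage stands in for a simple time-based schedule. The names `sprt_decision` and `staged_rollout`, the stage shares, and all parameter values (`mu0`, `mu1`, `sigma`, `alpha`, `beta`, `batch_size`) are assumptions made for illustration.

```python
import math

def sprt_decision(observations, mu0=0.0, mu1=-0.5, sigma=1.0,
                  alpha=0.05, beta=0.05):
    """Classic Wald SPRT on the mean of Gaussian data with known sigma.

    H0: mean == mu0 (no regression); H1: mean == mu1 (regression).
    Each observation is assumed to be a normalized metric difference
    (treatment minus control). Returns (decision, samples_used).
    """
    upper = math.log((1 - beta) / alpha)  # crossing it accepts H1
    lower = math.log(beta / (1 - alpha))  # crossing it accepts H0
    llr = 0.0  # cumulative log-likelihood ratio log(f1 / f0)
    n = 0
    for x in observations:
        n += 1
        llr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2 * sigma ** 2)
        if llr >= upper:
            return "regression", n
        if llr <= lower:
            return "healthy", n
    return "continue", n

def staged_rollout(stages, sample_metric, batch_size=200):
    """Ramp through exposure shares, stopping at the first regression.

    `sample_metric(share)` is a hypothetical callback that returns one
    metric observation collected while `share` of users see the feature.
    """
    for share in stages:
        batch = [sample_metric(share) for _ in range(batch_size)]
        decision, n = sprt_decision(batch)
        if decision == "regression":
            # Stop ramping; a real system would also alert the developer.
            return {"status": "stopped", "share": share, "samples": n}
    return {"status": "fully_rolled_out", "share": stages[-1]}
```

For example, `staged_rollout([0.01, 0.05, 0.25, 1.0], sample_metric)` would halt at the 1% stage if the metric consistently sits at the regression level, so only a small share of users is exposed to a defective feature, while a healthy feature passes through every stage. The power-based and risk-based schedules described above would replace the fixed `batch_size` with a stage-specific stopping rule.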


