Survival stacking: casting survival analysis as a classification problem
While there are many well-developed data science methods for classification and regression, there are relatively few methods for working with right-censored data. Here, we present "survival stacking": a method for casting survival analysis problems as classification problems, thereby allowing the use of general classification methods and software in a survival setting. Inspired by the Cox partial likelihood, survival stacking collects features and outcomes of survival data in a large data frame with a binary outcome. We show that survival stacking with logistic regression is approximately equivalent to the Cox proportional hazards model. We further recommend methods for evaluating model performance in the survival stacked setting, and we illustrate survival stacking on real and simulated data. By reframing survival problems as classification problems, we make it possible for data scientists to use well-known learning algorithms (including random forests, gradient boosting machines and neural networks) in a survival setting, and lower the barrier for flexible survival modeling.
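As a rough illustration of the stacking step described in the abstract, the following Python sketch builds one block of rows per distinct event time, containing every subject still at risk at that time, with a binary outcome marking who failed then. The function name `survival_stack`, the toy data, the one-hot risk-set indicator columns, and the convention that a subject is at risk at time t whenever its observed time is at least t are assumptions made for illustration; the paper's exact construction may differ.

```python
from typing import List, Sequence, Tuple


def survival_stack(
    times: Sequence[float],
    events: Sequence[int],
    features: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int]]:
    """Hypothetical helper: turn right-censored data into a stacked binary dataset.

    times[i]    observed time for subject i (event or censoring time)
    events[i]   1 if subject i had the event at times[i], 0 if censored
    features[i] covariate vector for subject i
    """
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})

    X: List[List[float]] = []
    y: List[int] = []
    for k, t in enumerate(event_times):
        # One-hot indicator identifying which risk set this row belongs to.
        risk_set_indicator = [1.0 if j == k else 0.0 for j in range(len(event_times))]
        for time_i, event_i, x_i in zip(times, events, features):
            if time_i >= t:  # assumed convention: still at risk at time t
                X.append(list(x_i) + risk_set_indicator)
                # Outcome is 1 only for subjects who fail exactly at time t.
                y.append(1 if (event_i == 1 and time_i == t) else 0)
    return X, y


# Toy data (made up): four subjects, one covariate each; subject 1 is censored.
toy_times = [2.0, 3.0, 3.0, 5.0]
toy_events = [1, 0, 1, 1]
toy_features = [[0.5], [1.0], [-0.3], [2.0]]

X_stacked, y_stacked = survival_stack(toy_times, toy_events, toy_features)
for row, label in zip(X_stacked, y_stacked):
    print(row, label)
```

Under these assumptions, the stacked rows and labels could be passed to any binary classifier. With logistic regression, the abstract states that the result is approximately equivalent to the Cox proportional hazards model.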