How to Simulate Realistic Survival Data? A Simulation Study to Compare Realistic Simulation Models

08/15/2023
by Maria Thurow, et al.

In statistics, it is important to have realistic data sets available for a particular context to allow an appropriate and objective method comparison. For many use cases, benchmark data sets for method comparison are already available online. However, in most medical applications, and especially for clinical trials in oncology, there is a lack of adequate benchmark data sets, as patient data can be sensitive and therefore cannot be published. Simulation studies are a potential solution. However, it is not always clear which simulation models are suitable for generating realistic data. A challenge is that potentially unrealistic assumptions have to be made about the distributions. Our approach is to use reconstructed benchmark data sets as a basis for the simulations, which has two advantages: the actual properties are known, and more realistic data can be simulated. There are several ways to simulate realistic data from benchmark data sets. We investigate simulation models based on kernel density estimation, fitted distributions, case resampling and conditional bootstrapping. In order to make recommendations on which models are best suited for a specific survival setting, we conducted a comparative simulation study. Since it is not possible to provide recommendations for all possible survival settings in a single paper, we focus on providing realistic simulation models for two-armed phase III lung cancer studies. To this end, we reconstructed benchmark data sets from recent studies. We used the runtime and different accuracy measures (effect sizes and p-values) as criteria for comparison.
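
To make two of the simulation models named in the abstract more concrete, the following is a minimal Python sketch of case resampling and of sampling from a kernel density estimate. It is an illustration under stated assumptions, not the authors' implementation: the toy records, the function names `case_resample` and `kde_sample_times`, the Gaussian kernel, the normal-reference bandwidth rule and the reflection at zero are all choices made here for the example and do not come from the paper.

```python
import random
import statistics

# Hypothetical toy records for one trial arm: (time in months, event indicator).
# These values are made up for illustration; they are not from any real study.
toy_arm = [(2.1, 1), (3.4, 1), (5.0, 0), (6.7, 1), (8.2, 1),
           (9.9, 0), (12.5, 1), (14.0, 1), (18.3, 0), (21.6, 1)]


def case_resample(records, rng):
    """Case resampling: draw len(records) whole records with replacement."""
    return [rng.choice(records) for _ in records]


def kde_sample_times(times, n, rng, bandwidth=None):
    """Draw n values from a Gaussian kernel density estimate of `times`.

    Sampling from a Gaussian KDE is equivalent to picking an observed value
    at random and adding normal noise with standard deviation `bandwidth`.
    """
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption of this sketch).
        bandwidth = 1.06 * statistics.stdev(times) * len(times) ** (-1 / 5)
    draws = []
    for _ in range(n):
        t = rng.choice(times) + rng.gauss(0.0, bandwidth)
        # Reflect at zero so that simulated survival times are non-negative.
        draws.append(abs(t))
    return draws


rng = random.Random(2023)  # fixed seed for reproducibility

# Model 1: resample complete (time, event) pairs.
resampled_arm = case_resample(toy_arm, rng)

# Model 2: simulate new event times from a KDE of the observed event times.
event_times = [t for t, event in toy_arm if event == 1]
simulated_times = kde_sample_times(event_times, n=20, rng=rng)

print(resampled_arm[:3])
print([round(t, 2) for t in simulated_times[:3]])
```

In a two-armed setting, each function would typically be applied to each arm separately so that arm sizes and arm-specific distributions are preserved. Case resampling can only reproduce observed values, whereas the KDE draw produces new, smoothed times; how censoring is handled in the KDE model is left open here and would need a separate modelling decision.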

