Inference for relative sparsity

by   Samuel J. Weisenthal, et al.

In healthcare, there is much interest in estimating policies, or mappings from covariates to treatment decisions. Recently, there is also interest in constraining these estimated policies to the standard of care, which generated the observed data. A relative sparsity penalty was proposed to derive policies that have sparse, explainable differences from the standard of care, facilitating justification of the new policy. However, the developers of this penalty only considered estimation, not inference. Here, we develop inference for the relative sparsity objective function, because characterizing uncertainty is crucial to applications in medicine. Further, in the relative sparsity work, the authors only considered the single-stage decision case; here, we consider the more general, multi-stage case. Inference is difficult, because the relative sparsity objective depends on the unpenalized value function, which is unstable and has infinite estimands in the binary action case. Further, one must deal with a non-differentiable penalty. To tackle these issues, we nest a weighted Trust Region Policy Optimization function within a relative sparsity objective, implement an adaptive relative sparsity penalty, and propose a sample-splitting framework for post-selection inference. We study the asymptotic behavior of our proposed approaches, perform extensive simulations, and analyze a real, electronic health record dataset.


page 1

page 2

page 3

page 4


Relative Sparsity for Medical Decision Problems

Existing statistical methods can be used to estimate a policy, or a mapp...

Asymptotic Inference for Multi-Stage Stationary Treatment Policy with High Dimensional Features

Dynamic treatment rules or policies are a sequence of decision functions...

Queueing Network Controls via Deep Reinforcement Learning

Novel advanced policy gradient (APG) methods with conservative policy it...

Kinetic Energy Plus Penalty Functions for Sparse Estimation

In this paper we propose and study a family of sparsity-inducing penalty...

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

Offline reinforcement learning, wherein one uses off-policy data logged ...

A Generalized Proportionate-Type Normalized Subband Adaptive Filter

We show that a new design criterion, i.e., the least squares on subband ...

Please sign up or login with your details

Forgot password? Click here to reset