Monte Carlo Policy Gradient Method for Binary Optimization

07/03/2023
by Cheng Chen, et al.

Binary optimization has a wide range of applications in combinatorial optimization, including MaxCut, MIMO detection, and MaxSAT. However, these problems are typically NP-hard because of the binary constraints. We develop a novel probabilistic model that samples binary solutions according to a parameterized policy distribution. Specifically, minimizing the KL divergence between the parameterized policy distribution and the Gibbs distribution of the function value leads to a stochastic optimization problem whose policy gradient can be derived explicitly, as in reinforcement learning. For coherent exploration in discrete spaces, parallel Markov chain Monte Carlo (MCMC) methods are employed to draw diverse samples from the policy distribution and to approximate the gradient efficiently. We further develop a filter scheme that replaces the original objective function with one that incorporates a local search technique, which broadens the view of the function landscape. Convergence of the policy gradient method to stationary points in expectation is established using a concentration inequality for MCMC. Numerical results show that this framework is very promising for providing near-optimal solutions to quite a few binary optimization problems.
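
The KL-based policy gradient described above can be illustrated with a minimal sketch on MaxCut. It is not the paper's implementation and rests on several assumptions: the policy is a mean-field Bernoulli distribution with logits `theta`, the Gibbs target is taken proportional to exp(-f(x)/lam) with f(x) equal to the negative cut value, samples are drawn independently from the policy rather than by parallel MCMC, and the filter (local search) step is omitted. All names (`cut_value`, `policy_gradient_maxcut`, `lam`, `lr`) are hypothetical.

```python
import numpy as np


def cut_value(W, x):
    """Cut weight for x in {-1,+1}^n; W is symmetric with zero diagonal."""
    return 0.25 * np.sum(W * (1.0 - np.outer(x, x)))


def policy_gradient_maxcut(W, lam=0.5, lr=0.1, n_samples=64, n_iters=200, seed=0):
    """Sketch: minimize KL(p_theta || Gibbs) with a score-function gradient.

    Assumptions (not from the paper): mean-field Bernoulli policy,
    independent sampling instead of MCMC, no local-search filter.
    """
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    theta = np.zeros(n)  # logits: P(x_i = +1) = sigmoid(theta_i)
    best_x, best_cut = None, -np.inf
    total_weight = W.sum()

    for _ in range(n_iters):
        p = 1.0 / (1.0 + np.exp(-theta))
        b = (rng.random((n_samples, n)) < p).astype(float)  # samples in {0,1}
        X = 2.0 * b - 1.0                                   # samples in {-1,+1}

        # f(x) = -cut(x), so minimizing f maximizes the cut.
        f = -0.25 * (total_weight - np.einsum("si,ij,sj->s", X, W, X))

        # log p_theta(x) and its gradient with respect to the logits.
        logp = np.sum(b * np.log(p) + (1.0 - b) * np.log(1.0 - p), axis=1)
        score = b - p

        # Gradient of KL(p_theta || q), q ∝ exp(-f/lam):
        #   E_p[(f(x)/lam + log p_theta(x)) * grad log p_theta(x)].
        signal = f / lam + logp
        signal = signal - signal.mean()  # sample-mean baseline to reduce variance
        grad = (signal[:, None] * score).mean(axis=0)

        # Gradient step; clipping keeps p away from 0 and 1 so the logs stay finite.
        theta = np.clip(theta - lr * grad, -10.0, 10.0)

        i = int(np.argmin(f))
        if -f[i] > best_cut:
            best_cut, best_x = float(-f[i]), X[i].copy()

    return best_x, best_cut


if __name__ == "__main__":
    # 4-cycle graph as a toy instance.
    W = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    x, c = policy_gradient_maxcut(W)
    print("assignment:", x, "cut:", c)
```

Under these assumptions, a smaller `lam` concentrates the Gibbs target on low values of f, while the `log p_theta` term acts as an entropy regularizer that keeps the policy from collapsing too early.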


