Linear regression with partially mismatched data: local search with theoretical guarantees

06/03/2021
by Rahul Mazumder, et al.

Linear regression is a fundamental modeling tool in statistics and related fields. In this paper, we study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. The combinatorial structure of the problem leads to computational challenges. We propose and study a simple greedy local search algorithm for this optimization problem that enjoys strong theoretical guarantees and appealing computational performance. We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and under certain assumptions on the problem data, our local search algorithm converges to a nearly optimal solution at a linear rate. In particular, in the noiseless case, our algorithm converges to the globally optimal solution at a linear rate. We also propose an approximate local search step that allows us to scale our approach to much larger instances. We conduct numerical experiments to gather further insights into our theoretical results and show promising performance gains compared to existing approaches.
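
To make the idea of a greedy local search over mismatches concrete, here is a minimal Python sketch. It is not the authors' algorithm as specified in the paper: the abstract does not state the objective, the neighborhood, or the constraint, so the sketch assumes a least-squares objective, a neighborhood made of pairwise swaps of responses, and a hard cap on the number of mismatched pairs. The names `fit_rss`, `greedy_swap_search`, and `max_mismatches` are hypothetical, and the brute-force scan over all swaps is for illustration only; it is not the approximate step mentioned above.

```python
import numpy as np


def fit_rss(X, y):
    """Ordinary least-squares fit of y on X; returns (beta, residual sum of squares)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, float(resid @ resid)


def greedy_swap_search(X, y, max_mismatches, max_iters=100, tol=1e-12):
    """Greedy local search over pairings (illustrative assumption, see lead-in).

    perm[i] is the index of the response currently paired with row i of X.
    Each iteration tries every pairwise swap, keeps the one that lowers the
    residual sum of squares the most, and stops when no swap improves it.
    """
    n = len(y)
    identity = np.arange(n)
    perm = identity.copy()
    _, best = fit_rss(X, y[perm])
    for _ in range(max_iters):
        best_swap, best_val = None, best
        for i in range(n - 1):
            for j in range(i + 1, n):
                cand = perm.copy()
                cand[i], cand[j] = cand[j], cand[i]
                # Assumed constraint: at most max_mismatches rows re-paired.
                if np.count_nonzero(cand != identity) > max_mismatches:
                    continue
                _, val = fit_rss(X, y[cand])
                if val < best_val - tol:
                    best_swap, best_val = (i, j), val
        if best_swap is None:
            break  # local optimum with respect to single swaps
        i, j = best_swap
        perm[i], perm[j] = perm[j], perm[i]
        best = best_val
    beta, best = fit_rss(X, y[perm])
    return beta, perm, best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 2))
    y = X @ np.array([1.5, -2.0])      # noiseless responses
    y[[0, 5]] = y[[5, 0]]              # mismatch two pairs of responses
    y[[10, 20]] = y[[20, 10]]
    beta, perm, rss = greedy_swap_search(X, y, max_mismatches=4)
    print("estimated coefficients:", beta)
    print("residual sum of squares:", rss)
```

Each accepted swap strictly lowers the residual sum of squares, so the loop terminates; whether it reaches the true pairing depends on conditions such as those the abstract alludes to, which this sketch does not check.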

research
10/15/2012

Local optima networks and the performance of iterated local search

Local Optima Networks (LONs) have been recently proposed as an alternati...
research
10/29/2020

A Local Search Framework for Experimental Design

We present a local search framework to design and analyze both combinato...
research
06/29/2020

Optimization Landscape of Tucker Decomposition

Tucker decomposition is a popular technique for many data analysis and m...
research
01/16/2014

Efficient Multi-Start Strategies for Local Search Algorithms

Local search algorithms applied to optimization problems often suffer fr...
research
09/22/2017

EB-GLS: An Improved Guided Local Search Based on the Big Valley Structure

Local search is a basic building block in memetic algorithms. Guided Loc...
research
08/14/2023

CausalLM is not optimal for in-context learning

Recent empirical evidence indicates that transformer based in-context le...
research
07/19/2022

Approximate c-Optimal Experimental Designs with Correlated Observations using Combinatorial Optimisation

We review the use of combinatorial optimisation algorithms to identify a...
