Classification of sparse binary vectors

by Evgenii Chzhen, et al.

In this work we consider the problem of multi-label classification, where each instance is associated with a binary vector. Our focus is on finding a classifier that minimizes false negative discoveries under constraints. Depending on the set of constraints considered, we propose plug-in methods and provide a non-asymptotic analysis under margin-type assumptions. Specifically, we analyze two examples of constraints that promote sparse predictions: in the first, we focus on classifiers with ℓ_0-type constraints, and in the second, we address classifiers with bounded false positive discoveries. The two formulations lead to different Bayes rules and, thus, to different plug-in approaches. The first scenario is the popular multi-label top-K procedure: a label is predicted to be relevant if its score is among the K largest ones. For this case, we provide an excess risk bound that achieves so-called "fast" rates of convergence under a generalization of the margin assumption to this setting. The second scenario differs significantly from the top-K setting, as the constraints are distribution dependent. We demonstrate that in this scenario almost sure control of false positive discoveries is impossible without extra assumptions. To alleviate this issue, we propose a sufficient condition for consistent estimation and provide a non-asymptotic upper bound.
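The top-K procedure described in the abstract can be illustrated with a minimal sketch. It assumes the per-label scores have already been estimated by some model; the function names `top_k_predict` and `false_negatives`, the tie-breaking rule, and the example numbers are all hypothetical and are not taken from the paper.

```python
def top_k_predict(scores, k):
    """Mark the k labels with the largest scores as relevant (1), the rest as 0.

    `scores` is a list of per-label scores (assumed here to be estimates
    produced by some upstream model). Ties are broken in favour of the
    lower label index, which is an arbitrary choice for this sketch.
    """
    if not 0 <= k <= len(scores):
        raise ValueError("k must be between 0 and the number of labels")
    # sorted() is stable, so equal scores keep their original index order.
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    chosen = set(ranked[:k])
    return [1 if j in chosen else 0 for j in range(len(scores))]


def false_negatives(y_true, y_pred):
    """Count relevant labels (1 in y_true) that the prediction missed."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)


# Hypothetical example: five labels, at most K = 2 predicted as relevant.
scores = [0.9, 0.1, 0.4, 0.7, 0.05]
y_true = [1, 0, 1, 1, 0]
y_pred = top_k_predict(scores, k=2)
missed = false_negatives(y_true, y_pred)
```

Because exactly K labels are set to 1, the prediction satisfies the ℓ_0-type sparsity constraint by construction, and the quantity left to control is the number of relevant labels that were not selected.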


On Generalizing the C-Bound to the Multiclass and Multi-label Settings

The C-bound, introduced in Lacasse et al., gives a tight upper bound on ...

Acknowledging the Unknown for Multi-label Learning with Single Positive Labels

Due to the difficulty of collecting exhaustive multi-label annotations, ...

Can Domain Knowledge Alleviate Adversarial Attacks in Multi-Label Classifiers?

Adversarial attacks on machine learning-based classifiers, along with de...

Skeptical binary inferences in multi-label problems with sets of probabilities

In this paper, we consider the problem of making distributionally robust...

Multi-label ensemble based on variable pairwise constraint projection

Multi-label classification has attracted an increasing amount of attenti...

Noisy Positive-Unlabeled Learning with Self-Training for Speculative Knowledge Graph Reasoning

This paper studies speculative reasoning task on real-world knowledge gr...

Speeding-up One-vs-All Training for Extreme Classification via Smart Initialization

In this paper we show that a simple, data dependent way of setting the i...
