Stochastic Optimization of Area Under Precision-Recall Curve for Deep Learning with Provable Convergence
The areas under the ROC curve (AUROC) and the precision-recall curve (AUPRC) are common metrics for evaluating classification performance on imbalanced problems. Compared with AUROC, AUPRC is a more appropriate metric for highly imbalanced datasets. While direct optimization of AUROC has been studied extensively, optimization of AUPRC has rarely been explored. In this work, we propose a principled technical method to optimize AUPRC for deep learning. Our approach is based on maximizing the average precision (AP), which is an unbiased point estimator of AUPRC. We show that the surrogate loss function for AP is highly non-convex and more complicated than that of AUROC. We cast the objective into a sum of dependent compositional functions, with inner functions dependent on random variables of the outer level. We propose efficient adaptive and non-adaptive stochastic algorithms with provable convergence guarantees under mild conditions by using recent advances in stochastic compositional optimization. Extensive experimental results on graph and image datasets demonstrate that our proposed method outperforms prior methods on imbalanced problems. To the best of our knowledge, our work represents the first attempt to optimize AUPRC with provable convergence.
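To make the AP objective concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The function average_precision computes the standard AP from scores and binary labels. The function surrogate_ap_loss illustrates one way such an objective can be written as a sum of compositional terms f(g_i), one per positive example, by replacing the ranking indicator with a squared hinge. The choice of squared hinge, the margin parameter, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
def average_precision(scores, labels):
    """Standard AP: mean over positives of precision at that positive's rank."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        raise ValueError("AP is undefined without positive examples")
    total = 0.0
    for i in positives:
        # Examples scored at least as high as positive i (includes i itself).
        ranked_above = [s for s in range(len(scores)) if scores[s] >= scores[i]]
        positives_above = sum(1 for s in ranked_above if labels[s] == 1)
        total += positives_above / len(ranked_above)
    return total / len(positives)


def surrogate_ap_loss(scores, labels, margin=1.0):
    """Illustrative smooth surrogate for -AP (assumed form, not from the abstract).

    The indicator [score_s >= score_i] is replaced by a squared hinge, so each
    positive i contributes f(g_i) = -g1 / g2, where g_i = (g1, g2) averages the
    surrogate over positives and over all examples respectively.
    """
    n = len(scores)
    positives = [i for i, y in enumerate(labels) if y == 1]
    if not positives:
        raise ValueError("loss is undefined without positive examples")
    total = 0.0
    for i in positives:
        # Inner function: depends on the outer index i, hence "dependent" terms.
        ell = [max(0.0, margin - (scores[i] - scores[s])) ** 2 for s in range(n)]
        g1 = sum(ell[s] for s in range(n) if labels[s] == 1) / n
        g2 = sum(ell) / n  # strictly positive: the term s == i equals margin**2
        total += -g1 / g2
    return total / len(positives)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.1]
    labels = [1, 0, 1, 0]
    print("AP:", average_precision(scores, labels))
    print("surrogate loss:", surrogate_ap_loss(scores, labels))
```

In this sketch every inner quantity g_i is an average over the whole dataset, which is why a plain minibatch gradient of the ratio would be biased and why a compositional stochastic method is a natural fit.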