Deep Autoregressive Regression
In this work, we demonstrate that a major limitation of regression with a mean-squared error loss is its sensitivity to the scale of its targets. This sensitivity makes it challenging to learn in settings that consist of several subtasks with differently scaled targets, and forces algorithms to rely on task-specific learning rate tuning. A recently proposed alternative loss function, known as histogram loss, avoids this issue. However, its computational cost grows linearly with the number of buckets in the histogram, which renders prediction with real-valued targets intractable. To address this issue, we propose autoregressive regression, a novel approach to training deep learning models on real-valued regression targets that learns a high-fidelity distribution by utilizing an autoregressive target decomposition. We demonstrate that this training objective allows us to solve regression tasks involving multiple targets with different scales.
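The abstract does not spell out the target decomposition, so the following is only a minimal sketch of one way such a decomposition could look. It assumes targets normalized to [0, 1) and a fixed-base digit expansion; the names `decompose`, `reconstruct`, `base`, and `depth` are hypothetical and not taken from the paper. It illustrates why predicting digits one at a time avoids the linear cost of a flat histogram: the resolution is `base ** depth` buckets, but each prediction step only needs `base` outputs.

```python
def decompose(y, base=10, depth=4):
    """Split y in [0, 1) into `depth` digits, most significant first.

    Assumed scheme (not from the paper): y is assigned to one of
    base**depth equal-width buckets, and the bucket index is written
    out in the given base.
    """
    if not 0.0 <= y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    n_buckets = base ** depth
    index = min(int(y * n_buckets), n_buckets - 1)
    digits = []
    for _ in range(depth):
        index, digit = divmod(index, base)
        digits.append(digit)
    return digits[::-1]


def reconstruct(digits, base=10):
    """Map a digit sequence back to the centre of its bucket."""
    index = 0
    for digit in digits:
        index = index * base + digit
    return (index + 0.5) / base ** len(digits)


base, depth = 10, 4
digits = decompose(0.31415, base, depth)
approx = reconstruct(digits, base)

# A flat histogram at the same resolution needs one output per bucket,
# whereas a digit-by-digit model needs `base` outputs per step.
flat_outputs = base ** depth           # 10000
autoregressive_outputs = base * depth  # 40
```

In an autoregressive model, each digit would be predicted with a categorical distribution conditioned on the more significant digits already produced, so the coarse part of the target is resolved before the fine part. The reconstruction error of this sketch is at most half a bucket width, `0.5 * base ** -depth`.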