M. Pawan Kumar 

SMOOTH LOSS FUNCTIONS FOR DEEP TOP-K CLASSIFICATION
L. Berrada, A. Zisserman and M. Pawan Kumar
In Proceedings of the International Conference on Learning Representations (ICLR), 2018

The top-k error is a common measure of performance in machine learning and computer vision. In practice, top-k classification is typically performed with deep neural networks trained with the cross-entropy loss. Theoretical results indeed suggest that cross-entropy is an optimal learning objective for such a task in the limit of infinite data. In the context of limited and noisy data, however, the use of a loss function that is specifically designed for top-k classification can bring significant improvements. Our empirical evidence suggests that the loss function must be smooth and have non-sparse gradients in order to work well with deep neural networks. Consequently, we introduce a family of smoothed loss functions that are suited to top-k optimization via deep learning. The widely used cross-entropy is a special case of our family. Evaluating our smooth loss functions is computationally challenging: a naïve algorithm would require $O(\binom{n}{k})$ operations, where n is the number of classes. Thanks to a connection to polynomial algebra and a divide-and-conquer approach, we provide an algorithm with a time complexity of $O(kn)$. Furthermore, we present a novel approximation to obtain fast and stable algorithms on GPUs with single-precision floating point arithmetic. We compare the performance of the cross-entropy loss and our margin-based losses in various regimes of noise and data size, for the predominant use case of k = 5. Our investigation reveals that our loss is more robust to noise and overfitting than cross-entropy.
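To make the complexity gap concrete, the sketch below assumes that the "connection to polynomial algebra" refers to elementary symmetric polynomials: a sum of products over all k-subsets of n values is the k-th elementary symmetric polynomial of those values. The abstract does not spell this out, so treat it as an assumption. The function names `esp_bruteforce` and `esp_dp` are hypothetical, and the fast version uses a simple textbook recurrence rather than the divide-and-conquer algorithm of the paper. It also makes no attempt at the numerical stabilization needed for single precision on GPUs.

```python
from itertools import combinations
from math import prod


def esp_bruteforce(x, k):
    """Sum of products over every k-subset of x.

    Enumerates (n choose k) subsets, which matches the cost of the
    naive evaluation mentioned in the abstract.
    """
    return sum(prod(subset) for subset in combinations(x, k))


def esp_dp(x, k):
    """k-th elementary symmetric polynomial of x in O(kn) time.

    Illustrative recurrence only, not the paper's algorithm.
    e[j] holds the j-th elementary symmetric polynomial of the
    elements seen so far; e[0] is 1 by convention.
    """
    e = [1.0] + [0.0] * k
    for xi in x:
        # Update from high to low degree so that e[j - 1] still
        # refers to the value from before xi was included.
        for j in range(k, 0, -1):
            e[j] += xi * e[j - 1]
    return e[k]
```

Both functions return the same value on small inputs, for example on `[1, 2, 3, 4]` with `k = 2`, but only the second stays tractable when n is in the hundreds or thousands of classes.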