Squared Earth Mover's Distance Loss
For Training Deep Neural Networks on Ordered-Classes

Le Hou¹, Chen-Ping Yu², Dimitris Samaras¹
¹ Stony Brook University
² Phiar Technologies, Inc.

Abstract

In multi-class single-label classification, the loss function of a deep learning method compares the predicted class distribution with the ground-truth class distribution. The commonly used cross-entropy loss ignores the inter-class relationships that often exist in real-life tasks such as age classification. We propose to leverage these relationships by training deep nets with the exact squared Earth Mover's Distance (EMD, also known as the Wasserstein distance), assuming that the classes are ordered: all classes can be placed in a one-dimensional space such that the dissimilarities between classes are represented by the Euclidean distances between them. The EMD² loss uses the predicted probabilities of all classes and penalizes mispredictions according to the dissimilarities between classes. Our exact EMD² loss yields state-of-the-art results with limited computational overhead on age estimation and image aesthetics datasets.
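To make the idea concrete, here is a minimal sketch of a squared EMD between two distributions over ordered classes. It is not the released code. It assumes that the classes are equally spaced on a line and that both distributions carry the same total mass; under those assumptions the squared EMD reduces to the sum of squared differences between the two cumulative distributions. The function name `emd2_loss` and the example vectors are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def emd2_loss(predicted, target):
    """Squared EMD between two distributions over ordered classes.

    Assumption (not stated in the abstract): classes are equally spaced
    and both inputs sum to the same total mass, so the squared EMD is the
    sum of squared differences between the cumulative distributions.
    """
    if len(predicted) != len(target):
        raise ValueError("distributions must cover the same classes")
    cdf_pred = accumulate(predicted)   # running sum of predicted mass
    cdf_true = accumulate(target)      # running sum of ground-truth mass
    return sum((p - t) ** 2 for p, t in zip(cdf_pred, cdf_true))


# Ground truth: class index 2 out of 5 ordered classes (e.g. age groups).
target = [0, 0, 1, 0, 0]
near_miss = [0, 0, 0, 1, 0]   # all mass one class away from the truth
far_miss = [1, 0, 0, 0, 0]    # all mass two classes away from the truth

print(emd2_loss(target, target))     # identical distributions
print(emd2_loss(near_miss, target))  # smaller penalty
print(emd2_loss(far_miss, target))   # larger penalty
```

In this sketch the far miss is penalized more than the near miss, whereas cross-entropy would assign both the same loss because each puts zero probability on the true class.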

Code

Download

Acknowledgements

This work is supported by a gift from Adobe and the Partner University Fund 4DVision project.
If you have any questions, please email leDOThouATstonybrookDOTedu.