Real-time view suggestions for different stories across different aspect ratios
Abstract
Finding views with good photo composition is a challenging task for machine learning methods. A key difficulty is the lack of well-annotated, large-scale datasets: most existing datasets provide only a limited number of annotations for good views and ignore the comparative nature of view selection. In this work, we present the first large-scale Comparative Photo Composition dataset, which contains over one million comparative view pairs annotated with a cost-effective crowdsourcing workflow. We show that these comparative view annotations are essential for training a robust neural network model for composition. In addition, we propose a novel knowledge transfer framework to train a fast view proposal network, which runs at 75+ FPS and achieves state-of-the-art performance in image cropping and thumbnail generation on three benchmark datasets. The superiority of our method is further demonstrated in a user study on a challenging task, where it significantly outperforms the baseline methods in producing diverse, well-composed views.
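To give a rough sense of how comparative view pairs can supervise a view evaluation network, the sketch below trains a scoring model with a pairwise margin ranking loss. This is a minimal illustration under assumed names (ViewNet, pairwise_ranking_loss, the toy backbone and margin value), not the released implementation.

```python
# Hypothetical sketch: training a view evaluation network with a pairwise
# ranking (margin) loss over comparative view pairs. ViewNet, better_views,
# and worse_views are illustrative names, not from the released code.
import torch
import torch.nn as nn

class ViewNet(nn.Module):
    """Scores the composition quality of a cropped view (placeholder backbone)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.score = nn.Linear(32, 1)

    def forward(self, x):
        f = self.features(x).flatten(1)
        return self.score(f).squeeze(1)

def pairwise_ranking_loss(score_better, score_worse, margin=1.0):
    # Push the preferred view of each annotated pair to score higher than
    # the rejected view by at least `margin`.
    return torch.clamp(margin - (score_better - score_worse), min=0).mean()

# Example usage with random stand-in crops; each row is one comparative pair.
net = ViewNet()
better_views = torch.randn(8, 3, 224, 224)
worse_views = torch.randn(8, 3, 224, 224)
loss = pairwise_ranking_loss(net(better_views), net(worse_views))
loss.backward()
```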
Contributions
Two datasets with dense view comparisons   [See Resources]
A robust view evaluation network
A real-time view proposal network
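For context on what a single-pass view proposal network replaces, the sketch below shows the slow alternative: enumerate candidate crops at several aspect ratios and score each one with a composition scorer. The function names, grid settings, and toy scorer are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: exhaustive crop enumeration over aspect ratios,
# scored one by one. A real-time proposal network avoids this per-crop loop.
import itertools

def candidate_crops(img_w, img_h, aspect_ratios=(1.0, 4/3, 16/9),
                    scales=(0.5, 0.7, 0.9), stride=0.1):
    """Yield (x, y, w, h) crop boxes covering the image at the given aspect ratios."""
    for ar, scale in itertools.product(aspect_ratios, scales):
        w = img_w * scale
        h = w / ar
        if h > img_h:
            h = img_h * scale
            w = h * ar
        step_x, step_y = img_w * stride, img_h * stride
        x = 0.0
        while x + w <= img_w + 1e-6:
            y = 0.0
            while y + h <= img_h + 1e-6:
                yield (x, y, w, h)
                y += step_y
            x += step_x

def best_views(score_fn, img_w, img_h, top_k=3):
    """Return the top_k highest-scoring crop boxes according to score_fn."""
    boxes = list(candidate_crops(img_w, img_h))
    return sorted(boxes, key=score_fn, reverse=True)[:top_k]

# Example with a toy scorer that simply prefers centered crops.
def toy_score(box, img_w=1920, img_h=1080):
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    return -abs(cx - img_w / 2) - abs(cy - img_h / 2)

print(best_views(toy_score, 1920, 1080))
```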
Sample Results
(a): suggestions with a high overlap rate with the ground truth. (b): suggestions with a low overlap rate with the ground truth.
Good View Hunting: Learning Photo Composition from Dense View Pairs.
Wei, Z., Zhang, J., Shen, X., Lin, Z., Mech, R., Hoai, M., and Samaras, D. (2018)
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Bibtex
@inproceedings{wei2018good,
Author = {Zijun Wei and Jianming Zhang and Xiaohui Shen and Zhe Lin and Radomir Mech and Minh Hoai and Dimitris Samaras},
Booktitle = {Proceedings of IEEE Conference on Computer Vision and Pattern Recognition},
Title = {Good View Hunting: Learning Photo Composition from Dense View Pairs},
Year = {2018}
}
Acknowledgement
This project was partially supported by a gift from Adobe, NSF CNS-1718014, the Partner University Fund, and the SUNY2020 Infrastructure Transportation Security Center.