Max-Margin Early Event Detectors

Figure 1. How many frames do we need to detect a smile reliably? Can we even detect a smile before it finishes?

Abstract

The need for early detection of temporal events from sequential data arises in a wide spectrum of applications ranging from human-robot interaction to video security. While temporal event detection has been extensively studied, early detection is a relatively unexplored problem. This paper proposes a maximum-margin framework for training temporal event detectors to recognize partial events, enabling early detection. Our method is based on Structured Output SVM, but extends it to accommodate sequential data. Experiments on datasets of varying complexity, for detecting facial expressions, hand gestures, and human activities, demonstrate the benefits of our approach. To the best of our knowledge, this is the first paper in the computer vision literature to propose a learning formulation for early event detection.

Overview

Simulating the sequential arrival of training data 

Figure 2. We simulate the sequential arrival of training data and use partial events as positive training examples. We train a single event detector to recognize all partial events, but our method does more than augment the set of training examples.
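As a rough illustration of the simulation described in Figure 2, the sketch below enumerates the partial events of one annotated event. It assumes an event is given by inclusive frame indices [start, end] and that every truncation beginning at the onset counts as a partial event. The function name `partial_events` and the `min_len` parameter are hypothetical and are not taken from the released code. It shows only how the positive examples could be enumerated, not how the detector is trained on them.

```python
from typing import List, Tuple


def partial_events(start: int, end: int, min_len: int = 1) -> List[Tuple[int, int]]:
    """Enumerate truncated versions of an event spanning frames [start, end].

    Each returned pair (start, t) is the part of the event that would have
    been observed if the data stream had been cut off at frame t.
    Indices are inclusive; min_len is the shortest truncation to keep
    (an illustrative knob, assumed here for convenience).
    """
    if end < start:
        raise ValueError("end must not precede start")
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    first_cut = start + min_len - 1
    return [(start, t) for t in range(first_cut, end + 1)]


if __name__ == "__main__":
    # An event annotated from frame 3 to frame 6 yields four partial events;
    # the last one is the complete event.
    print(partial_events(3, 6))
```

Running the script prints `[(3, 3), (3, 4), (3, 5), (3, 6)]`. Each pair could then be used as a positive example at the time step where it ends.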

Monotonicity requirement 

Figure 3. Monotonicity requirement: the detection score of a partial event cannot exceed the score of an encompassing partial event. MMED (the max-margin early event detector proposed here) provides a principled mechanism to achieve this monotonicity, which cannot be assured by a naive solution that simply augments the set of training examples.
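To make the requirement in Figure 3 concrete, the sketch below checks it on the scores a trained detector assigns to a chain of nested partial events. It assumes the scores are listed from the shortest partial event to the complete event, so each entry belongs to an event that encompasses all earlier ones. The function name `monotonicity_violations` is hypothetical. This is a diagnostic on detector outputs only and does not reproduce the training formulation.

```python
from typing import List, Sequence


def monotonicity_violations(scores: Sequence[float]) -> List[int]:
    """Return the indices i where scores[i] > scores[i + 1].

    scores[i] is the detection score of the i-th partial event in a nested
    chain ordered from shortest to longest. Because the chain is nested,
    checking adjacent pairs is enough: if no adjacent pair is violated,
    no partial event outscores any event that encompasses it.
    """
    return [i for i in range(len(scores) - 1) if scores[i] > scores[i + 1]]


if __name__ == "__main__":
    print(monotonicity_violations([0.1, 0.4, 0.4, 0.9]))  # non-decreasing chain
    print(monotonicity_violations([0.1, 0.5, 0.3, 0.9]))  # dip after index 1
```

The first call prints `[]` and the second prints `[1]`, flagging the partial event whose score is higher than that of the next, longer one.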

Results

Detecting Disgust. See video

Detecting Fear. See video

Figure 4. From left to right: the onset frame, the frame at which MMED fires, the frame at which SOSVM (Structured Output SVM) fires, and the peak frame.

People

Minh Hoai Nguyen and Fernando De la Torre

Publications

  • Max-Margin Early Event Detectors.
    Hoai, M. & De la Torre, F. (2014)
    International Journal of Computer Vision, 107(2), 191–202. Paper BibTex.

  • Max-Margin Early Event Detectors.
    Hoai, M. & De la Torre, F. (2012)
    Proceedings of IEEE Conference on Computer Vision and Pattern Recognition. Paper BibTex.

Code

Download mmed-release-0.1 and follow the instructions in ./src/README.html.

Copyright notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.

Disclaimer

This software is provided for research purposes only. Usage for sale or commercial purposes is strictly prohibited. If you want to license the software, please contact innovation@cmu.edu at the technology transfer office at Carnegie Mellon University. This is EXPERIMENTAL software. Use at your own risk. No warranty is implied by this distribution. Please report bugs to the authors.