CSE594: Video Analysis

Fall 2014, Tue Thu 11:30-12:50, Location: TBA
Instructor: Minh Hoai Nguyen, CS 2424, Phone: 631-632-8460
Office hours: Tue 4-5pm, Thu 4-6pm


Automatic video content analysis is crucial for a wide range of applications, from movie categorization and sports illustration to traffic monitoring and security enforcement. Depending on the application, video analysis requires performing one or more tasks such as tracking objects, recognizing people, understanding behaviors, detecting events, and predicting consequences. These tasks, however, are tremendously challenging for computers, and many applications still come with a string attached: “potential”.

The goal of this graduate seminar course is to give students a big-picture view of video content analysis. We will survey state-of-the-art performance and analyze what can and cannot be achieved at the moment. We will delve into technical details to understand the capabilities and limitations of existing algorithms. The central focus will be on representation and learning for visual tasks such as people detection and activity recognition, but we will also cover topics such as speech recognition and cinematic video production.

The course will consist of reading and presenting an eclectic mix of classic and recent papers on a range of topics. All students will be required to submit a written summary for each paper. Additionally, there will be a substantial class project during the semester.

This course is suitable for:

  • Computer vision students who want to get up to speed in video analysis

  • Machine learning students who are looking for applications of their algorithms

  • Graduate students hunting for a killer application for the next successful start-up


This course focuses on visual representation and learning. Students should already have basic knowledge about either machine learning or computer vision.

Topics (tentative)

  • Video basics

    • Formats, compression, etc.

    • Video capturing and production, cinematic techniques, rushes and edited materials

  • Basic computer vision algorithms for video analysis

    • Tracking

    • Optical flow estimation

    • Background subtraction

    • Depth estimation

  • Visual learning and recognition

    • People detection

    • Pose estimation

    • Action and activity recognition

    • Event (e.g., birthday party and wedding) recognition

    • Text spotting

  • Indexing and searching

    • Inverted file index

    • Approximate nearest neighbor search

    • Product quantization

  • Audio

    • Speech recognition

    • Laughter removal

  • Metadata

    • Subtitles, transcripts

    • Dynamic time warping, multimodal alignment

  • Visual effects

    • Intelligent video editing

    • Adding objects and visual effects


Grading

  • Class participation: 10%

  • Reading and preparation: 10%

  • Class presentation and evaluation: 30%

  • Final project: 50%


There is no required textbook for the course. Readings will be posted with the associated lectures.


The papers and reading list will be posted later.

Academic misconduct policy:

Don’t cheat. Cheating on anything will be dealt with as academic misconduct and handled accordingly. I won’t spend a lot of time trying to decide if you actually cheated. If I think cheating might have occurred, the evidence will be forwarded to the University’s Academic Misconduct Committee, and they will decide. If cheating has occurred, an F grade will be awarded. Discussion of assignments and projects is acceptable, but you must do your own work. Near-duplicate assignments will be considered cheating unless the assignment was restrictive enough to justify such similarities in independent work. Think of it this way: cheating impedes learning and having fun. The labs are meant to give you an opportunity to really understand the class material. Please also note that opportunity makes thieves: it is your responsibility to protect your work and to ensure that it is not turned in by anyone else. No excuses!

Disability note:

If you have a physical, psychological, medical or learning disability that may affect your ability to carry out assigned course work, I urge you to contact the staff in the Disabled Student Services office (DSS), Room 133 Humanities, 632-6748/TDD. DSS will review your concerns and determine, with you, what accommodations are necessary and appropriate. All information and documentation of disability is confidential.