| Date | Topic | References | |
|---|---|---|---|
| 8/29 | Intro | ||
| 8/31 | Review: video features, activity analysis | ||
| 9/5 | NO CLASS (Labor Day) | ||
| 9/7 | Review: deep nets | ||
| Deep learning background | |||
| 9/12 | Aayush Bansal [project proposal due] | Generative Adversarial Networks, Auto-Encoding Variational Bayes | |
| 9/14 | Achal Dave | Beyond Short Snippets, Sports 1M | |
| Motion | |||
| 9/19 | James Supancic | Handcrafted local features are convolutional neural networks | |
| 9/21 | Peiyun Hu | Dynamic image networks for action recognition, Large Displacement Optical Flow, Deep Matching | |
| 9/26 | Rohit Girdhar | Temporal Segment Networks, Long Term Temporal Convolutions | |
| 9/28 | Michael Jaison Gnanasekar | Deep Learning for Detecting Multiple Space-Time Action Tubes in Videos, RCNN for Action Detection | |
| Tracking | |||
| 10/3 | Guest lecture by Peter Carr | reference | |
| 10/5 | Chengyang Li | Sequentially Training Convolutional Networks for Visual Tracking | |
| Bodies | |||
| 10/10 | Guest lecture by Katerina Fragkiadaki | Predictive Models of Billiards, Recurrent Networks for Human Dynamics, Iterative Error Feedback | |
| 10/12 | Jingyan Wang | Structural-RNN: Deep Learning on Spatio-Temporal Graphs | |
| Pedestrians | |||
| 10/17 | Yu Zhang | Learning complexity-aware cascades for deep pedestrian detection | |
| 10/19 | Syed Zahir Bokhari | Inferring 'Dark Matter' and 'Dark Energy' from Videos | |
| 10/24 | Mengxin Li | Convolutional Pose Machines | |
| 10/26 | Haoqi Fan | Action Recognition using Visual Attention | |
| Actions | |||
| 10/31 | NO CLASS | ||
| 11/2 | Guest lecture by Leonid Sigal | LSTMs for Activity Detection | |
| 11/7 | Andy Hou | Deep recurrent Q-learning for partially observable MDPs | |
| 11/9 | Guest lecture by Jeff Cohen | Facial Analysis, Depression Analysis | |
| Activities | |||
| 11/14 | Yi Shi | Multi-task Recurrent Neural Network for Immediacy Prediction | |
| 11/16 | Martin Li | MovieQA: Understanding Stories in Movies through Question-Answering, MSR-VTT: A Large Video Description Dataset for Bridging Video and Language | |
| 11/21 | Eric Huang | Animate Vision | |
| 11/23 | NO CLASS (Thanksgiving) | ||
| Intentions / goals | |||
| 11/28 | Ching-Haung Chen | Inferring the Why in Images, Predicting Motivations of Actions by Leveraging Text | |
| 11/30 | Olga Russakovsky | Cognitive introduction to intention: Movement, Action, Intention | |
| 12/5 | Guest lecture by LP Morency | Multimodal interactions: regression, action recognition | |
| 12/7 | Project presentations | ||