John Canny

Weakly Supervised Multimodal Feature Representation Learning from Video

A diagram of a machine learning model

Learning strong representation of video data is a challenging task involving not only visual, but auditory, linguistic, and temporal data. Learning such representations becomes even more challenging with the added data volume and processing requirements over traditional image-only representation learning. In order to maintain user privacy, and empower highly...

Low-Data Learning for Assistive Video Description

Automatic video captioning aims to train models to generate text descriptions for all segments in a video, however, the most effective approaches require large amounts of manual annotation which is slow and expensive. Unsupervised Learning, Semi Supervised Learning, and Active Learning are methods which are designed to improve the learning efficacy of models in low-data domains. In this project...