Building the HowTo100M Video Corpus

23 min • 19 August 2019

Video annotation is an expensive and time-consuming process. As a consequence, the available video datasets are useful but small. Machine-transcribed explainer videos offer a unique opportunity to rapidly develop a useful, if dirty, corpus of "self-annotating" videos, since hosts explain the actions they are taking on screen.

This episode is a discussion of the HowTo100M dataset, a project that has assembled a video corpus of 136M captioned video clips covering 23k activities.

Related Links

The paper will be presented at ICCV 2019

Antoine on GitHub

Antoine's homepage

