Unsupervised Learning of Video Representations using LSTMs
Tags: Deep Learning (DL), Recurrent Neural Networks (RNN), Generative, Computer Vision
We use multilayer Long Short Term Memory (LSTM) networks to learn representations of video sequences. Our model uses an encoder LSTM to map an input sequence into a fixed length representation. This representation is decoded using single or multiple decoder LSTMs to perform different tasks, such as reconstructing the input sequence, or predicting the future sequence.
http://www.cs.toronto.edu/~nitish/unsupervised_video/
A test set for evaluating sequence prediction/reconstruction
Moving MNIST [782Mb] contains 10,000 sequences, each 20 frames long, showing two digits moving in a 64 x 64 frame.
The results in the updated arXiv paper are reported on this test set. For future prediction, the metric is the cross-entropy loss of predicting the last 10 frames of each sequence conditioned on the first 10 frames.
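To make the setup concrete, here is a minimal PyTorch sketch of the composite encoder-decoder idea described in the abstract above. It is an illustrative reimplementation, not the authors' released code (that is linked in the Code section below): one encoder LSTM compresses the observed frames into a fixed-length state, and two decoder LSTMs unroll from that state, one to reconstruct the input and one to predict future frames. The hidden size, the shared readout layer, and the zero start-of-sequence input are assumptions made for the example.

```python
import torch
import torch.nn as nn

class CompositeLSTM(nn.Module):
    """Sketch of the composite model: one encoder, two decoders."""
    def __init__(self, frame_size=64 * 64, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(frame_size, hidden, batch_first=True)
        self.recon_decoder = nn.LSTM(frame_size, hidden, batch_first=True)
        self.pred_decoder = nn.LSTM(frame_size, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, frame_size)  # assumed shared output layer

    def forward(self, frames, n_future=10):
        # frames: (batch, time, 64*64) flattened frames with pixel values in [0, 1]
        _, state = self.encoder(frames)               # fixed-length representation (h, c)
        start = frames.new_zeros(frames.size(0), 1, frames.size(2))

        def unroll(decoder, steps, state):
            outputs, inp = [], start
            for _ in range(steps):
                out, state = decoder(inp, state)
                frame = torch.sigmoid(self.readout(out))
                outputs.append(frame)
                inp = frame                           # feed each generated frame back in
            return torch.cat(outputs, dim=1)

        recon = unroll(self.recon_decoder, frames.size(1), state)   # reconstruct the input
        future = unroll(self.pred_decoder, n_future, state)         # predict the future
        return recon, future

# Toy usage mirroring the Moving MNIST protocol: condition on the first 10 frames,
# predict the last 10, and score with per-pixel cross-entropy.
model = CompositeLSTM()
seq = torch.rand(4, 20, 64 * 64)                      # stand-in for real sequences
recon, future = model(seq[:, :10], n_future=10)
loss = nn.functional.binary_cross_entropy(future, seq[:, 10:])
```

The point of the two-decoder design is that a single encoder state must be good enough both to reproduce what was seen and to extrapolate what comes next; the sketch keeps that structure while leaving all sizes and training details as placeholders.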
Code
unsup_video_lstm.tar.gz [119Kb]

Papers
Unsupervised Learning of Video Representations using LSTMs [pdf]
Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov
ICML 2015.
Updated arXiv version with more details:
Unsupervised Learning of Video Representations using LSTMs [arXiv]
Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov
Understanding LSTM Networks
Posted on August 27, 2015
Recurrent Neural Networks
Humans don’t start their thinking from scratch every second. As you read this essay, you understand each word based on your understanding of previous words. You don’t throw everything away and start thinking from scratch again. Your thoughts have persistence.
Traditional neural networks can’t do this, and it seems like a major shortcoming. For example, imagine you want to classify what kind of event is happening at every point in a movie. It’s unclear how a traditional neural network could use its reasoning about previous events in the film to inform later ones.
Recurrent neural networks address this issue. They are networks with loops in them, allowing information to persist.
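As a minimal illustration of the "loop" the passage describes, here is a sketch of a vanilla recurrent cell in NumPy. The shapes and names are made up for the example rather than taken from any particular library; the point is only that the hidden state h is carried forward from one step to the next, which is what lets information persist.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: the new hidden state mixes the current input with the
    previous hidden state -- the 'loop' that lets information persist."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

rng = np.random.default_rng(0)
W_xh = rng.normal(size=(8, 16))    # input-to-hidden weights (illustrative sizes)
W_hh = rng.normal(size=(16, 16))   # hidden-to-hidden weights
b_h = np.zeros(16)

h = np.zeros(16)                   # initial hidden state
for x_t in rng.normal(size=(5, 8)):  # a short input sequence of 5 steps
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```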
Deep Learning
Nando de Freitas
26 February 2015
YouTube Oxford CS lectures
https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu
Deep Learning Lecture 12: Recurrent Neural Nets and LSTMs
Published on Mar 2, 2015
Slides available at: https://www.cs.ox.ac.uk/people/nando....
Course taught in 2015 at the University of Oxford by Nando de Freitas with great help from Brendan Shillingford.
https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/
https://www.youtube.com/watch?v=56TYLaQN4N8
Deep Learning
Alex Graves
5 March 2015
YouTube Oxford CS lectures
https://www.youtube.com/watch?v=-yX1SYeDHbg
Ilya Sutskever's home page
www.cs.toronto.edu/~ilya/
Ilya Sutskever, Research Director of OpenAI. I spent three wonderful years as a Research Scientist at the Google Brain Team. Before that, I was a co-founder of ...
Recurrent Batch Normalization
http://arxiv.org/abs/1603.09025
Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville
(Submitted on 30 Mar 2016 (v1), last revised 4 Apr 2016 (this version, v3))
We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.
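A hedged sketch of the idea in the abstract, in PyTorch: batch normalization is applied to both the input-to-hidden and the hidden-to-hidden pre-activations of an LSTM cell. This is a simplified reading of the paper, not the authors' implementation; in particular it omits their per-time-step statistics and initialization details, and the layer sizes and gate ordering are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

class BNLSTMCell(nn.Module):
    """Sketch of an LSTM cell with batch-normalized transitions."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.W_x = nn.Linear(input_size, 4 * hidden_size, bias=False)
        self.W_h = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        self.bias = nn.Parameter(torch.zeros(4 * hidden_size))
        # Separate normalization of the input-to-hidden and hidden-to-hidden
        # transformations, plus the cell state before the output gate.
        self.bn_x = nn.BatchNorm1d(4 * hidden_size)
        self.bn_h = nn.BatchNorm1d(4 * hidden_size)
        self.bn_c = nn.BatchNorm1d(hidden_size)

    def forward(self, x_t, state):
        h, c = state
        gates = self.bn_x(self.W_x(x_t)) + self.bn_h(self.W_h(h)) + self.bias
        i, f, g, o = gates.chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(self.bn_c(c))
        return h, c

# Toy usage: step the cell over a short sequence.
cell = BNLSTMCell(32, 64)
x = torch.randn(8, 10, 32)            # (batch, time, features), illustrative sizes
h = c = torch.zeros(8, 64)
for t in range(x.size(1)):
    h, c = cell(x[:, t], (h, c))
```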
Ross Goodwin
http://rossgoodwin.com/
https://medium.com/@rossgoodwin/3505ae7a17e7
Adventures in Narrated Reality, Part II
Ongoing experiments in writing & machine intelligence
By Ross Goodwin
Due to the popularity of Adventures in Narrated Reality, Part I, I’ve decided to continue narrating my research concerning the creative potential of LSTM recurrent neural networks here on Medium. In this installment, I’ll begin by introducing a new short film: Sunspring, an End Cue film, directed by Oscar Sharp and starring Thomas Middleditch, created for the 2016 Sci-Fi London 48 Hour Film Challenge from a screenplay generated with an LSTM trained on science fiction screenplays.