Jürgen Schmidhuber (Swiss AI Lab IDSIA)
http://www.idsia.ch/~juergen/
http://lifeboat.com/ex/bios.juergen.schmidhuber
https://plus.google.com/100849856540000067209/posts
http://en.wikipedia.org/wiki/J%C3%BCrgen_Schmidhuber
https://chessprogramming.wikispaces.com/J%C3%BCrgen+Schmidhuber
https://www.linkedin.com/pub/j%C3%BCrgen-schmidhuber/72/268/392?trk=biz_employee_pub
https://innsbigdata.wordpress.com/2015/02/09/interview-with-juergen-schmidhuber/
Schmidhuber, J. (2015).
Deep Learning in Neural Networks: An Overview. Neural Networks, 61, 85-117.
Abstract
In recent years, deep artificial neural networks (including recurrent ones) have won numerous contests in pattern recognition and machine learning. This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and Deep Learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (also recapitulating the history of backpropagation), unsupervised learning, reinforcement learning & evolutionary computation, and indirect search for short programs encoding deep and large networks.
Keywords
- Deep learning;
- Supervised learning;
- Unsupervised learning;
- Reinforcement learning;
- Evolutionary computation
SF ML meetup on August 11, 2014, at Upsight in San Francisco
DEEP LEARNING RESOURCES
GPU tech conf
https://www.youtube.com/watch?v=JSNZA8jVcm4
http://www.idsia.ch/~juergen/videos.html
http://www.kurzweilai.net/deep-learning-jurgen-schmidhuber-1
http://videolectures.net/jurgen_schmidhuber/
http://www.meetup.com/SF-Bayarea-Machine-Learning/events/198947462/
slides:
http://www.idsia.ch/~juergen/deep2014white.pdf
Jürgen Schmidhuber - Deep Learning and Artificial Intelligence
from Sep 12, 2014
https://www.youtube.com/watch?v=fam49iVeCqY
search: how to avoid local min neural network schmidhuber
https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=how+to+avoid+local+min+neural+network+schmidhuber
A tutorial on training recurrent neural networks, covering BPTT, RTRL, EKF and the "echo state network" approach
http://sourceforge.net/projects/rnnl/
RNN TUTORIAL
http://www.pdx.edu/sites/www.pdx.edu.sysc/files/Jaeger_TrainingRNNsTutorial.2005.pdf
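As a companion to the tutorial above, here is a minimal echo state network sketch in plain numpy. It is my own illustration (not code from the tutorial): a fixed random reservoir with spectral radius below 1 is driven by the input, and only the linear readout is fit by ridge regression.

import numpy as np

# Minimal echo state network (ESN) sketch: the recurrent weights are random
# and fixed; only the linear readout is trained, by ridge regression.
rng = np.random.default_rng(0)
n_in, n_res = 1, 200

W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
W = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # scale spectral radius to 0.9

def run_reservoir(u):
    """Collect reservoir states for an input sequence u of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: predict a phase-shifted copy of a sine wave.
t = np.linspace(0, 40, 2000)
u, y = np.sin(t)[:, None], np.sin(t + 0.3)

X, Y = run_reservoir(u)[200:], y[200:]          # discard the washout period
ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ Y)
print("train MSE:", np.mean((X @ W_out - Y) ** 2))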
A GENERAL METHOD FOR MULTI-AGENT REINFORCEMENT LEARNING IN UNRESTRICTED ENVIRONMENTS
http://deeplearning.net/
http://deeplearning.net/datasets/
DeepMind
Learning word embeddings efficiently with noise-contrastive estimation
Andriy Mnih (DeepMind Technologies, andriy@deepmind.com) and Koray Kavukcuoglu (DeepMind Technologies, koray@deepmind.com)
https://www.cs.toronto.edu/~amnih/papers/wordreps.pdf
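The paper's key idea is training the embeddings with noise-contrastive estimation rather than a full softmax. Below is a rough numpy sketch of the NCE objective for one (context, word) pair, using plain dot-product scores and a uniform noise distribution; this is my own simplification for illustration, not the authors' model or code.

import numpy as np

# Illustrative noise-contrastive estimation (NCE) loss for word embeddings:
# the model learns to distinguish a true (context, word) pair from k "noise"
# words drawn from a noise distribution. Scores are simple dot products.
rng = np.random.default_rng(0)
vocab, dim, k = 1000, 50, 5

target_vecs = rng.normal(scale=0.1, size=(vocab, dim))   # output embeddings
context_vecs = rng.normal(scale=0.1, size=(vocab, dim))  # input embeddings
unigram = np.full(vocab, 1.0 / vocab)                    # noise distribution P_n(w)

def nce_loss(context_id, word_id):
    c = context_vecs[context_id]
    noise_ids = rng.choice(vocab, size=k, p=unigram)

    def log_sigmoid(x):
        return -np.logaddexp(0.0, -x)

    # s(w, c) - log(k * P_n(w)) is the logit of "w is the true word, not noise".
    pos_logit = target_vecs[word_id] @ c - np.log(k * unigram[word_id])
    neg_logits = target_vecs[noise_ids] @ c - np.log(k * unigram[noise_ids])

    # Maximize the log-probability of labelling the true pair 1 and noise 0.
    return -(log_sigmoid(pos_logit) + log_sigmoid(-neg_logits).sum())

print("example NCE loss:", nce_loss(context_id=3, word_id=42))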
Tomas Mikolov
Facebook
https://research.fb.com/people/mikolov-tomas/
Latest Publications
Advances in Pre-Training Distributed Word Representations
Tomas Mikolov, Edouard Grave, Piotr Bojanowski, Christian Puhrsch, Armand Joulin
LREC 2018 - May 7, 2018
Efficient Large-Scale Multi-Modal Classification
Douwe Kiela, Edouard Grave, Armand Joulin, Tomas Mikolov
AAAI 2018 - February 2, 2018
Enriching Word Vectors with Subword Information
Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Alex Graves
I'm a CIFAR Junior Fellow supervised by Geoffrey Hinton in the Department of Computer Science at the University of Toronto.
email: graves@cs.toronto.edu.
Research Interests
- Recurrent neural networks (especially LSTM) http://people.idsia.ch/~juergen/lstm/
- Supervised sequence labelling (especially speech and handwriting recognition)
- Unsupervised sequence learning
http://www.cs.toronto.edu/~graves/
RNN toolkit
Alex Graves released a toolbox (RNNLIB): http://sourceforge.net/projects/rnnl/
Andrej Karpathy
Stanford
http://cs.stanford.edu/people/karpathy/
RNNLM and Convolutional NN
Andrej Karpathy blog
http://karpathy.github.io/
The Unreasonable Effectiveness of Recurrent Neural Networks (May 21, 2015)
http://karpathy.github.io/2015/05/21/rnn-effectiveness/
corresponding GitHub code
Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch
https://github.com/karpathy/char-rnn
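For intuition about what a character-level language model does, here is a tiny stand-in in numpy: a bigram count model with temperature sampling. It is deliberately not Karpathy's code; char-rnn replaces the count table with a multi-layer LSTM/GRU, but the vocabulary handling and sampling loop look much the same.

import numpy as np

# Toy character-level language model: a bigram count model with temperature
# sampling. char-rnn replaces the count table with a multi-layer LSTM/GRU,
# but the data preparation and sampling loop have the same shape.
text = "hello deep learning world. hello recurrent networks. "
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}

counts = np.ones((len(chars), len(chars)))          # add-one smoothing
for a, b in zip(text, text[1:]):
    counts[stoi[a], stoi[b]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)  # P(next char | current char)

def sample(start="h", length=60, temperature=1.0, seed=0):
    rng = np.random.default_rng(seed)
    out = [start]
    idx = stoi[start]
    for _ in range(length):
        p = probs[idx] ** (1.0 / temperature)
        p /= p.sum()
        idx = rng.choice(len(chars), p=p)
        out.append(itos[idx])
    return "".join(out)

print(sample())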
Deep Visual-Semantic Alignments for Generating Image Descriptions
http://cs.stanford.edu/people/karpathy/deepimagesent/
Thomas Breuel, Volkmar Frinken, Marcus Liwicki
LSTM RNN Tutorial 2013
Building Fast High-Performance Recognition Systems with Recurrent Neural Networks and LSTM
http://lstm.iupr.com/
Resources
For the tutorial slides, please go to the Files section
Recommended implementations:
RNNLIB - the original C++ library implementing LSTM and many of the ideas about LSTM
JANNlab - Java-based implementation of 1D and BLSTM, no CTC
OCRopus - Python-based implementation of 1D and BLSTM, with CTC (the implementation is in lstm.py; here is an example of using lstm.py).
For other implementations mentioned in the tutorial, please contact us.
Geoffrey Hinton
A. Krizhevsky, I. Sutskever, G. E. Hinton. Adv. Neural Inf. Process. Syst. 25, 1097–1105 (2012).
http://papers.nips.cc/paper/4824-imagenet-classification-w
Ilya Sutskever
RNN LSTM
Sequence to Sequence Learning with Neural Networks
Parallelization
A C++ implementation of deep LSTM with the configuration from the previous section on a single GPU processes at a speed of approximately 1,700 words per second. This was too slow for our purposes, so we parallelized our model using an 8-GPU machine. Each layer of the LSTM was executed on a different GPU and communicated its activations to the next GPU / layer as soon as they were computed. Our models have 4 layers of LSTMs, each of which resides on a separate GPU. The remaining 4 GPUs were used to parallelize the softmax, so each GPU was responsible for multiplying by a 1000 × 20000 matrix. The resulting implementation achieved a speed of 6,300 (both English and French) words per second with a minibatch size of 128. Training took about ten days with this implementation.
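A schematic numpy illustration of the softmax sharding described in that paragraph: the output projection is split column-wise into four blocks, one per GPU, and each shard multiplies the top hidden state by its own slice before the pieces are concatenated and normalized. The shards here are simulated in a single process with smaller dimensions than the paper's 1000 x 20000 per GPU; this is my own sketch, not the authors' implementation.

import numpy as np

# Simulated sharded softmax: each "GPU" owns a column block of the output
# projection and multiplies the top LSTM layer's output by its own slice.
# (The paper uses 4 shards of size 1000 x 20000; smaller sizes are used here.)
rng = np.random.default_rng(0)
hidden, shard_vocab, n_shards, batch = 256, 5000, 4, 8

h = rng.normal(size=(batch, hidden)).astype(np.float32)
W_shards = [rng.normal(scale=0.01, size=(hidden, shard_vocab)).astype(np.float32)
            for _ in range(n_shards)]

logit_shards = [h @ W for W in W_shards]              # per-shard matrix multiply
logits = np.concatenate(logit_shards, axis=1)         # gather: (batch, vocab)

logits -= logits.max(axis=1, keepdims=True)           # numerically stable softmax
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)
print(probs.shape, probs.sum(axis=1)[:3])             # (8, 20000), rows sum to 1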
Director of OpenAI
http://www.cs.toronto.edu/~ilya/
Yann LeCun
Y. LeCun, Y. Bengio, G. Hinton. Deep learning. Nature 521, 436–444 (2015).
Alex Graves
V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, D. Hassabis. Human-level control through deep reinforcement learning. Nature 518, 529–533 (2015).
Nando de Freitas
Feb 26, 2015: RNN with LSTM, YouTube lecture from the Oxford CS department
Karol Gregor (DeepMind, Google)
2015
DRAW: A Recurrent Neural Network For Image Generation
http://arxiv.org/pdf/1502.04623v2.pdf
Yoshua Bengio
Video - Deep Learning -- Yoshua Bengio (Part 1)
Tsvi Achler MD/PhD
http://reason.cs.uiuc.edu/tsvi/
CV
Tutorial Video
http://reason.cs.uiuc.edu/tsvi/TutorialVideo.html
Email: achler@gmail.com
Technical Video for Optimizing Mind
https://www.youtube.com/watch?v=w4aoQUxqlZg&feature=youtu.be
https://www.youtube.com/watch?v=9LJred8R7DY
Tsvi Achler: What is the brain doing different from machine learning algorithms?
http://www.meetup.com/Cognitive-Computing-Enthusiasts/events/226666265/
Tsvi Achler has a unique background focusing on the neural mechanisms of recognition from a multidisciplinary perspective. He has done extensive work in theory and simulations, human cognitive experiments, animal neurophysiology experiments, and clinical training. He has an applied engineering background: he received bachelor's degrees from UC Berkeley in Electrical Engineering and Computer Science, earned advanced degrees from the University of Illinois at Urbana-Champaign in Neuroscience (PhD) and Medicine (MD), and worked as a postdoc in Computer Science, at Los Alamos National Labs, and at IBM Research. He now heads his own startup, Optimizing Mind (http://optimizingmind.com/), whose goal is to provide the next generation of machine learning algorithms.
"The origin of phenomena observed in brain studies such as oscillations and a speed-accuracy tradeoff remain unclear. It also remains unclear how the brain can be computationally flexible (quickly learn, modify, and use new patterns as it encounters them from the environment), and recall (reason with or describe recognizable patterns from memory). I study the brain from multidisciplinary perspectives looking for a single, compact network that can display these phenomena and perform flexible recognition.
Virtually all popular models of the brain and algorithms of machine learning remain “feedforward” even though it has been clear since the early days that this may limit flexibility (and is not optimal for recall, symbolic reasoning, or analysis). Feedforward methods use optimized weights to perform recognition. In feedforward networks “uniqueness information” is encoded into weights based on the frequency of occurrence found in the training set. This requires optimizing weights over the whole training set.
Instead, I suggest uniqueness is estimated during recognition, by performing optimization on the current pattern that is being recognized. This is not optimization to learn weights, but optimization to perform recognition. Subsequently, only simple Hebbian-like relational learning is required during learning, without any uniqueness information. The weights are no longer "feedforward", but learning is more flexible and can be much faster (>>100x), especially for big data, since it does not require elaborate rehearsal. From a phenomenological perspective, the optimization during recognition displays general properties observed in brain and cognitive experiments, predicting oscillations, initial bursting with unrecognized patterns, and a speed-accuracy tradeoff.
I will compare computational and cognitive properties of both approaches and discuss the state of new research initiatives."
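A minimal sketch of the "optimization during recognition" idea as I read it (my own toy reconstruction, not Achler's algorithm): the weight matrix simply stores prototype patterns, as Hebbian-like association would, and recognition iteratively adjusts the activations for the current input to best explain it.

import numpy as np

# Sketch of "optimization during recognition": W holds stored patterns learned
# by simple association (each column is a prototype, no discriminative weight
# optimization). Recognition estimates non-negative activations y for the
# current input x by iteratively reducing ||x - W y||^2 (projected gradient).
W = np.array([[1.0, 1.0],     # feature "A" appears in both patterns
              [1.0, 0.0],     # feature "B" only in pattern 0
              [0.0, 1.0]])    # feature "C" only in pattern 1

def recognize(x, steps=200, lr=0.1):
    y = np.full(W.shape[1], 0.5)            # start with ambiguous activations
    for _ in range(steps):
        grad = W.T @ (W @ y - x)            # gradient of 0.5 * ||x - W y||^2
        y = np.clip(y - lr * grad, 0.0, None)
    return y

print(recognize(np.array([1.0, 1.0, 0.0])))   # mostly pattern 0
print(recognize(np.array([1.0, 0.0, 1.0])))   # mostly pattern 1
print(recognize(np.array([1.0, 0.5, 0.5])))   # ambiguous input: shared activity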
Ruslan Salakhutdinov
DEPARTMENT OF COMPUTER SCIENCE AND STATISTICS
http://www.cs.toronto.edu/~rsalakhu/
https://www.sciencemag.org/content/350/6266/1332.full
Science 11 December 2015:
Vol. 350 no. 6266 pp. 1332-1338
DOI: 10.1126/science.aab3050
RESEARCH ARTICLE
Human-level concept learning through probabilistic program induction
Brenden M. Lake (1,*), Ruslan Salakhutdinov (2), Joshua B. Tenenbaum (3)
(1) Center for Data Science, New York University, 726 Broadway, New York, NY 10003, USA.
(2) Department of Computer Science and Department of Statistics, University of Toronto, 6 King's College Road, Toronto, ON M5S 3G4, Canada.
(3) Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
(*) Corresponding author. E-mail: brenden{at}nyu.edu
Handwritten characters drawn by a model
Not only do children learn effortlessly, they do so quickly and with a remarkable ability to use what they have learned as the raw material for creating new stuff. Lake et al. describe a computational model that learns in a similar fashion and does so better than current deep learning algorithms. The model classifies, parses, and recreates handwritten characters, and can generate new letters of the alphabet that look “right” as judged by Turing-like tests of the model's output in comparison to what real humans produce.
In the Spring of 2016, I will be moving to the Machine Learning Department at Carnegie Mellon University. I am looking for strong PhD students, please apply to CMU if you are interested in working with me.
I am an assistant professor of Computer Science and Statistics at the University of Toronto. I work in the field of statistical machine learning (See my CV.) I received my PhD in computer science from the University of Toronto in 2009. After spending two post-doctoral years at MIT, I joined the University of Toronto in 2011.
My research interests include Deep Learning, Probabilistic Graphical Models, and Large-scale Optimization.
Prospective students: Please read this to ensure that I read your email.
Recent Research Highlights:
See our recent Deep Learning Tutorial in Montreal:
Part 1:[Slides (pdf)], [Video]
Part 2:[Slides (pdf)], [Video]
See our recent Deep Learning Tutorial at KDD 2014: [Video], [ Slides].
Check out our new website with demos and software.
I helped run the Thematic Program on Statistical Inference, Learning, and Big Data at the Fields Institute.
I am teaching an advanced Machine Learning course at the Fields Institute. Videos of my lectures will be available online. Also, check out Live Streaming of my course.
Adam Coates (Director, Baidu)
10 Billion Parameter Neural Networks in Your Basement
Papers:
Deep learning with COTS HPC systems
http://stanford.edu/~acoates/papers/CoatesHuvalWangWuNgCatanzaro_icml2013.pdf
Alex (Sandy) Pentland, MIT
http://www.theverge.com/2014/5/6/5661318/the-wizard-alex-pentland-father-of-the-wearable-computer
Prof. Michael Jordan, one of the authors of Latent Dirichlet Allocation, among other work:
August 20, 2014 at 18:30 Pacific, Yelp, SF:
http://www.meetup.com/sfmachinelearning/
recording
https://www.youtube.com/watch?v=zdavG9xbVp0&feature=youtu.be
http://www.meetup.com/SF-Bayarea-Machine-Learning/
AMPLab
TUPAQ
http://www.datasciencecentral.com/profiles/blogs/tupaq-automating-model-search-for-large-scale-machine-learning
Automating Model Search for Large Scale Machine Learning
Evan R. Sparks (UC Berkeley, sparks@cs.berkeley.edu), Ameet Talwalkar (UCLA, ameet@cs.ucla.edu), Daniel Haas (UC Berkeley, dhaas@cs.berkeley.edu), Michael J. Franklin (UC Berkeley, franklin@cs.berkeley.edu), Michael I. Jordan (UC Berkeley, jordan@cs.berkeley.edu), Tim Kraska (Brown University, tim kraska@brown.edu)
Abstract
The proliferation of massive datasets combined with the development of sophisticated analytical techniques has enabled a wide variety of novel applications such as improved product recommendations, automatic image tagging, and improved speech-driven interfaces. A major obstacle to supporting these predictive applications is the challenging and expensive process of identifying and training an appropriate predictive model. Recent efforts aiming to automate this process have focused on single node implementations and have assumed that model training itself is a black box, limiting their usefulness for applications driven by large-scale datasets. In this work, we build upon these recent efforts and propose an architecture for automatic machine learning at scale comprising a cost-based cluster resource allocation estimator, advanced hyperparameter tuning techniques, bandit resource allocation via runtime algorithm introspection, and physical optimization via batching and optimal resource allocation. The result is TUPAQ, a component of the MLbase system that automatically finds and trains models for a user's predictive application with comparable quality to those found using exhaustive strategies, but an order of magnitude more efficiently than the standard baseline approach. TUPAQ scales to models trained on terabytes of data across hundreds of machines.
http://www.datascienceassn.org/sites/default/files/Automating%20Model%20Search%20for%20Large%20Scale%20Machine%20Learning.pdf
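The "bandit resource allocation via runtime algorithm introspection" component can be illustrated with a generic successive-halving loop: give many candidate configurations a small training budget, drop the weaker half, and double the budget for the survivors. This is a simplified stand-in for the idea, not the TUPAQ code; the functions and parameters below are made up for the example.

import random

def train_and_score(config, budget):
    """Stand-in for partially training a model; returns a validation score.
    The 'true' quality is hidden in the config plus noise that shrinks as the
    budget grows, mimicking more reliable estimates with more training."""
    true_quality = 1.0 - abs(config["lr"] - 0.1)            # best lr ~ 0.1 in this toy
    noise = random.gauss(0, 0.3 / budget ** 0.5)
    return true_quality + noise

def successive_halving(configs, start_budget=1, rounds=4):
    budget = start_budget
    survivors = list(configs)
    for _ in range(rounds):
        scored = sorted(survivors,
                        key=lambda c: train_and_score(c, budget),
                        reverse=True)
        survivors = scored[: max(1, len(scored) // 2)]       # keep the better half
        budget *= 2                                          # double the budget
    return survivors[0]

random.seed(0)
candidates = [{"lr": 10 ** random.uniform(-3, 0)} for _ in range(16)]
print("selected config:", successive_halving(candidates))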
MLI: An API for Distributed Machine Learning
Evan R. Sparks, Ameet Talwalkar, Virginia Smith, Jey Kottalam, Xinghao Pan, Joseph Gonzalez, Michael J. Franklin, Michael I. Jordan, Tim Kraska
University of California, Berkeley; Brown University
AMPCAMP November 2014
http://ampcamp.berkeley.edu/5/?utm_source=AMP+Camp+Wait+List+and+Abandonded+Registrations&utm_campaign=ae6e8c94fd-AMP_Camp_5_Slides_and_video_12_10_2014&utm_medium=email&utm_term=0_8a10332e0b-ae6e8c94fd-215777105
DEEP LEARNING, PROBABILISTIC PROGRAMMING, PARALLEL LEARNING & MORE
http://www.next.ml/
Jeff Risberg - former Tibco executive, currently startup mentor and investor
spark and mllib training material
http://therisbergfamily.com/
https://github.com/JeffRisberg
Sebastian Thrun
http://robots.stanford.edu/
Prof. C.J. Lin:
"Large-scale linear classification: status and challenges"
2014-10-30
https://www.youtube.com/watch?v=GCIJP0cLSmU&feature=youtu.be
Richard Zemel
Professor
Dept. of Computer Science
University of Toronto
http://www.phoenixhollo.com/en/Zemel_1.html
Jeremy Howard
http://www.enlitic.com/
good website design - that's how it's done.
home page of Jeremy Howard, President and Chief Scientist of Kaggle, founder of FastMail.FM (sold to Opera in May 2010), and a co-founder of The Optimal Decisions Group (sold to ChoicePoint in Feb 2008).
http://jhoward.fastmail.fm.user.fm/
https://www.linkedin.com/profile/view?id=54272
Co-founder
The Optimal Decisions Group June 1999 – August 2008 (9 years 3 months)
I came up with the idea for Optimal Decisions Group (http://www.optimaldecisions.com) and worked with my university friend (and math guru) Bruce Davey to turn it into a business. The idea was to move insurance pricing from the risk-minimization approach used up until that time to a profit-maximization approach (incorporating price elasticity, competitor prices, multi-period simulations, and so forth). The idea turned out to work really well in practice, and Optimal Decisions built a strong presence in the US, UK, and Australia. After nearly 10 years of constant growth I sold the company to ChoicePoint. Today the product is sold as "LexisNexis Optimal Decisions Toolkit".
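The shift from risk-minimization to profit-maximization pricing can be shown with a toy one-period model (my own example with made-up numbers, not the Optimal Decisions method): expected profit is the probability the customer accepts the quote, which falls with price and depends on the competitor's price, times the margin over expected claims.

import numpy as np

# Toy illustration of profit-maximizing insurance pricing: expected profit per
# quote is take-up probability (a logistic demand curve centred on the
# competitor's price) times the margin over the expected claims cost.
expected_claims = 400.0      # risk-based cost estimate for this customer
competitor_price = 520.0
elasticity = 0.02            # how sharply take-up drops per currency unit of price

def take_up_probability(price):
    return 1.0 / (1.0 + np.exp(elasticity * (price - competitor_price)))

def expected_profit(price):
    return take_up_probability(price) * (price - expected_claims)

prices = np.linspace(400, 700, 301)
best = prices[np.argmax(expected_profit(prices))]
print(f"risk-based price: {expected_claims:.0f}")
print(f"profit-maximizing price: {best:.0f}, "
      f"take-up {take_up_probability(best):.2f}, "
      f"expected profit {expected_profit(best):.1f}")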
New medical startup with
Rebecca Weiss
https://www.linkedin.com/profile/view?id=16135206
We're looking to apply machine learning to medical diagnostics. Deep learning for medical imaging will be a key component.
"We're looking for additional data/product partners, healthcare advisors, and potential recruits with very strong applied numerical computing skills (particularly linear algebra, convex optimization, GPU programming, and computer vision)."
Datascience Journal Meetup Participants
Michael Rinehart
https://www.linkedin.com/profile/view?id=18146056
Principal Scientist at Elastica
Priya Desai
https://www.linkedin.com/profile/view?id=10882596
Data Scientist-Algorithms at Stanford University, School of Medicine
Arno Candel (0xdata)
http://www.slideshare.net/0xdata/deep-learning-through-examples
Bill MacCartney
http://nlp.stanford.edu/~wcmac/
Jure Leskovec
https://cs.stanford.edu/people/jure/pubs/
Steve Omohundro
http://steveomohundro.com/
http://possibilityresearch.com/
http://selfawaresystems.com/
James Kobielus, columnist, IBM
http://www.infoworld.com/author/James-Kobielus/
Dan Rice
Cognitive/Machine Learning Scientist - Rice Analytics/SkyRELR.com; Calculus of Thought (Elsevier: Academic Press, 2014)
Top Contributor
https://www.linkedin.com/groups/Machine-learning-When-data-scientists-35222.S.5859267227743711236
Root Cause Faster with Data Analytics - Webinar
Join Gary Brandt, HP Global IT Functional Architect, to learn how HP IT incorporates best operational practices to collect and analyze structured and unstructured data using big data analytics at enterprise scale.
Found in 30 minutes: How HP IT used Operations Analytics for rapid root cause analysis
http://h30499.www3.hp.com/t5/Business-Service-Management-BAC/Found-in-30-minutes-How-HP-IT-used-Operations-Analytics-for/ba-p/6574864#.VC4wzitdV9k
Awesome RNN
http://jiwonkim.org/awesome-rnn/
Code
Theano - Python
Simple IPython tutorial on Theano
Deep Learning Tutorials
RNN for semantic parsing of speech
LSTM network for sentiment analysis
Pylearn2 : Library that wraps a lot of models and training algorithms in deep learning
Blocks : modular framework that enables building neural network models
Keras : Theano-based deep learning library similar to Torch, but in Python
Lasagne : Lightweight library to build and train neural networks in Theano
theano-rnn by Graham Taylor
Passage : Library for text analysis with RNNs
Theano-Lights : Contains many generative models
Caffe - C++ with MATLAB/Python wrappers
LRCN by Jeff Donahue
Torch - Lua
char-rnn by Andrej Karpathy : multi-layer RNN/LSTM/GRU for training/sampling from character-level language models
LSTM by Wojciech Zaremba : Long Short Term Memory Units to train a language model on word level Penn Tree Bank dataset
Oxford by Nando de Freitas : Oxford Computer Science - Machine Learning 2015 Practicals
rnn by Nicholas Leonard : general library for implementing RNN, LSTM, BRNN and BLSTM (highly unit tested).
Etc.
Neon: new deep learning library in Python, with support for RNN/LSTM, and a fast image captioning model
Brainstorm: deep learning library in Python, developed by IDSIA, including various recurrent structures
Chainer : new, flexible deep learning library in Python
CGT(Computational Graph Toolkit) : replicates Theano's API, but with very short compilation time and multithreading
RNNLIB by Alex Graves : C++ based LSTM library
RNNLM by Tomas Mikolov : C++ based simple code
https://github.com/yandex/faster-rnnlm
faster-rnnlm by Yandex : C++ based rnnlm implementation aimed at handling huge datasets
neuraltalk by Andrej Karpathy : numpy-based RNN/LSTM implementation
gist by Andrej Karpathy : raw numpy code that implements an efficient batched LSTM
Recurrentjs by Andrej Karpathy : a beta javascript library for RNN
my search: RNN LSTM on GitHub
JAVA
Munich Ph.D. Java 2012-2014
BitBucket
https://bitbucket.org/dmonner/xlbp
http://www.cs.umd.edu/~dmonner/papers/nn2012.pdf
http://www.overcomplete.net/
XLBP README
Derek Monner, http://www.cs.umd.edu/~dmonner
XLBP stands for eXtensible Localized Back-Propagation. It is a toolkit for building neural networks for use with the LSTM-g training method, which is a generalized (-g) descendant of LSTM (the Long Short Term Memory) and of error back-propagation methods in general. It can build and train arbitrarily complex networks of neurons that can not only add but multiply inputs and save state across time.
For more information about LSTM-g, see the following paper (also available at the project website): D. Monner and J.A. Reggia (2012). A generalized LSTM-like training algorithm for second-order recurrent neural networks. Neural Networks, 25, pp 70-83. Available at http://www.cs.umd.edu/~dmonner/papers/nn2012.pdf
XLBP is released under the GNU General Public License, version 3. For more information on your rights and responsibilities under this license, see the file LICENSE.
INSTALLATION: This XLBP repository doubles as a valid Java project which you can import into the Eclipse IDE. This is the recommended way to compile and run XLBP. XLBP requires Java 6 or above.
USAGE: For a quick start on using XLBP for the most common applications, see the file "tutorial.pdf" in the top level of the source tree.
old
https://github.com/evolvingstuff/LongShortTermMemory
Java implementation of the old Alex Graves C++ RNN LSTM toolkit
http://deeplearning4j.org/recurrentnetwork.html
https://github.com/deeplearning4j/dl4j-0.4-examples/blob/master/src/main/java/org/deeplearning4j/examples/rnn/GravesLSTMCharModellingExample.java
CUDA enabled C++
CURRENNT
http://sourceforge.net/projects/currennt/
TORCH LUA
https://github.com/stanfordnlp/treelstm/tree/master/sentiment
Various
https://www.reddit.com/r/MachineLearning/comments/2j7ytz/whats_the_best_library_out_there_for/
Stat212b: Topics Course on Deep Learning
by Joan Bruna, UC Berkeley, Stats Department. Spring 2016.
Topics in Deep Learning
http://joanbruna.github.io/stat212b/
This topics course aims to present the mathematical, statistical and computational challenges of building stable representations for high-dimensional data, such as images and text. We will delve into selected topics of Deep Learning, discussing recent models from both supervised and unsupervised learning. Special emphasis will be on convolutional architectures, invariance learning, unsupervised learning and non-convex optimization.
Richard Socher
CS224D Lecture 7 - Introduction to TensorFlow (19 Apr 2016)
https://www.youtube.com/watch?v=L8Y2_Cq2X5s&feature=youtu.be
NVIDIA TensorRT
High performance deep learning inference for production deployment
https://developer.nvidia.com/tensorrt