Friday, December 11, 2015

Wider ML


Machine Learning papers on arXiv

Authors and titles for recent submissions
http://arxiv.org/list/stat.ML/recent

Avoiding Wireheading with Value Reinforcement Learning
Tom Everitt, Marcus Hutter

Abstract
How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) is a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent’s actions. The constraint is defined in terms of the agent’s belief distributions, and does not require an explicit specification of which actions constitute wireheading.

Keywords: AI safety, wireheading, self-delusion, value learning, reinforcement learning, artificial general intelligence
http://arxiv.org/pdf/1605.03143v1.pdf
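The central move in the abstract – treating the reward signal as evidence about a utility function rather than as the quantity to maximize – can be illustrated with a toy sketch. This is only an illustration of the idea, not the paper's formal construction: the candidate utility functions, the Gaussian noise model, and the toy dynamics below are all made-up assumptions.

```python
import numpy as np

# Toy illustration (not the paper's formal setup): the agent holds a belief
# over a small set of candidate utility functions and treats each observed
# reward as *evidence* about which utility is true, then acts to maximize
# expected utility under that belief rather than raw reward.

rng = np.random.default_rng(0)

n_states, n_actions = 4, 2
# Hypothetical candidate utility functions over states (assumption).
utilities = [rng.uniform(0, 1, size=n_states) for _ in range(3)]
belief = np.ones(len(utilities)) / len(utilities)   # prior over hypotheses

def likelihood(reward, state, u, noise=0.1):
    """P(reward | state, utility hypothesis u), assuming Gaussian noise."""
    return np.exp(-0.5 * ((reward - u[state]) / noise) ** 2)

def update_belief(belief, state, reward):
    """Bayesian update: the reward is evidence about the true utility."""
    post = belief * np.array([likelihood(reward, state, u) for u in utilities])
    return post / post.sum()

def choose_action(belief, transition):
    """Pick the action maximizing expected utility under the current belief."""
    expected_u = sum(b * u for b, u in zip(belief, utilities))
    return int(np.argmax(transition @ expected_u))

# transition[a, s'] = P(s' | action a) from a fixed start state (toy dynamics).
transition = rng.dirichlet(np.ones(n_states), size=n_actions)

state = 0
for _ in range(20):
    action = choose_action(belief, transition)
    state = rng.choice(n_states, p=transition[action])
    reward = utilities[1][state] + rng.normal(0, 0.1)  # hypothesis 1 is "true"
    belief = update_belief(belief, state, reward)

print("posterior over utility hypotheses:", np.round(belief, 3))
```

In this toy run the posterior concentrates on the hypothesis that generated the rewards; the paper's contribution is the additional constraint (stated over the agent's belief distributions) that removes the incentive to tamper with the reward channel in the first place.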

Wednesday, December 9, 2015

NIPS 2015

Advances in Neural Information Processing Systems 28 (NIPS 2015)

http://media.nips.cc/Conferences/2015/NIPS-2015-Conference-Book.pdf

https://nips.cc/Conferences/2015/Schedule
https://papers.nips.cc/

https://papers.nips.cc/book/advances-in-neural-information-processing-systems-28-2015

Grammar as a Foreign Language
Oriol Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, Geoffrey Hinton (Google)

Abstract
Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. As a result, the most accurate parsers are domain specific, complex, and inefficient. In this paper we show that the domain agnostic attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset, when trained on a large synthetic corpus that was annotated using existing parsers. It also matches the performance of standard parsers when trained only on a small human-annotated dataset, which shows that this model is highly data-efficient, in contrast to sequence-to-sequence models without the attention mechanism. Our parser is also fast, processing over a hundred sentences per second with an unoptimized CPU implementation.
https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf
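The trick that lets a generic sequence-to-sequence model do parsing is to linearize each constituency tree into a bracket sequence via a depth-first traversal, so the target is just another token sequence. Below is a minimal sketch of that linearization step; the tree encoding and symbols are illustrative assumptions, not the paper's exact scheme.

```python
from typing import List, Union

# A leaf is a str (POS tag); an internal node is [label, child, ...].
Tree = Union[str, list]

def linearize(tree: Tree) -> List[str]:
    """Depth-first linearization: '(LABEL ... )LABEL' for internal nodes."""
    if isinstance(tree, str):          # terminal / POS tag
        return [tree]
    label, children = tree[0], tree[1:]
    out = [f"({label}"]
    for child in children:
        out.extend(linearize(child))
    out.append(f"){label}")
    return out

# "John has a dog ." with POS tags as leaves (toy example)
tree = ["S",
        ["NP", "NNP"],
        ["VP", "VBZ", ["NP", "DT", "NN"]],
        "."]

print(" ".join(linearize(tree)))
# (S (NP NNP )NP (VP VBZ (NP DT NN )NP )VP . )S
```

Once trees are flattened this way, the attention-enhanced encoder-decoder reads the sentence and emits the bracket sequence token by token, exactly as it would for translation.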



https://papers.nips.cc/paper/5857-inferring-algorithmic-patterns-with-stack-augmented-recurrent-nets
Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
Armand Joulin, Tomas Mikolov (Facebook AI Research, 770 Broadway, New York, USA)
Abstract
Despite the recent achievements in machine learning, we are still very far from achieving real artificial intelligence. In this paper, we discuss the limitations of standard deep learning approaches and show that some of these limitations can be overcome by learning how to grow the complexity of a model in a structured way. Specifically, we study the simplest sequence prediction problems that are beyond the scope of what is learnable with standard recurrent networks, algorithmically generated sequences which can only be learned by models which have the capacity to count and to memorize sequences. We show that some basic algorithms can be learned from sequential data using a recurrent network associated with a trainable memory.
https://papers.nips.cc/paper/5857-inferring-algorithmic-patterns-with-stack-augmented-recurrent-nets.pdf
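The trainable memory the abstract refers to is a continuous stack: at each step the network mixes soft push, pop, and no-op actions so the whole structure stays differentiable and can be trained end to end. Below is a minimal numpy sketch of a single such cell; the layer sizes, initialization, and step API are assumptions for illustration rather than the authors' exact model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class StackRNNCell:
    """Toy recurrent cell that reads the stack top and updates a soft stack."""
    def __init__(self, n_in, n_hidden, stack_depth=16):
        self.Wx = rng.normal(0, 0.1, (n_hidden, n_in))
        self.Wh = rng.normal(0, 0.1, (n_hidden, n_hidden))
        self.Ws = rng.normal(0, 0.1, (n_hidden, 1))   # read top of stack
        self.Wa = rng.normal(0, 0.1, (3, n_hidden))   # push / pop / no-op weights
        self.Wd = rng.normal(0, 0.1, (1, n_hidden))   # value to push
        self.stack = np.zeros(stack_depth)

    def step(self, x, h):
        top = self.stack[0]
        h = np.tanh(self.Wx @ x + self.Wh @ h + (self.Ws * top).ravel())
        push, pop, noop = softmax(self.Wa @ h)
        d = np.tanh(self.Wd @ h).item()               # element to push
        pushed = np.concatenate(([d], self.stack[:-1]))
        popped = np.concatenate((self.stack[1:], [0.0]))
        # soft (differentiable) mixture of the three stack actions
        self.stack = push * pushed + pop * popped + noop * self.stack
        return h

cell = StackRNNCell(n_in=4, n_hidden=8)
h = np.zeros(8)
for t in range(5):
    h = cell.step(rng.normal(size=4), h)
print("top of stack after 5 steps:", round(cell.stack[0], 4))
```

The point of the soft action mixture is that gradients flow through the stack operations, which is what lets the model learn counting- and memorization-style patterns (e.g. a^n b^n) that plain recurrent networks struggle with.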