Sunday, May 17, 2015

NLP STARTUPS AND RESEARCH DIRECTIONS

TRANSFER LEARNING
10 Exciting Ideas of 2018 in NLP
https://t.co/AtEm5CxzVd
Sebastian Ruder
I'm a PhD student in Natural Language Processing and a research scientist at AYLIEN. I blog about Machine Learning, Deep Learning, and NLP.

The ten ideas of 2018 in #NLP:
1. Unsupervised #MT
2. Pretrained language models
3. Common sense inference #datasets
4. Meta-learning
5. Robust #unsupervised methods
6. Understanding representations
7. Clever auxiliary tasks
8. Combining semi-supervised learning with transfer learning
9. QA & reasoning with large documents
10. Inductive bias

NLP Startups Analyzing Twitter and Other Social Media Sources in Real Time

QUID

AUGMENTED INTELLIGENCE

Bob Goodson
https://www.crunchbase.com/person/bob-goodson#/entity
Sean Gourley
http://seangourley.com/about/
https://www.crunchbase.com/person/sean-gourley#/entity

HUMAN + MACHINE = AUGMENTED HUMAN INTELLIGENCE

Big Data and the Rise of Augmented Intelligence: Sean Gourley at TEDxAuckland
https://www.youtube.com/watch?v=mKZCa_ejbfg&feature=youtu.be


Published on Dec 5, 2012

Dr. Sean Gourley is the founder and CTO of Quid. He is a physicist by training and has studied the mathematical patterns of war and terrorism. This research has taken him all over the world, from the Pentagon to the United Nations and Iraq. Previously, Sean worked at NASA on self-repairing nano-circuits, and he is a two-time New Zealand track and field champion. Sean is now based in San Francisco, where he is building tools to augment human intelligence.

TEDxNewWallStreet - Sean Gourley - High frequency trading and the new algorithmic ecosystem
https://www.youtube.com/watch?v=V43a-KxLFcg
Published on Apr 12, 2012
Speaker Bio:
Dr. Sean Gourley is the founder and CTO of Quid. He is a physicist by training and has studied the mathematical patterns of war and terrorism. He is building tools to augment human intelligence.

Technologies:

WebGL
https://get.webgl.org/
Python
Spark



*********************************************************************************
The Startup That Helps You Analyze Twitter Chatter in Real Time
http://www.wired.com/2015/02/luminoso/
Luminoso
Compass works with Twitter out of the box, but it also comes with an API, or application programming interface, that lets you plug it into other online forums. And according to Havasi, it can train itself to search for relevant information.

With the tool, the company aims to compete with a long list of other text analytics companies, from the Chicago-based Networked Insights to Lexalytics and Clarabridge.

Right now, if businesses want to track a certain topic, an actual person must manually enter keywords they want to look for, while Compass can generate relevant keywords on the fly.
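The "keywords on the fly" idea is straightforward to prototype: instead of an analyst hand-entering search terms, surface the most salient terms from the message stream itself. A minimal sketch using scikit-learn's TF-IDF weighting (the corpus and method here are illustrative stand-ins, not Luminoso's actual algorithm):

# Surface candidate tracking keywords from a stream of messages,
# rather than having a person enter them by hand.
# Illustrative only -- not Compass's actual method.
from sklearn.feature_extraction.text import TfidfVectorizer

messages = [
    "My bank's new mobile app keeps crashing on login",
    "Love the redesigned mobile app, transfers are instant now",
    "Customer support put me on hold for an hour about a failed transfer",
]

vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
tfidf = vectorizer.fit_transform(messages)

# Rank terms by their summed TF-IDF weight across the stream.
scores = tfidf.sum(axis=0).A1
terms = vectorizer.get_feature_names_out()
print(sorted(zip(terms, scores), key=lambda t: -t[1])[:5])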

Meltwater is a business intelligence company of 1,000+ people spread across ~60 offices in ~30 countries, with over 26,000 clients. At Meltwater we see ourselves as an Outside Insight company, meaning we seek to deliver the same kind of business analytics and insights as traditional CRM dashboards and ERP systems, except that by leveraging data outside the firewall (social media, news, blogs, etc.) we believe the insights can be much more decisive and predictive for our clients' businesses. Part of the challenge, of course, is structuring the unstructured data out there. This is why the Data Science team at Meltwater has the mission to ingest, categorize, label, classify, and apply a whole range of other enrichments to the content we crawl, in order to index it properly in our big data architecture and make it available to our insights dashboard. We do these enrichments in 15+ languages.
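The enrichment mission described above (ingest, categorize, label, classify, index) maps naturally onto a pipeline of per-document annotators. A toy skeleton of that shape, with deliberately trivial dictionary-based stand-ins for the multilingual models a team like Meltwater's would actually train:

# Toy enrichment pipeline: each stage adds annotations to a document
# dict before indexing. The enrichers are trivial stand-ins.

def detect_language(doc):
    # Stand-in: a real system would use a trained language identifier.
    doc["lang"] = "en" if " the " in f' {doc["text"].lower()} ' else "unknown"
    return doc

def classify_topic(doc):
    topics = {"earnings": "finance", "election": "politics"}
    doc["topics"] = [v for k, v in topics.items() if k in doc["text"].lower()]
    return doc

def tag_sentiment(doc):
    positive, negative = {"beat", "record"}, {"miss", "fraud"}
    words = set(doc["text"].lower().split())
    doc["sentiment"] = len(words & positive) - len(words & negative)
    return doc

ENRICHERS = [detect_language, classify_topic, tag_sentiment]

def enrich(doc):
    for stage in ENRICHERS:
        doc = stage(doc)
    return doc  # now ready to be indexed

print(enrich({"text": "The company beat earnings estimates again"}))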

The second talk will be by Gregor Stewart of Basis Technology. It will be an example of Basis's adaptive tech -- Gregor's talk will complement Babak's!

Babak Rasolzadeh is the Director of Data Science at Meltwater and leads a team of 24 engineers. Prior to Meltwater, Babak was the co-founder of OculusAI, a computer-vision startup in Sweden that was sold to Meltwater in 2013. He holds a PhD in Computer Vision from KTH in Sweden and has worked on everything from self-driving cars to humanoid robots and mobile object recognition. He advises several startups in the US and Sweden.

Gregor Stewart is the VP of Product Management for Basis Technology, a multilingual text analytics company based in Cambridge, MA. Among other things, it delivers adaptable entity extraction and resolution components in Java for 17 languages. Currently, Gregor has the Basis teams hard at work readying a web API offering. Previously, Mr. Stewart was CTO of a storage services company and a strategy consultant. He has degrees from the University of Oxford and the London School of Economics, as well as a master's in NLP from the University of Edinburgh.
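Basis's own components are Java and proprietary, but the core task -- pulling entities out of raw text -- is easy to demonstrate. A minimal sketch using spaCy as a generic stand-in (assumes spaCy is installed and the small English model has been fetched with: python -m spacy download en_core_web_sm):

# Entity extraction sketch; spaCy here is a stand-in for
# Basis Technology's Java components, not their API.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Gregor Stewart of Basis Technology spoke in Cambridge, MA.")

for ent in doc.ents:
    print(ent.text, ent.label_)
# Typical output: "Gregor Stewart PERSON", "Basis Technology ORG",
# "Cambridge GPE", "MA GPE" (exact spans depend on the model version).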

RESEARCH DIRECTIONS

HERE'S WHAT WE CAN EXPECT FROM DEEP LEARNING IN 2016 AND BEYOND
By Sophie Curtis on December 29, 2015
https://re-work.co/blog/deep-learning-experts-discuss-the-next-5-years

NLP for Assessing Credibility of Scientific Papers

Assessing Credibility of Weblogs. Victoria L. Rubin and Elizabeth D. Liddy. School of Information Studies / Center for Natural Language Processing, Syracuse University, Syracuse, NY 13244-1190, USA. {vlrubin, liddy}@syr.edu
http://aaaipress.org/Papers/Symposia/Spring/2006/SS-06-03/SS06-03-038.pdf

excerpts:

The study will elicit and test credibility assessment factors (Phase I), perform NLP-based blog profiling (Phase II), and content-analyze blog-readers' comments for partial profile matching (Phase III).

Credibility is viewed as a perceived quality that is evaluated simultaneously with at least two major components: trustworthiness and expertise.

In this study we will explore how these distinctive features of blogs can be used beneficially for NLP and Machine Learning analysis to allow for automation of blog credibility assessment. Thus, the objectives of this study are: 1) to compile a list of factors that users take into account in credibility assessment of weblog sites; 2) to order these factors in terms of their perceived importance to users; and 3) to suggest which factors can be accessed and computed with NLP techniques.

Once the factors that contribute to blog credibility are compiled and tested, we can focus specific computational efforts on scanning large amounts of information for blogger profiling and automating credibility assessment.
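Objective 3 above -- identifying which credibility factors can actually be computed with NLP techniques -- is easy to sanity-check with surface features. A sketch of a few such cues (hedging, first-person stance, sourcing); the feature choices and word lists are illustrative assumptions, not the features proposed by Rubin and Liddy:

# Surface-level credibility cues a blog profiler might start from.
# Word lists and feature choices are illustrative assumptions only.
import re

HEDGES = {"maybe", "perhaps", "possibly", "allegedly", "reportedly"}
FIRST_PERSON = {"i", "we", "my", "our"}

def credibility_features(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)
    return {
        "hedge_rate": sum(t in HEDGES for t in tokens) / n,
        "first_person_rate": sum(t in FIRST_PERSON for t in tokens) / n,
        "link_count": len(re.findall(r"https?://\S+", text)),
        "exclamations": text.count("!"),
    }

post = "Allegedly the vendor knew. I saw it myself! Source: http://example.com"
print(credibility_features(post))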

Weblogs: Credibility and Collaboration in an Online World
http://people.ischool.berkeley.edu/~vanhouse/Van%20House%20trust%20workshop.pdf

Journalist versus news consumer: The perceived credibility of machine written news
http://compute-cuj.org/cj-2014/cj2014_session4_paper2.pdf

Credibility assessment and inference for fusion of hard and soft information
http://hrilab.tufts.edu/publications/premaratneetal12ahfe.pdf

Assessing Credibility with Natural Language Processing
https://books.google.com/books?id=BVlDAAAAQBAJ&pg=PA331&lpg=PA331&dq=Assessing+Credibility+with+Natural+language+processing



Attrasoft Launches New Automatic Image Tagging Service

http://atdc.org/news-from-our-companies/attrasoft-launches-new-automatic-image-tagging-service/

Google NLP research
http://research.google.com/pubs/NaturalLanguageProcessing.html

ICLR 2016 Best Papers Awards
http://www.iclr.cc/doku.php?id=iclr2016%3Amain#best_paper_awards

Neural Programmer-Interpreters

http://arxiv.org/abs/1511.06279
Scott Reed, Nando de Freitas
(Submitted on 19 Nov 2015 (v1), last revised 29 Feb 2016 (this version, v4))

We propose the neural programmer-interpreter (NPI): a recurrent and compositional neural network that learns to represent and execute programs. NPI has three learnable components: a task-agnostic recurrent core, a persistent key-value program memory, and domain-specific encoders that enable a single NPI to operate in multiple perceptually diverse environments with distinct affordances. By learning to compose lower-level programs to express higher-level programs, NPI reduces sample complexity and increases generalization ability compared to sequence-to-sequence LSTMs. The program memory allows efficient learning of additional tasks by building on existing programs. NPI can also harness the environment (e.g. a scratch pad with read-write pointers) to cache intermediate results of computation, lessening the long-term memory burden on recurrent hidden units. In this work we train the NPI with fully-supervised execution traces; each program has example sequences of calls to the immediate subprograms conditioned on the input. Rather than training on a huge number of relatively weak labels, NPI learns from a small number of rich examples. We demonstrate the capability of our model to learn several types of compositional programs: addition, sorting, and canonicalizing 3D models. Furthermore, a single NPI learns to execute these programs and all 21 associated subprograms.
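The compositional structure -- a core that looks programs up in a key-value memory, with higher-level programs expressed as calls to lower-level ones -- can be illustrated without any neural machinery. A toy, non-neural analogue (in NPI the core is a learned LSTM and programs are embeddings; nothing below is the actual model):

# Toy analogue of NPI's program memory and compositional calls.
# The environment dict acts as the scratch pad caching intermediate
# results; "add" composes the lower-level "add1" primitive.

PROGRAMS = {}

def program(fn):
    PROGRAMS[fn.__name__] = fn  # register in the "program memory"
    return fn

def call(name, env, *args):
    return PROGRAMS[name](env, *args)

@program
def add1(env, column):                 # low-level primitive
    total = env["a"][column] + env["b"][column] + env["carry"]
    env["out"][column] = total % 10    # write digit to scratch pad
    env["carry"] = total // 10

@program
def add(env):                          # higher-level program
    for column in reversed(range(len(env["a"]))):
        call("add1", env, column)      # composes the primitive

env = {"a": [2, 5, 7], "b": [4, 6, 8], "carry": 0, "out": [0, 0, 0]}
call("add", env)
print(env["out"])  # [7, 2, 5] -> 257 + 468 = 725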

Evolution Strategies as a Scalable Alternative to Reinforcement Learning

https://arxiv.org/abs/1703.03864
Tim Salimans, Jonathan Ho, Xi Chen, Ilya Sutskever
(Submitted on 10 Mar 2017)
We explore the use of Evolution Strategies, a class of black box optimization algorithms, as an alternative to popular RL techniques such as Q-learning and Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable solution strategy that scales extremely well with the number of CPUs available: By using hundreds to thousands of parallel workers, ES can solve 3D humanoid walking in 10 minutes and obtain competitive results on most Atari games after one hour of training time. In addition, we highlight several advantages of ES as a black box optimization technique: it is invariant to action frequency and delayed rewards, tolerant of extremely long horizons, and does not need temporal discounting or value function approximation.
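The update described here is remarkably compact: perturb the parameters with Gaussian noise, evaluate each perturbation's return, and move the parameters along the reward-weighted noise. A minimal numpy sketch on a toy objective (the quadratic "reward" and hyperparameters are stand-ins; the paper's experiments use full RL environments):

# Minimal Evolution Strategies loop (in the spirit of Salimans et al.).
# The quadratic reward is a toy stand-in for an RL episode return.
import numpy as np

def reward(theta):
    return -np.sum((theta - 3.0) ** 2)  # maximized at theta == 3

theta = np.zeros(5)
npop, sigma, alpha = 50, 0.1, 0.02
rng = np.random.default_rng(0)

for step in range(300):
    eps = rng.standard_normal((npop, theta.size))  # Gaussian perturbations
    returns = np.array([reward(theta + sigma * e) for e in eps])
    advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
    # Reward-weighted sum of the noise approximates the gradient.
    theta += alpha / (npop * sigma) * (eps.T @ advantages)

print(theta.round(2))  # approaches [3. 3. 3. 3. 3.]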
