Saturday, May 24, 2014

Effective Scala

Marius Eriksen, Twitter Inc.
marius@twitter.com (@marius)

http://twitter.github.io/effectivescala/

http://monkey.org/~marius/




ML SOFTWARE
Machine Learning Software
http://sourceforge.net/directory/science-engineering/ai/machinelearning/os:mac/freshness:recently-updated/

Weka 3: Data Mining Software in Java

http://www.cs.waikato.ac.nz/ml/weka/

Naive Bayes classifier

http://en.wikipedia.org/wiki/Naive_Bayes_classifier




stackoverflow questions tagged 'algorithm'

http://stackoverflow.com/questions/tagged/algorithm
22 LinkedIn Secrets LinkedIn Won't Tell You
 


 

Friday, May 23, 2014

 
Apache DirectMemory is an off-heap cache for the Java Virtual Machine.
Last release 0.2 - 2013-09-17. 

http://directmemory.apache.org/

Thursday, May 22, 2014


Java EE version history

http://en.wikipedia.org/wiki/Java_EE_version_history


Machine Learning


https://www.coursera.org/course/ml

COURSERA




Algorithms: Design and Analysis, Part 1


https://www.coursera.org/course/algo

http://www.geeksforgeeks.org/counting-inversions/
http://stackoverflow.com/questions/337664/counting-inversions-in-an-array

https://class.coursera.org/algo-006/wiki/Syllabus



Algorithms: Design and Analysis, Part 2


https://www.coursera.org/course/algo2



Algorithms, Part I


https://www.coursera.org/course/algs4partI



Algorithms, Part II


https://www.coursera.org/course/algs4partII


Analysis of Algorithms

https://www.coursera.org/course/aofa


Sunday, May 18, 2014

ÜberConf Featured Sessions

June 24-27 2014

Docker for Developers

Matt Stine

"Docker is an open-source engine that automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere." Docker creates containers that provide each running process with:
  • an equal slice of CPU
  • a maximum memory quota
  • its own process ID (PID) namespace
  • its own network interface
  • its own private root filesystem
It does this by leveraging low-level Linux kernel primitives like cgroups and namespaces. The end result is a portable application container that can run anywhere Docker can run, including on VMs, bare-metal servers, OpenStack clusters, public instances, or combinations of the above.

Angular Workshop

Raju Gandhi

Angular is a new JavaScript framework from Google. If you are looking into developing rich web applications, Angular is your friend. Angular embraces HTML and CSS, allowing you to extend HTML towards your application, and uses plain JavaScript, which makes your code easy to reuse and test. In this workshop we will start from the ground up and build our way through a simple application that will let us explore the various constructs and familiarize ourselves with some of the new terminology in Angular.

Leading Technical Change

Nathaniel Schutta

Technology changes; it's a fact of life. And while many developers are attracted to the challenge of change, many organizations do a particularly poor job of adapting. We've all worked on projects with, ahem, less than new technologies even though newer approaches would better serve the business. But how do we convince those holding the purse strings to pony up the cash when things are "working" today? At a personal level, how do we keep up with the change in our industry?

Scaling Agile Teams

Esther Derby

Agile methods depend on effective cross-functional teams. We’ve heard many Agile success stories…at the team level. But what happens when a product can’t be delivered by one team? What do you do when the “team” that’s needed to work on a particular product is 20 people? Or 20 teams? One response is to create a coordinating role, decompose work, or add layers of hierarchy. Those solutions introduce overhead and often slow down decision making. There are other options to link teams, and ensure communication and integration across many teams. There are no simple answers. But there are design principles for defining workable arrangements when the product is bigger than a handful of agile teams.

From Groovy To Java 8

Dan Woods

Java 8 sports the latest and greatest features of the JVM platform, and introduces new concepts of asynchronous programming and Lambdas, amongst other syntactic improvements to the language. Many of the language's upcoming features are concepts and constructs that Groovy programmers have been familiar with for years.

Java EE 7 Hands On Lab

Arun Gupta

The Java EE 7 platform focuses on Boosting Productivity and Embracing HTML5. JAX-RS 2 adds a new Client API to invoke RESTful endpoints. JMS 2 is undergoing a complete overhaul to align with improvements in the Java language. The long-awaited Batch Processing API and Concurrency Utilities are being added to make the platform richer. A new API to build WebSocket-driven applications is being added. JSON parsing and generation are now included in the platform itself. JavaServer Faces will add support for HTML5 forms. There are several other improvements coming in this latest version of the platform. Together these APIs will allow you to be more productive by simplifying enterprise development.
ÜberConf offers over 150 sessions, including 10 full-day workshop options and 25 half-day workshops. ÜberConf workshops are hands-on coding sessions. Be prepared!



Adam Gibson
 
Founder at Blix.io (Article Analytics and News Monitoring)
Machine Learning Instructor at Zipfian Academy

Zipfian Academy
http://www.zipfianacademy.com

Thursday, May 15, 2014

Saturday, May 3, 2014

Thursday, May 1, 2014

Unsupervised Learning and Multinomial Logistic Regression with Apache Spark

Thursday, May 1, 2014
6:30 PM to 9:30 PM
Alpine Data Labs
1550 Bryant Street, San Francisco, CA
http://www.meetup.com/sfmachinelearning/events/176105932/
Recording on
http://www.hakkalabs.co/

http://www.slideshare.net/dbtsai/unsupervised-learning-with-apache-spark
http://www.slideshare.net/dbtsai/2014-0501-mlor


This is the second event in this series talking about Machine Learning with Spark!

It's our pleasure to have two speakers at this event. Sandy Ryza from Cloudera will give a talk about unsupervised learning with Spark. DB Tsai from Alpine Data Labs will talk about multinomial logistic regression using the L-BFGS optimizer with Spark.



Part 1 - Sandy Ryza:

Unsupervised learning refers to a branch of algorithms that try to find structure in unlabeled data. Clustering algorithms, for example, try to partition elements of a dataset into related groups. Dimensionality reduction algorithms search for a simpler representation of a dataset. Spark's MLlib module contains implementations of several unsupervised learning algorithms that scale to huge datasets. In this talk, we'll dive into uses and implementations of Spark's K-means clustering and Singular Value Decomposition (SVD).
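
To make the clustering idea concrete, here is a minimal single-machine sketch of the standard K-means iteration (assign each point to its nearest center, then move each center to the mean of its points). It is plain Python, not Spark's MLlib API and not the code from the talk; the function names and the sample points are made up for illustration, and the initial centers are passed in explicitly so the run is reproducible, whereas real implementations usually pick them with some random seeding scheme.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, initial_centers, max_iterations=20):
    """Hypothetical helper: run the basic K-means loop and return the centers."""
    centers = list(initial_centers)
    k = len(centers)
    for _ in range(max_iterations):
        # Assignment step: put every point in the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            mean_point(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable, so the loop has converged
        centers = new_centers
    return centers


# Two visually separate groups of 2-D points (illustrative data only).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers = kmeans(points, initial_centers=[(1.0, 1.0), (8.0, 8.0)])
print(centers)
```

Both steps are independent per point, which is presumably what makes the algorithm a good fit for a distributed engine: the assignment and the per-cluster sums can be computed on each partition and then combined.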

Part 2 - DB Tsai:

Logistic regression can be used to model not only binary outcomes but, with some extension, multinomial outcomes as well. In this talk, DB will walk through the basic idea of binary logistic regression step by step and then extend it to the multinomial case. He will show how easy it is with Spark to parallelize this iterative algorithm by using the in-memory RDD cache to scale horizontally (in the number of training examples). However, there is a mathematical limitation on scaling vertically (in the number of training features), and many recent applications in document classification and computational linguistics are of this type. He will talk about how to address this problem by using the L-BFGS optimizer instead of the Newton optimizer.
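
As a rough illustration of the multinomial extension, the sketch below computes the softmax cross-entropy loss and its gradient in plain Python and minimizes it with ordinary gradient descent. This is an assumption-laden toy, not the talk's code and not Spark's API: the function names, the step size, and the tiny dataset are invented, and plain gradient descent stands in for L-BFGS. The relevant point is that an L-BFGS-style optimizer needs only this same loss and gradient pair, while Newton's method would additionally need a Hessian whose size grows quadratically with the number of features.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """weights holds one weight vector per class; x is a feature tuple."""
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    return softmax(scores)


def loss_and_gradient(weights, data):
    """Average cross-entropy loss and its gradient over (features, label) pairs."""
    k, d = len(weights), len(weights[0])
    loss = 0.0
    grad = [[0.0] * d for _ in range(k)]
    for x, y in data:
        probs = predict_proba(weights, x)
        loss -= math.log(probs[y])
        for c in range(k):
            # Gradient of the loss w.r.t. class c's score: p_c - 1[y == c].
            err = probs[c] - (1.0 if c == y else 0.0)
            for j in range(d):
                grad[c][j] += err * x[j]
    n = len(data)
    return loss / n, [[g / n for g in row] for row in grad]


def train(data, num_classes, num_features, step=0.5, iterations=200):
    """Hypothetical trainer: plain gradient descent from all-zero weights."""
    weights = [[0.0] * num_features for _ in range(num_classes)]
    for _ in range(iterations):
        _, grad = loss_and_gradient(weights, data)
        weights = [
            [w - step * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grad)
        ]
    return weights


# Illustrative data: each x is (bias term, one feature); labels are 0, 1, or 2.
data = [
    ((1.0, -1.0), 0), ((1.0, -0.8), 0),
    ((1.0, 0.0), 1), ((1.0, 0.1), 1),
    ((1.0, 0.9), 2), ((1.0, 1.0), 2),
]
weights = train(data, num_classes=3, num_features=2)
initial_loss, _ = loss_and_gradient([[0.0, 0.0] for _ in range(3)], data)
final_loss, _ = loss_and_gradient(weights, data)
print(initial_loss, final_loss)
```

Because the loss and gradient are sums over training examples, each partition of the data can compute its share independently, which matches the horizontal scaling described above.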


Bio:

Sandy Ryza is an engineer on the data science team at Cloudera. He is a committer on Apache Hadoop and recently led Cloudera's Apache Spark development.


DB Tsai is a machine learning engineer working at Alpine Data Labs. He has recently been working with the Spark MLlib team to add support for the L-BFGS optimizer and multinomial logistic regression upstream. He also led the Apache Spark development at Alpine Data Labs. Before joining Alpine Data Labs, he worked on large-scale optimization of optical quantum circuits at Stanford as a PhD student.