Saturday, September 13, 2014

MLLIB SPARK CONTRIBUTIONS AND DEPENDENCIES



https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark

https://spark.apache.org/docs/1.1.0/


https://spark.apache.org/docs/1.1.0/mllib-guide.html


https://groups.google.com/forum/#!search/MLI$20spark$20contribution/apache-spark-dev-mirror/SvJmio-L7WM/egWQ7KtDRJwJ

https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib
https://github.com/apache/spark/tree/master/mllib/src
https://github.com/cdgore?tab=repositories
https://github.com/databricks/spark-training

Dependencies

MLlib uses the linear algebra package Breeze, which depends on netlib-java and jblas. Both netlib-java and jblas depend on native Fortran routines, so you need to install the gfortran runtime library if it is not already present on your nodes. MLlib will throw a linking error if it cannot detect these libraries automatically.

Due to license issues, netlib-java’s native libraries are not included in MLlib’s dependency set under default settings. If no native library is available at runtime, you will see a warning message. To use native libraries from netlib-java, build Spark with -Pnetlib-lgpl or include com.github.fommil.netlib:all:1.1.2 as a dependency of your project.

If you want to use optimized BLAS/LAPACK libraries such as OpenBLAS, link their shared libraries to /usr/lib/libblas.so.3 and /usr/lib/liblapack.so.3, respectively. BLAS/LAPACK libraries on worker nodes should be built without multithreading.
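For the second option (adding the dependency to your own project), a minimal build.sbt sketch could look like the following. It assumes an sbt project on Scala 2.10 with Spark 1.1.0; the project name and the exact versions are assumptions for illustration, so adjust them to your setup.

```scala
// build.sbt: a sketch, not a definitive build definition.
// The project name and the Scala/Spark versions below are assumptions.
name := "mllib-native-example"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.1.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.1.0" % "provided",
  // The "all" artifact is POM-only: it pulls in netlib-java's native
  // implementations for the supported platforms (LGPL-licensed).
  "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
)
```

Without this dependency (or a Spark build made with -Pnetlib-lgpl), MLlib still runs, but it logs the warning mentioned above about the missing native library.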

JBLAS

http://mikiobraun.github.io/jblas/

MLlib Vector
Note: Scala imports scala.collection.immutable.Vector by default, so you have to import org.apache.spark.mllib.linalg.Vector explicitly to use MLlib’s Vector.
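A small sketch of the explicit import, assuming spark-mllib 1.1.0 is on the classpath. The object name VectorImportExample is hypothetical and only there to make the snippet self-contained.

```scala
// Explicit import so that `Vector` means MLlib's Vector
// rather than scala.collection.immutable.Vector.
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical object name, used only for this illustration.
object VectorImportExample {
  // Dense vector with three entries.
  val dense: Vector = Vectors.dense(1.0, 0.0, 3.0)

  // Sparse vector of size 3 with non-zero entries at indices 0 and 2.
  val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

  def main(args: Array[String]): Unit = {
    println(dense.size)
    println(sparse.toArray.mkString(", "))
  }
}
```

Without the import, the type annotation `Vector` would resolve to Scala's immutable collection, and assigning the result of Vectors.dense to it would fail to compile.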


