MLLIB SPARK CONTRIBUTIONS AND DEPENDENCIES
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
https://spark.apache.org/docs/1.1.0/
https://spark.apache.org/docs/1.1.0/mllib-guide.html
https://groups.google.com/forum/#!search/MLI$20spark$20contribution/apache-spark-dev-mirror/SvJmio-L7WM/egWQ7KtDRJwJ
https://github.com/apache/spark/tree/master/mllib/src/main/scala/org/apache/spark/mllib
https://github.com/apache/spark/tree/master/mllib/src
https://github.com/cdgore?tab=repositories
https://github.com/databricks/spark-training
Dependencies
MLlib uses the linear algebra package Breeze, which depends on netlib-java, and jblas. netlib-java and jblas depend on native Fortran routines. You need to install the gfortran runtime library if it is not already present on your nodes. MLlib will throw a linking error if it cannot detect these libraries automatically. Due to license issues, we do not include netlib-java's native libraries in MLlib's dependency set under default settings. If no native library is available at runtime, you will see a warning message. To use native libraries from netlib-java, please build Spark with -Pnetlib-lgpl or include com.github.fommil.netlib:all:1.1.2 as a dependency of your project. If you want to use optimized BLAS/LAPACK libraries such as OpenBLAS, please link its shared libraries to /usr/lib/libblas.so.3 and /usr/lib/liblapack.so.3, respectively. BLAS/LAPACK libraries on worker nodes should be built without multithreading.
JBLAS
http://mikiobraun.github.io/jblas/
MLlib Vector
Note: Scala imports scala.collection.immutable.Vector by default, so you have to import org.apache.spark.mllib.linalg.Vector explicitly to use MLlib's Vector.
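The import note above can be illustrated with a minimal sketch. It assumes spark-mllib (for example the 1.1.0 release linked at the top) is on the classpath; the object name VectorImportExample is hypothetical and chosen only for illustration.

```scala
// Assumption: spark-mllib is available on the classpath.
// The explicit import makes the bare name `Vector` refer to MLlib's type
// instead of scala.collection.immutable.Vector.
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical object name, used only for this illustration.
object VectorImportExample {
  def main(args: Array[String]): Unit = {
    // Dense MLlib vector built through the Vectors factory.
    val dense: Vector = Vectors.dense(1.0, 0.0, 3.0)

    // Sparse MLlib vector of size 3 with non-zero entries at indices 0 and 2.
    val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))

    // The Scala collection is still reachable through its fully qualified name.
    val scalaVec: scala.collection.immutable.Vector[Int] =
      scala.collection.immutable.Vector(1, 2, 3)

    println(dense)
    println(sparse)
    println(scalaVec)
  }
}
```

Without the explicit import, the annotation `Vector` on the first two values would resolve to the Scala collection type and the assignments from the Vectors factory would not compile.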