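The Machine Learning module in the latest Spark release (1.5.1) provides two packages. The latter, spark.ml, is built on the DataFrame abstraction from Spark SQL, which makes it especially relevant for feature analysis and graph computation.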
spark.mllib contains the original API built on top of RDDs.
spark.ml provides a higher-level API built on top of DataFrames for constructing ML pipelines.
Zen aims to provide the largest-scale and most efficient machine learning platform on top of Spark, including but not limited to logistic regression, latent Dirichlet allocation, factorization machines, and deep neural networks.
Deeplearning4j on Spark ML
DL4J enhances Spark ML with powerful neural network algorithms for supervised and unsupervised learning. Benefits include easy integration of DL4J with Spark-based datasets, with other Spark ML components (such as feature extractors and learning algorithms), and with Spark SQL.
Elephas: Keras Deep Learning on Apache Spark
Elephas brings deep learning with Keras to Apache Spark. Elephas intends to keep the simplicity and usability of Keras, allowing for fast prototyping of distributed models to run on large data sets.