Find the Underlying Structure of Big Data by Machine Learning with Spark | QQ group: 452154809


Spark Bin

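The Machine Learning module in the latest Spark 1.5.1 release provides two packages, spark.mllib and spark.ml. The latter in particular is built on the DataFrame from Spark SQL, which makes it more significant for feature analysis and graph computation.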

spark.mllib contains the original API built on top of RDDs. spark.ml provides a higher-level API built on top of DataFrames for constructing ML pipelines.

Among the Spark-based machine learning projects I have come across so far, these are some of the fun ones:

  • Zen
          Zen aims to provide the largest scale and the most efficient machine learning platform on top of Spark, including but not limited to logistic regression, latent Dirichlet allocation, factorization machines, and deep neural networks.

  • Deeplearning4j on Spark ML
          DL4J enhances Spark ML with powerful neural network algorithms for supervised and unsupervised learning. Benefits include easy integration of DL4J with Spark-based datasets, with other Spark ML components (such as feature extractors and learning algorithms), and with Spark SQL.

  • Elephas
          Elephas: Keras Deep Learning on Apache Spark. Elephas brings deep learning with Keras to Apache Spark. Elephas intends to keep the simplicity and usability of Keras, allowing for fast prototyping of distributed models to run on large data sets.

Please credit the source when reposting: 单向街的夏天 » Open-Source Machine Learning Projects Based on Spark
