Mastering PySpark: Spark RDDs vs DataFrames vs SparkSQL
Last updated on 2018-02-02
About the Book
This book shows how to solve a range of use cases with PySpark, the Spark Python API that exposes the Spark programming model to Python. It shows how to use Resilient Distributed Datasets (RDDs), DataFrames, and SparkSQL to answer the same kinds of questions.