Mastering PySpark: Spark RDDs vs DataFrames vs SparkSQL
Last updated on 2018-02-02
About the Book
This book shows how to solve a range of use cases with PySpark, the Spark Python API that exposes the Spark programming model to Python. It shows how to use Resilient Distributed Datasets (RDDs), DataFrames, and SparkSQL to answer the same kinds of questions.