Scala Programming for Big Data Analytics

Retired

This book is no longer available for sale.

About the Book

It's an open secret that we live in a world of Big Data. Organisations are going through a disruptive paradigm shift: they are increasingly adopting Big Data technologies to process large volumes of data and derive insights, with the goal of driving innovation and staying competitive. As a result, demand for candidates with strong skills in these areas is growing rapidly, and people with Big Data skills are among the highest paid.

In the Big Data landscape, Hadoop is the de facto framework that powers big data platforms with its suite of services, and Apache Spark is the leading distributed, in-memory computing engine in the Hadoop ecosystem. Apache Spark is used for a wide variety of Big Data use cases, such as machine learning, ETL and graph analytics, to name a few, and is seeing phenomenal growth and adoption in businesses all around the world. Scala is the lingua franca of Apache Spark: not only is Apache Spark (like many other frameworks, such as Apache Kafka) developed in Scala, but Scala is also the recommended language for Apache Spark development, as it provides the best performance and access to all the latest features in Apache Spark API releases. Thus, to develop your skills in Apache Spark and build a career in this promising domain, there is one critical prerequisite: you need to learn Scala!

Learning Scala has many benefits in its own right: Scala is one of the hottest JVM-based programming languages out there, and candidates skilled in Scala are among the highest paid.

The challenge with Scala is its steep learning curve. Scala combines advanced constructs from functional programming with object-oriented principles, and candidates who set out to learn it are often overwhelmed by the complexity and depth of the language. On the other hand, to get started with Apache Spark development specifically, one generally needs to master only a subset of key Scala concepts. This is itself another issue, because there is not a single book or resource out there that covers the Scala programming language with a focus on Big Data development. There is no shortage of books and tutorials on Scala, but they cover concepts in a depth and breadth that may not be relevant for Big Data development.

This is exactly the problem that my book "Scala Programming for Big Data Analytics" addresses. It has been written with one crisp goal: to teach you just enough Scala for Big Data development, i.e. Apache Spark development, with no fluff! Instead of bogging you down with needless details of irrelevant and complex Scala language features, the book covers only the most important concepts with laser focus and the necessary depth, and highlights best practices based on my varied experience of using this language.

The book is crafted to be fully hands-on. As you follow it, you'll get the impression that I am holding your hand and teaching you the concepts starting from the very basics. Each section of the book is complemented by a series of hands-on code examples. The book starts by introducing Scala and then progresses naturally to topics including variables (mutable and immutable), data types, functions, collections, flow control, library usage and exception handling, with a gentle emphasis on object-oriented and functional programming concepts wherever necessary, along with best practices.
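
To give a flavour of the kind of material listed above, here is a minimal sketch touching on immutable and mutable variables, functions with default parameters, conditionals, pattern matching, a guarded for loop and exception handling. It is not taken from the book; the object name `BookTopicsSketch` and all function names in it are hypothetical and chosen purely for illustration.

```scala
object BookTopicsSketch {
  // Immutable value: reassigning `greeting` would be a compile-time error.
  val greeting: String = "Hello"

  // Function with a default value for its second parameter.
  def greet(name: String, punctuation: String = "!"): String =
    s"$greeting, $name$punctuation"

  // if/else is an expression in Scala, so it can be returned directly.
  def sign(n: Int): String =
    if (n > 0) "positive" else if (n < 0) "negative" else "zero"

  // Pattern matching on the runtime type of a value.
  def describe(x: Any): String = x match {
    case i: Int    => s"an Int: $i"
    case s: String => s"a String: $s"
    case _         => "something else"
  }

  // A mutable variable updated inside a for loop with a guard (odd numbers only).
  def sumOddsUpTo(limit: Int): Int = {
    var total = 0
    for (i <- 1 to limit if i % 2 == 1) total += i
    total
  }

  // try/catch is also an expression; integer division by zero throws ArithmeticException.
  def safeDivide(a: Int, b: Int): Option[Int] =
    try Some(a / b)
    catch { case _: ArithmeticException => None }

  def main(args: Array[String]): Unit = {
    println(greet("Scala"))
    println(sign(-3))
    println(describe(42))
    println(sumOddsUpTo(5))
    println(safeDivide(1, 0))
  }
}
```

Each definition can also be pasted into the Scala REPL on its own and tried interactively.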

This book doesn't cover Apache Spark. Rather, it covers the key Scala programming language concepts necessary to develop mastery of Apache Spark. After this book, you will be able to learn Apache Spark with no hassle, or even use Scala on its own, as it's a general-purpose language.
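
As a rough illustration of why plain Scala carries over to Spark, the sketch below uses only the standard Scala collections library, with no Spark dependency. The operations it relies on (`flatMap`, `filter`, `map`, `reduce`) also exist under the same names on Spark's RDD API, so the style of thinking transfers. The sample data and the object name `CollectionsSketch` are made up for this example and do not come from the book.

```scala
object CollectionsSketch {
  // Made-up sample input: a small immutable list of text lines.
  val lines: List[String] =
    List("spark uses scala", "scala runs on the jvm", "big data")

  // flatMap: split every line into words and flatten the result into one list.
  val words: List[String] = lines.flatMap(_.split(" "))

  // filter with an anonymous function: keep words longer than four characters.
  val longWords: List[String] = words.filter(_.length > 4)

  // map then reduce: total number of characters across all words.
  val totalChars: Int = words.map(_.length).reduce(_ + _)

  // Group identical words and count occurrences, producing an immutable Map.
  val counts: Map[String, Int] =
    words.groupBy(identity).map { case (word, group) => (word, group.size) }

  def main(args: Array[String]): Unit = {
    println(longWords)
    println(totalChars)
    println(counts("scala"))
  }
}
```

Everything here runs on a single machine; Spark applies the same functional style to data distributed across a cluster.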

You don't need any prior programming experience to use this book, and you'll be able to do hands-on practice on your own system (Windows/Mac/Linux) at no software cost.

Now you are one step away from learning one of the most in-demand languages, Scala, and excelling in your career in the promising and lucrative domain of Big Data. Get this book now and let's learn Scala!

  • Categories

    • Scala
    • Data Science
    • Distributed Systems
    • Computers and Programming
    • Computer Science

About the Author

Irfan Elahi

Irfan Elahi is a Senior Consultant at Deloitte Australia, specialising in Big Data and Machine Learning.

His primary focus lies in using Big Data and Machine Learning to support business growth, with multifaceted and strong ties to the Telecommunications, Energy, Retail and Media industries. He has worked on a number of projects in Australia across the end-to-end life cycle, designing, prototyping, developing and deploying production-grade Big Data solutions on Amazon Web Services (AWS) and Azure. These solutions support use cases including enterprise data warehousing, ETL offloading, analytics, batch processing and stream processing, and employ leading commercial Hadoop distributions such as Cloudera and Hortonworks. He has worked closely with clients' system and software engineering teams in the DevOps space to enhance continuous integration and continuous deployment (CI/CD) processes and to manage Hadoop cluster operations and security.

Additionally, Irfan leads the data stream of Deloitte's ClearLight platform, setting up a multi-tiered, multi-tenant Big Data platform on Amazon Web Services based on best practices to facilitate the firm's strategic initiatives, such as training, managed services and prototyping for potential clients.

In addition to his technology competencies, Irfan presented on in-memory big data technologies at DataWorks Summit in Sydney in 2017 and at a number of meetups around the world. He also delivers knowledge transfer sessions, training and workshops on Big Data and Machine Learning, both within the firm and at clients. He has also launched Udemy courses on Apache Spark for Big Data Analytics and R Programming for Data Science, with more than 18,000 students from 145 countries enrolled in them.

Table of Contents

  • Context Setting
  • Chapter 0 - Scala Language
    • Introduction
    • Getting to know Scala
    • Why Learn Scala
    • Scala and Java
    • Interoperability with Java Libraries
    • Verbosity - Scala and Java
    • Scala - A Statically Typed Language
    • Apache Spark and Scala
    • Scala Performance Benefits
    • Learning Apache Spark
  • Chapter 1 - Installing Scala
    • Introduction
    • Checking Scala Installation Status in Your System
    • Verifying Java Development Kit (JDK) Installation Status
    • Installing Scala in Windows
    • Verifying Scala Installation Status
    • Exercise
  • Chapter 2 - Using Scala Shell
    • Introduction
    • Getting help in Scala shell
    • Hello World in Scala REPL
    • Understanding Hello World in Scala REPL Step by Step
    • Real Life Example: Usefulness of Scala REPL’s Data Type Highlighting Feature
    • Paste Mode in Scala REPL
    • Retrieving History in Scala REPL
    • Auto-completion Feature of Scala REPL
    • Exiting from Scala REPL
    • Exercise
  • Chapter 3 - Variables
    • Introduction
    • Immutability of Objects in Scala
    • Defining Variables (Mutable and Immutable) in Scala
    • Why Immutability Is So Emphasized in Scala?
    • Mutability and Type-safety Caveats
    • Specifying Types for Variables and Type Inference
    • Exercise
  • Chapter 4 - Data Types
    • Introduction
    • Exercise - Data Types
    • Boolean Type
    • Exercises - Boolean Type
    • String Type
    • Exercise - String Types
    • Special Types in Scala
    • Type Casting in Scala
    • Exercise - Special Types
  • Chapter 5 - Conditional Statements
    • Introduction
    • Caveats - Using {} after if/else
    • Nested If-Else Statements
    • If Else as Ternary Operator
    • Pattern Matching
    • Exercise
  • Chapter 6 - Code Blocks
    • Introduction
    • Caveats - Code Blocks
    • Code Blocks and If/Else Statements
    • Exercise
  • Chapter 7 - Functions
    • Introduction
    • Why use Functions at all?
    • Intuitive Understanding of Functions
    • Invoking a Function
    • Caveats - Function Definition
    • Functions With Multiple Parameters
    • Positional Parameters
    • Default Value of Parameters in Functions
    • Function with No Arguments aka 0 Arity
    • Single Line functions
    • When To Actually Use Return Statements
    • Passing Function As Arguments
    • Anonymous Functions
  • Chapter 8 - Scala Collections
    • Introduction
    • Real Life and Intuitive Examples of Collections
    • Lists
    • Indexing List Elements
    • What Can You Store in Lists?
    • Widely Used Lists Operations
    • Iterating Over List
    • Using Map Function for Iterating Over Lists
    • Getting to Know Functional Programming Concepts
    • Using foreach on Lists
    • Using filter on Lists
    • Reduce Operation on Lists
    • List Equality Check
    • Alternative Ways To Create Lists
    • Exercise - Lists
    • Sets
    • Map Collections
    • Indexing a Map
    • Alternative Ways to Create Map Collections
    • Manipulating Maps
    • Iterating through Maps in Functional Style
    • Tuples
    • Indexing Tuples
    • Iterating Over Tuples
    • Alternative Ways to Create Tuples
    • Mutable Collections
    • Implications Related to Mutable Collections
    • Mutable Maps
    • Nested Collections
  • Chapter 9 - Loops
    • Introduction
    • Types of Loops in Scala
    • Guards in For Loop
    • While Loop
    • Comparison of For and While Loop: Which One Suits Well in What Scenarios?
  • Chapter 10 - Using Classes and Packages
    • Introduction
    • Classes and Objects in Scala
    • Mutating Attribute Values and Caveats
    • Singleton Objects
    • Classes and Packages
    • Importing Packages
    • Exercise
  • Chapter 11 - Exception Handling
    • Introduction
    • Fundamentals of Exception Handling in Scala
    • Implications in Type Inference and Exception Handling
    • Exercise - Exception Handling
  • Chapter 12 - Hello World in Apache Spark
    • Development Environment for Apache Spark Development
    • Instantiating Spark Session and Context Object Using OOP Concepts
    • Using Spark Context Object’s Functions to Create Spark-Native Data Structure (RDD)
    • Using RDD’s Transformations Employing Functional Programming and Scala Collection Concepts
    • Employing Scala Functions Concepts in Spark RDD’s Transformations
  • Conclusion and Beyond
