The Leanpub 60 Day 100% Happiness Guarantee
Within 60 days of purchase you can get a 100% refund on any Leanpub purchase, in two clicks.
See full terms...

We build scalable AI pipelines with Spark and add Dask where Python-first parallelism fits best. We then instrument everything with OpenTelemetry so distributed systems become observable, measurable, and debuggable.
Bought separately: $89.97
Minimum price: $79.99
$89.99
About the Bundle
This bundle targets teams building data and AI workloads that must scale and stay observable. We use Spark to develop scalable AI pipelines, add Dask for Python-first parallelism where it fits best, and then instrument the whole system with OpenTelemetry practices so we can trace, measure, and troubleshoot distributed behavior with confidence.
About the Books
Dask has revolutionized parallel computing for Python, empowering data scientists to accelerate their workflows. This comprehensive guide unravels the intricacies of Dask to help you harness its capabilities for machine learning and data analysis.
Across 10 chapters, you'll master Dask's fundamentals, architecture, and integration with Python's scientific computing ecosystem. Step-by-step tutorials demonstrate parallel mapping, task scheduling, and using Dask arrays for NumPy workloads. You'll discover how Dask scales pandas, scikit-learn, PyTorch, and other libraries to large datasets.
Dedicated chapters explore scaling regression, classification, hyperparameter tuning, feature engineering, and more with clear examples. You'll also learn to tap into the power of GPUs with Dask, RAPIDS, and JAX for orders-of-magnitude speedups.
This book places special emphasis on practical use cases related to scalability and distributed computing. You'll learn Dask patterns for cluster computing, efficient resource management, and robust data pipelines. The advanced chapters on Dask-ML and deep learning show how to build scalable models with PyTorch and TensorFlow.
Packed with hands-on examples and expert insights, this book gives Python programmers, data scientists, and machine learning engineers the practical skills to harness Dask's capabilities, achieve faster workflows, and operationalize parallel computing.
A hands-on, recipe-driven book that puts OpenTelemetry to immediate use. This cookbook is for developers, Linux administrators, cloud engineers, backend engineers, network engineers, and security practitioners: anyone who wants a proven, practical way to monitor, trace, and understand modern systems.
This book gives you straightforward, step-by-step solutions to everyday observability challenges, so you can integrate, configure, and operate OpenTelemetry in dynamic environments. Each chapter focuses on problems directly relevant to production teams, including installing and bootstrapping the Collector on Linux, wiring telemetry pipelines for traces, metrics, logs, and baggage, and integrating with the platforms that organizations rely on for analysis and alerting.
OpenTelemetry Cookbook skips the theoretical jargon and gets straight to implementation. Every recipe gives you a clear problem statement, a step-by-step solution, and practical validation. Whether you are just starting out with observability or want to deepen your skills, the book provides clear steps for understanding distributed, cloud-native, and hybrid systems.
The book builds a solid foundation for robust, observable infrastructure and application configurations, one step at a time. It does not offer quick fixes or magic solutions. Instead, it gives you a full set of tools and techniques for improving visibility, performance, and reliability in your own technical landscape.
For those who want to build controlled, reproducible AI systems entirely within their own infrastructure, this book is a practical, implementation-focused guide. Instead of relying on external APIs or cloud-hosted intelligence services, it demonstrates how Apache Spark can orchestrate data preparation, model training, batch inference, reporting, and LLM acceleration in a disciplined and transparent way.
The book opens by defining private AI: no external AI calls, full ownership of datasets and model assets, and repeatable runs with traceable outputs. Using a realistic sample, it shows how to build an end-to-end workflow that ingests raw data, normalizes it into a stable schema, trains a baseline classifier, extracts keywords, generates summaries, and produces structured reports. Each step is implemented with attention to clarity and maintainability, and logging, manifests, and monitoring are embedded from the start. The book covers classic machine learning techniques, vLLM, performance measurement, batch processing patterns, quarantine handling, and structured metrics to make private AI more usable and competitive with cloud-based AI.
Beyond experimentation, the book moves on to packaging and routine execution. It teaches you to bundle multiple stages into a single command workflow, schedule daily or weekly runs, generate compact run reports, and adapt the architecture to new datasets without redesigning the system. It does not promise instant transformation or one-click AI solutions. Instead, it provides a structured path to building a sustainable private AI backbone using Spark as the orchestration layer.