RAG, Agent Bricks, the Multi-Agent Supervisor with MCP, Lakebase, MLflow 3, Lakehouse Monitoring, Feature Store, Vector Search. Every AI surface Databricks shipped at GA in 2025 and 2026, taught by a practitioner, current to 2026.

What you will learn

- Build RAG pipelines with Vector Search, embedding models, and citation grounding
- Ship Agent Bricks for classification and information extraction
- Orchestrate specialist agents with the Multi-Agent Supervisor and MCP
- Use Lakebase as the operational Postgres layer for AI apps and agents
- Detect data and model drift with Lakehouse Monitoring; wire alerts to retraining
- Manage the ML lifecycle with MLflow 3 and the UC Model Registry
- Govern features across training and serving with Feature Store (offline + online)
- Serve foundation and custom models with AI Gateway controls

Who this book is for

Data engineers, ML engineers, and AI/ML architects who know PySpark and the Databricks platform and now need to ship production AI. Volume 3 is the recommended prerequisite.

Table of Contents

1. Databricks SQL in Production. Warehouses, materialized views, three latency signals (admission, compilation, execution), the full dashboard backend wiring.
2. External BI: Tableau, Power BI, dbt. Performance tips that take a dashboard from sluggish to instant, dbt configuration at incremental scale, the seam between BI and the lakehouse.
3. AI/BI Dashboards. Anatomy of a Lakeview dashboard, draft vs published flow, the Dashboard Agent's reliable patterns, the five-grant permission model.
4. Genie: Natural-Language Analytics. Grounding sources, the priority rule, the SQL Genie actually writes, the questions Genie answers cleanly versus the ones that confuse it.
5. AI SQL Functions. ai_query, ai_parse_document, ai_extract for PDFs and HTML, univariate forecasts, the daily cost math for production AI SQL pipelines.
6. Model Serving. Endpoints, the three fields that decide capacity and cost, the chat-completion payload, the five moving pieces of a production recommender.
7. Foundation Models. Five major providers, the External Models config, the vendor-swap pattern (Claude to Gemini in hours, not weeks), the three habits that keep swap cost low.
8. Vector Search and RAG. Six delta-sync arguments, three chunking strategies compared, the RAG function your app imports, end-to-end answer evaluation with traces.
9. MLflow 3 and UC Model Registry. Versions, aliases, tags (and what each is not for), five tracking calls and what each one writes, the experiment-to-production lifecycle.
10. Feature Store. Why SDP is the right producer, the six-file project layout, four parity-failure classes between offline and online stores and what causes each.
11. MLOps as a Practice. Seven sources every incident reads from, three deploy patterns (canary, shadow, blue-green), three retrain strategies, five golden signals for an ML endpoint.
12. Lakehouse Monitoring: Drift Detection. Six monitor parameters, the loop from drift alert to retraining, what to do when the baseline table is missing.
13. Distributed Deep Learning. Three signals that force distributed training, picking the flavor (data, model, hybrid) from the bottleneck, four pieces of GPU memory worked out for a 7B model.
14. Agent Bricks. Declarative classification and information-extraction agents, eval-set ingredients, the pre-compute pattern that makes small seed sets work.
15. Multi-Agent Supervisor and MCP. The supervisor build, synthetic-turn evaluation, three real conversations end to end, the auth-passthrough chain across child agents.
16. Lakebase: Operational Postgres for AI. Five alternatives compared, sub-10ms reads for AI apps, the lineage from Delta source through SDP into Postgres and onward to the endpoint.
17. Capstone: Retail Intelligence App. Ten stages, each anchored to an earlier chapter. The smoke test that confirms every stage of the platform is reachable, the new-data path through the recommender.
18. Certification and What's Next. The certification paths that actually map to the book, and the reading list the on-call team uses when something breaks.
The Databricks platform and data-engineering playbook for the engineers who own pipelines, govern catalogs, and keep workloads on schedule. Sixteen chapters on Unity Catalog, Lakeflow, identity, observability, and performance. Azure examples; concepts mapped to AWS and GCP.
Structured Streaming, MLlib, GraphFrames, performance tuning, testing and CI, and the lakehouse. Eleven chapters that take a competent PySpark user from "the job runs" to "the on-call team trusts the job."
PySpark from page one. Ten chapters that take a Python user who knows pandas and turn them into someone who can write, read, and debug production PySpark, without a three-chapter detour through distributed-computing theory.