Generative AI for Science

A Hands-On Guide for Students and Researchers

Bridge AI and science with this hands-on guide. Whether you're a researcher learning ML or an engineer entering scientific applications, build real systems across chemistry, biology, physics & climate. Master Transformers, Diffusion Models & GNNs for scientific discovery. 500+ pages, 50+ Colab notebooks. Design molecules, predict proteins, accelerate climate models—all hands-on, zero setup required.


About the Book

Design molecules. Predict protein structures. Accelerate climate models. All with AI.

Generative AI is transforming scientific discovery. AI-designed drugs now achieve 80-90% success rates in Phase I trials. AlphaFold's protein structure predictions earned the 2024 Nobel Prize in Chemistry. Neural network weather models outperform traditional supercomputer simulations in 97% of scenarios—and run 1000x faster.

This book teaches you how to build these systems yourself.

Generative AI for Science is a comprehensive, hands-on guide for researchers, students, and practitioners who want to apply cutting-edge AI to real scientific problems. Across 500+ pages and 13 chapters, you'll master the architectures powering the AI revolution—Transformers, Diffusion Models, VAEs, Graph Neural Networks, and Physics-Informed Neural Networks—through 50+ runnable Google Colab notebooks that require zero setup.

What Makes This Book Different

This isn't a traditional AI textbook heavy on theory and light on application. Every concept is paired with working code. Every technique is demonstrated on authentic scientific problems. Whether you're a domain scientist learning AI or an ML engineer entering scientific applications, you'll find the right level of depth.

The material is battle-tested. It originated in graduate courses at the Data Science and AI Academy, was refined through workshops at the Bioinformatics Research Center, and has been validated by hundreds of scientists applying these tools to their own research.

What You'll Build

In just 30 minutes per notebook, you'll build:

  • Drug discovery pipelines using Graph Neural Networks and Diffusion Models
  • Protein structure predictors with ESMFold and AlphaFold-inspired architectures
  • Climate and weather emulators using neural surrogates
  • Physics simulations with PINNs that encode conservation laws
  • Literature mining systems using RAG and Large Language Models
  • Multimodal scientific AI combining images, text, and molecular graphs

Who This Book Is For

  • Domain scientists (chemists, biologists, physicists, geoscientists) who want AI skills to accelerate their research
  • ML engineers and data scientists seeking meaningful scientific applications
  • Graduate students looking for a complete curriculum with hands-on projects
  • Industry practitioners who need production-ready code and best practices

What You'll Learn

By the end of this book, you will:

  • Understand the key architectures powering scientific AI
  • Represent molecules, proteins, sequences, and physical systems for neural networks
  • Apply generative models across chemistry, biology, physics, and climate science
  • Fine-tune foundation models for domain-specific tasks
  • Build multimodal systems that combine vision, language, and structured data
  • Follow best practices for ethics, reproducibility, and deployment
  • Develop the intuition to know when and how to apply AI to your research

Prerequisites

  • Basic Python programming
  • Undergraduate-level statistics (helpful but not required)
  • A web browser and curiosity
  • No prior deep learning experience needed

Access All Code

All 50+ notebooks and PowerPoint slides are available on GitHub: https://github.com/jpliu168/Generative_AI_For_Science

Every example runs in Google Colab with one click—no installation, no configuration, no GPU setup required.

"Generative AI does not replace the scientific method—it enhances it. It expands the space of hypotheses we can explore, sharpens experimental design, and reveals patterns hidden in complexity. Combine human creativity with machine assistance, and new discoveries become possible."

— Dr. J. Paul Liu

About the Author

J. Paul Liu

Professor at NC State University

Table of Contents

From the Author

  1. A Journey from Classroom to Lab — and Now to You
  2. Origins
  3. What Makes This Book Different
  4. How to Use This Book
  5. What You Will Learn
  6. A Note on Collaboration
  7. Downloading all codes or slides
  8. Prerequisites
  9. AI Use Disclaimer
  10. Writing Workflow

Chapter 1 — Generative AI: A New Frontier for Scientific Discovery

  1. The New Frontier of Scientific Discovery
  2. The AI Revolution in Scientific Discovery
  3. What Makes Generative AI Different from Traditional ML?
  4. Core Technologies Powering Generative AI
  5. The Pre-Training Revolution
  6. Generative AI Across Scientific Disciplines
  7. Mathematical Foundations and Methods
  8. Cross-Cutting Capabilities
  9. A New Scientific Partner
  10. The Path Forward
  11. References and Further Readings

Chapter 2: Generative AI Fundamentals

  1. Introduction: The Building Blocks of Generation
  2. The Three Pillars of Generative AI
  3. Part I: Transformers and Large Language Models
  4. Part II: Diffusion Models and Flow Matching
  5. Part III: VAEs and GANs
  6. Part IV: Pre-Training and Fine-Tuning
  7. Part V: Mathematical Foundations
  8. Part VI: Types of Generative AI by Modality
  9. Design Principles for Scientific Applications
  10. Practical Considerations
  11. Summary
  12. References

Chapter 3: Scientific Data & Workflows

  1. Introduction: The Data Challenge in Science
  2. Part I: Unique Challenges of Scientific Data
  3. Part II: Data Sources in Science
  4. Part III: The FAIR Principles
  5. Part IV: Data Preparation for AI
  6. Part V: Integrating AI into Research Workflows
  7. Part VI: Automated Workflow Generation
  8. Summary
  9. References

Chapter 4: Text, Code & Knowledge Generation for Scientists

  1. Introduction: The Knowledge Synthesis Challenge
  2. Part I: Literature Review and Synthesis
  3. Part II: Retrieval-Augmented Generation (RAG)
  4. Part III: Hypothesis Generation
  5. Part IV: Code Generation for Research
  6. Part V: Scientific Writing Assistance
  7. Part VI: Educational Applications
  8. Part VII: Domain-Specific LLM Systems
  9. Part VIII: Limitations and Best Practices
  10. Summary
  11. References

Chapter 5: Data-to-Data Models

  1. Introduction: The Data Scarcity Problem
  2. Part I: Missing Data Imputation with Autoencoders
  3. Part II: Synthetic Data Generation with GANs
  4. Part III: Variational Autoencoders (VAEs)
  5. Part IV: Gaussian Process Spatial Interpolation
  6. Part V: Time Series Gap Filling
  7. Summary and Key Takeaways
  8. Next Steps
  9. References

Chapter 6: Physics-Informed AI and Simulation

  1. Introduction: Embedding Physics in Neural Networks
  2. Part I: Physics-Informed Neural Networks (PINNs)
  3. Part III: Neural Network Surrogates for Simulations
  4. Part IV: Code Optimization with AI
  5. Part V: Automated Test Generation
  6. Summary
  7. References

Chapter 7: Domain Applications in Chemistry, Biology, Physics and Geoscience

  1. Introduction: Generative AI Across the Sciences
  2. Part I: Chemistry & Materials Science
  3. Summary
  4. References
  5. Part II: Biology & Biomedicine
  6. Summary: Biology & Biomedicine
  7. References
  8. Part III: Physics & Engineering
  9. Summary: Physics & Engineering
  10. References
  11. Part IV: Geoscience & Climate Applications
  12. Summary: Geoscience & Climate Applications
  13. References
  14. Part V: Cross-Cutting Applications in Deep Learning
  15. Summary: Cross-Cutting Applications
  16. References

Chapter 8: Fine-Tuning & Domain Adaptation

  1. Introduction: Making General Models Domain-Specific
  2. Part I: Why Fine-Tuning Works for Science
  3. Part II: Parameter-Efficient Fine-Tuning (PEFT)
  4. Part III: Practical Results - Biology Text Fine-Tuning
  5. Part IV: Preparing Domain-Specific Training Data
  6. Part V: Evaluation and Validation
  7. Note: The dataset updates in real time, so each run may yield different results.
  8. Part VI: Best Practices and Lessons Learned
  9. Summary
  10. References

Chapter 9: Multimodal Generative AI for Sciences

  1. Introduction: Beyond Single-Modality AI
  2. Part I: Vision-Language Models for Science
  3. Part II: Graph-Text Models for Molecules
  4. Part III: Time Series with Textual Context
  5. Part IV: Multimodal Fusion Architectures
  6. Part V: Scientific Document Understanding
  7. Part VI: Training Multimodal Scientific Models
  8. Part VII: Practical Applications
  9. Summary
  10. References

Chapter 10: Evaluation, Validation & Benchmarking

  1. Introduction: Trust Through Rigorous Assessment
  2. Part I: Core Evaluation Metrics
  3. Part II: Validation Strategies
  4. Part III: Benchmarking Datasets and Tasks
  5. Part IV: Human Evaluation
  6. Part V: Uncertainty Quantification
  7. Part VI: Failure Analysis
  8. Part VII: Robustness Testing
  9. Summary
  10. References

Chapter 11: Ethics & Responsible AI for Science

  1. 📖 How to Use This Chapter
  2. 📊 Code Quick Reference
  3. Introduction: The Unique Responsibility of Scientific AI
  4. Part I: Reproducibility and Open Science
  5. Part II: Bias and Fairness in Scientific AI
  6. Part III: Environmental Impact of AI
  7. Part IV: Dual-Use and Biosecurity
  8. Part V: Data Privacy in Scientific Research
  9. Part VI: Attribution and Scientific Integrity
  10. Part VII: Equity and Access
  11. Summary
  12. References

Chapter 12: Deployment & MLOps for Scientific Applications

  1. Introduction: From Notebooks to Production Science
  2. Part I: Experiment Tracking & Management
  3. Part II: Data Versioning & Lineage
  4. Part III: Model Lifecycle Management
  5. Part IV: Continuous Training Pipelines
  6. Part V: Scientific Validation & Testing
  7. Part VI: Deployment to Scientific Infrastructure
  8. Part VII: Monitoring Production Models
  9. References

Chapter 13: Future Directions & Conclusion

  1. Introduction: Science at the Dawn of the AI Era
  2. Part I — Emerging Architectures & Techniques
  3. Part II — Multimodal Scientific AI
  4. Part III — Foundation Models for Science
  5. Part IV — AI for Scientific Reasoning
  6. Part V — Open Challenges (Grouped)
  7. Part VI — A Vision for the Next Decade
  8. Part VII — Conclusion: The Scientific Method, Amplified
  9. References

Acknowledgements
