From the Author

A Journey from Classroom to Lab, and Now to You

This book is the result of years of teaching, research, and hands-on collaboration at the intersection of generative AI and scientific discovery. Everything here has been shaped by real use: tested in graduate courses, refined in workshops, improved through research projects, and validated by hundreds of scientists applying these tools to their own work.

Origins

The material began in my graduate course Generative AI for Science, taught at the Data Science and AI Academy, where students from biology, chemistry, physics, materials science, and computational fields explored a fundamental question:

How can AI not only analyze data, but generate new scientific knowledge?

Lecture notes grew into full tutorials as we worked through real challenges: predicting protein stability, designing new materials, accelerating climate simulations, and mining insights from vast scientific corpora. Student questions sharpened the explanations; their projects highlighted what worked; their struggles revealed where clarity was needed.

Beyond the classroom, the methods in this book have been strengthened through:

  • Bioinformatics Research Center workshops
  • Cross-campus AI for Research training programs
  • Research Triangle AI Society–LLM intensive bootcamps
  • Active collaborations in oceanography, materials science, protein engineering, and literature mining

What Makes This Book Different

This book is not a traditional AI textbook. Rather than focusing on heavy theory, it provides a hands-on, practical guide to using generative AI in real scientific workflows through educational case studies. It is ideal for college students, researchers, and data scientists who want to learn by doing.

  • Theory meets practice: Every concept is paired with ready-to-run code.
  • Interactive learning: All example code is provided as Google Colab notebooks; no installation required.
  • Real scientific problems: Examples come from authentic research.
  • Accessible yet rigorous: Suitable for both domain scientists exploring AI and ML experts entering scientific applications.

How to Use This Book

As a course text:
Follow the chapters sequentially for a structured introduction.

As a reference:
Jump directly to the sections relevant to your research domain, or follow the further reading references to explore topics in greater detail.

As a hands-on guide:
Open the Colab notebooks or PowerPoint slides alongside each chapter, run the code, and modify it as you learn.

As a research launchpad:
Use the provided implementations as starting points for your own projects.

What You Will Learn

By the end of this book, you will:

  • Understand key AI architectures such as Transformers, Diffusion Models, VAEs, and GNNs
  • Represent scientific data types effectively
  • Apply generative models to problems in climate science, drug discovery, genomics, materials science, and more
  • Follow best practices around ethics, reproducibility, and deployment
  • Stay current with emerging methods and future directions

Most importantly, you will develop the intuition to know when and how to apply AI to scientific research.

A Note on Collaboration

AI-accelerated science thrives at the intersection of domain expertise and computational skill. Biologists will gain enough ML understanding to explore confidently, while ML practitioners will gain enough scientific context to ask meaningful questions and validate results. This book is itself an example of that kind of collaboration: I worked with AI on the writing, coding, editing, and proofreading.

Collaboration is the catalyst. This book aims to provide the shared language needed to bridge fields and perspectives.

Let’s Begin

Generative AI does not replace the scientific method; it enhances it. It expands the space of hypotheses we can explore, sharpens experimental design, and reveals patterns hidden in complexity. But the foundations remain unchanged: rigor, honesty, and careful interpretation.

Combine human creativity with machine assistance, and new discoveries become possible.

Now, let us explore them together.

Dr. J. Paul Liu
December 2025

* * *

Downloading All Code and Slides

Access all Colab notebooks and slides: https://github.com/jpliu168/Generative_AI_For_Science

Prerequisites

  • Basic Python programming (functions, loops, data structures)
  • Undergraduate-level statistics (distributions, hypothesis testing)
  • Familiarity with scientific computing (NumPy, Matplotlib helpful but not required)
  • No prior deep learning experience necessary

Technical requirements: A web browser and curiosity. Everything else runs in the cloud.

AI Use Disclaimer

This work was developed with the assistance of modern AI language models, including ChatGPT, Claude, and Gemini, which supported:

  1. Improving clarity of writing, including corrections to English grammar and syntax
  2. Text organization and formatting, including Markdown structure, embedded Python code blocks, output results, and visual styling
  3. Code generation, formatting, debugging, and proofreading

Writing Workflow

Here is the workflow I followed for most chapters:

  1. Research and preparation: Starting from my previous class notes, sample code, and PowerPoint slides, I conducted extensive Google searches and read through hundreds of links, papers, and reports.

  2. Initial AI consultation: I posed my questions, ideas, and plans to AI assistants. With their suggestions, I refined my questions and ideas, returned to Google searches and reference readings, and then asked AI for help with the book proposal.

  3. Drafting: For each chapter, I sent AI the topic, questions, my references, and sample code to generate an initial draft. I then manually edited and updated the content by combining my own data, new references, and updated code.

  4. Code development: I spent the majority of my time coding, debugging, customizing, and finalizing each case study to ensure it could run within typical student computational resources for educational demonstration purposes.

  5. Validation: I shared some chapters with students and colleagues for their input. Based on their feedback, I revised and updated the content further.

  6. Final polish: AI assisted in polishing all my writing and reformatting each section to fit the Leanpub Markua format.

To serve the book’s purpose as an educational demonstration, I often had to compare, adjust, or combine results from different LLMs, depending on the quality of their responses. I retain full responsibility for all conceptual content, workflows, research interpretations, citations, discussions, conclusions, and final decisions.

Through this intensive collaborative writing and coding process, I learned a great deal and cannot wait to share all of this in my next class offering of Generative AI for Science.