Notes on Dynamical Systems for Actor-Critic Learning

Name: Notes on Dynamical Systems for Actor-Critic Learning
Brand: Leanpub
Price: 19.00 USD
Availability: InStock

A Dynamical Systems Approach to Reinforcement Learning Mean Dynamics

Vladyslav Prytula

An introduction to actor-critic algorithms as dynamical systems, featuring hand-computable examples, fast-slow reductions, and machine-checked Lean 4 proofs


Minimum price

Free!

$29.00


PDF

EPUB

WEB

APP


About the Book

Notes on Dynamical Systems for Actor-Critic Learning provides a rigorous, dynamical-systems treatment of finite-state actor-critic algorithms. By modeling the joint evolution of the actor, the critic, and the policy's state distribution on a single enlarged phase space, this book moves beyond simple point-convergence to analyze the global, asymptotic behavior of learning feedback loops.

Key Features:

Pedagogical Foundations: Opens with a hand-computable, two-state model (Chapter 0) to ground the abstract mathematics in concrete calculations.
Rigorous Analysis: Establishes well-posedness, compact absorbing sets, global attractors, and fast-slow reductions.
Fully Formalized Proofs: Backed by an axiom-free Lean 4 formalization of the core mathematical package.
Real-World Applications: Applies the theoretical framework to content recommendation systems (filter-bubble attractors) and endogenous network routing.

This book is an attempt to bridge reinforcement learning, control theory, and formal methods, for readers who want to study the dynamics of learning in a unified way.


About the Author

Vladyslav Prytula

Vladyslav Prytula is an applied mathematician who has spent his career moving between dynamical systems, homogenization of PDEs, and machine learning. His training began in Kharkiv under Igor Chueshov, whose attractor theory sits underneath much of this book, and continued through a PhD and postdoctoral years in partial differential equations and infinite-dimensional dynamical systems (well-posedness, global attractors, homogenization), taking him from a doctoral student in Spain to an Abel Fellow and associate professor in Norway.

These days he is Principal ML/AI Research Scientist / Director of ML/AI at a European e-commerce company, building search, recommendation, and agentic systems at production scale. The book came out of a stubborn conviction that the way actor-critic methods converge is best understood not as an optimization trick but as the long-time behaviour of a coupled flow, with actor, critic, and state distribution moving together on one phase space, and that you can write those dynamics down precisely, prove things about them, and have a machine check the proofs.

He lives in Munich, where most of his thinking happens on long trail runs and in the mountains.

Table of Contents

Foreword

Preface

What This Book Is About
Who This Book Is For
How The Running Example Works
How To Read This Book
Machine Verification

Notation And Dependencies

Symbol Table
Chapter Dependency Diagram
Conventions

Chapter 0: The Worked Example

0.1 Why We Start With An Example
0.2 Model Overview
0.3 The Environment
0.4 The Policy
0.5 The State Distribution And The Occupancy Measure
0.6 Why The Actor Drift Does Not Close On \theta Alone
0.7 The Critic Equation
0.8 The Actor Equation
0.9 The Distribution Equation
0.10 The Full Coupled System
0.11 Forward Invariance
0.12 The Absorbing Set And Boundedness
0.13 Equilibria
0.14 The Phase Portrait
0.15 Breaking The Symmetry
0.16 What The Attractor Contains
0.17 Summary And Bridge Forward
Exercises

Chapter 1: The Prerequisite Bridge

1.1 What The Example Showed And What It Left Open
1.2 From Algorithms To Flows
1.3 Semiflows
1.4 Forward Invariance
1.5 Absorbing Sets
1.6 Omega-Limit Sets
1.7 Global Attractors
1.8 Why The Enlarged State Space?
1.9 The Prescribed Closure Map
1.10 The Program Ahead
1.11 Summary Of Vocabulary
Exercises

Chapter 2: The General Model

2.1 The Softmax Policy And Its Score Function
2.2 The Generator Family And The Law Equation
2.3 Occupancy Measures And The Critic Equation
2.4 The Actor Drift And Boundary Damping
2.5 The Standing Assumptions
2.6 The Full System And The Phase Space
2.7 Recovery Of The Worked Example
2.8 A Three-State Retail-To-Vet Routing Example
2.9 Summary And Bridge Forward
Exercises

Chapter 3: Local Lipschitz Regularity and Well-Posedness

The Regularity Principle
3.1 Local Lipschitz Continuity of the Softmax
3.2 Local Lipschitz Continuity of the Actor Drift
3.3 Local Lipschitz Continuity of the Critic Drift
3.4 Local Lipschitz Continuity of the Law Field
3.5 The Ambient Extension and Picard-Lindelöf
3.6 From Local to Global Existence
3.7 Summary and Bridge Forward
Exercises

Chapter 4: A Priori Estimates

4.1 Actor-Box Forward Invariance
4.2 Simplex Forward Invariance
4.3 Critic Coercivity and the Energy Estimate
4.4 The Compact Absorbing Set
4.5 Global Existence and the Semiflow
4.6 Summary and Bridge Forward
Exercises

Chapter 5: The Global Attractor

5.1 The Omega-Limit Set Revisited
5.2 Nonemptiness Of \omega(K)
5.3 Compactness Of \omega(K)
5.4 Invariance Of \omega(K)
5.5 Attraction Of Bounded Sets
5.6 Uniqueness
5.7 The Prescribed-Closure Global Attractor Theorem
5.8 What The Attractor Contains And What It Does Not Determine
Exercises

Chapter 6: Bridge To The Genuine Controlled-Chain Closure

6.1 The Frozen Chain And Its Invariant Law
6.2 Uniform Exponential Mixing
6.3 Lipschitz Regularity Of The Invariant-Law Map
6.4 The Bridge Theorem
6.5 The Minorization Condition
6.6 Summary And Bridge Forward
Exercises

Chapter 7: Fast-Slow Reduction

7.1 The Two-Timescale Setup
7.2 The Pathwise Tracking Estimate
7.3 Upper Semicontinuity Of Attractors
7.4 The Minorization Sufficient Condition
Exercises

Chapter 8: Outlook And Open Problems

8.1 Non-Autonomous Forcing
8.2 Stochastic Perturbations
8.3 Closing Perspective
Exercises

Chapter 9: From Theory to Models

9.1 What Instantiation Means
9.2 The Model Specification Protocol
9.3 Feature Design and Attractor Geometry
9.4 Generator Construction from Domain Topology
9.5 Reading the Attractor in Domain Language
9.6 Preview of the Application Chapters
9.7 Chapter Summary
Exercises

Chapter 10: Recommendation Systems and Algorithmic Curation

10.1 The Recommendation Problem as a Dynamical System
10.2 Model Specification: States, Actions, and Features
10.3 Rewards and the Engagement-Diversity Tension
10.4 The Generator Family: A Y-Graph Controlled Chain
10.5 The Full Recommendation System
10.6 Equilibria and Filter Bubbles
10.7 What the Theory Reveals
10.8 Summary and Bridge Forward
Exercises

Chapter 11: Network Routing Under Endogenous Traffic

11.1 From the Retail-Vet Chain to a Full Network
11.2 Model Specification: The Hub-and-Spoke Network
11.3 The Generator Family: Network Topology as Generator Structure
11.4 Reference-State Minorization and the Bridge Theorem
11.5 The Full Routing System and Its Equilibria
11.6 Attractor Structure and Routing Policy Design
11.7 Summary and Bridge to Appendix B
Exercises

Appendix A: Lean Formalization Structure

Verification
Paper-to-Lean Mapping
File Layout
Key Design Decisions
Scope of the Unconditional Claim
Reading Order

Appendix B: Computational Methods and Phase Portrait Blueprint

B.1 Numerical Integration of the Model System
B.2 Equilibrium Finding and Nullcline Computation
B.3 Phase Portrait Blueprint: The Chapter 0 and Section 2.8 Models
B.4 Phase Portrait Blueprint: The Recommendation and Routing Models
B.5 Parameter Continuation and Bifurcation Sketches

References

