Idiosyncrasies of the HTML parser
This book is 50% complete
Last updated on 2020-01-12
About the Book
The HTML parser is a piece of software that processes HTML markup and produces an in-memory tree representation (known as the DOM).
The HTML parser has many strange behaviors. This book highlights the ins and outs of the HTML parser and includes almost-impossible quizzes.
HTML is used not only by nearly the entire web but also by many modern applications. The HTML parser is part of the foundation of the web platform.
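To make the idea of "markup in, tree out" concrete, here is a minimal sketch in Python. It is not taken from the book and it is not the parser the HTML standard specifies: it uses the standard library's `html.parser` module, which only reports tags and text, and layers a naive tree builder on top. The names `Node`, `TreeBuilder`, `parse`, and `dump` are hypothetical helpers invented for this illustration, and the list of void elements is deliberately partial. A real HTML parser performs far more error recovery than this.

```python
from html.parser import HTMLParser

# Partial list of void elements (assumption for this sketch; the real list is longer).
VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """A minimal stand-in for a DOM node (hypothetical, not a real DOM API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a naive tree from the start tags, end tags, and text it is fed."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        # Void elements never have children, so do not descend into them.
        if tag not in VOID_ELEMENTS:
            self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest open element with this name, if there is one.
        node = self.current
        while node is not self.root and node.name != tag:
            node = node.parent
        # A stray end tag with no matching open element is ignored.
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(Node("#text: " + text, self.current))


def parse(markup):
    """Feed markup to the builder and return the root of the resulting tree."""
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Return the tree as a list of indented lines, one per node."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(dump(parse("<p>Hello <b>world</b></p>"))))
```

The sketch only shows the shape of the problem. Features such as implied elements, misnested formatting tags, and table fix-ups are among the behaviors a specification-conforming parser must handle and this one does not.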
Table of Contents
- Preface
  - Intended audience
  - Definition
  - Scope
  - Practical application
  - About the author
  - Acknowledgements
  - Contribute
- Chapter 1. Introduction
  - The DOM, parsing, and serialization
  - History of HTML parsers
  - The HTML parser is specified
  - The HTML syntax
- Chapter 2. The HTML parser
  - Overview of the HTML parser
  - Error handling
  - Detecting character encoding
  - Preprocessing the input stream
  - Tokenizer
  - Tree construction
  - Scripting
  - Speculative parsing
  - Tags that are no longer supported
- Chapter 3. Microsyntaxes
  - Numbers
  - Image map coordinates
  - Responsive images
  - Colors
  - Meta refresh
- Chapter 4. DOM manipulation
  - Using DOM APIs
  - Using the template element
  - Sanitizing HTML
  - Appeasing the XML gods
- Chapter 5. Serializing
- Appendix A. Implementations
- Appendix B. Conformance checkers
  - DTD-based validators
  - Validator.nu