Parsing with Perl 6 Regexes and Grammars
A Recursive Descent into Parsing
About the Book
As humans, we are incredibly good at finding patterns. So we assume that as programmers, we must be good at using patterns to parse text. But it's a skill that needs to be learned. This book aims to teach you how to write good regexes and parsers with Perl 6.
It starts from the very basics of regular expressions, and then explores how they integrate with regular Perl 6 code. The result of a successful regex match is a Match object, which contains all the useful information for extracting data, so that is the next topic.
Then we discuss how regexes work under the hood, and how Perl 6 uses a mixture of finite state machines and backtracking to do its magic. With this understanding, we can explore common techniques for constructing regexes and for examining the data under scrutiny.
Up to that point, the regexes can only match relatively simple formats. But with reusable named regexes and grammars, the sky is the limit. Figuratively, of course. We discuss techniques for code reuse in grammars, and how to write parsers for more involved data formats.
One of my favorite topics is the generation of good error messages for when the input can't be parsed by a grammar, so there will be a separate chapter on that.
Table of Contents
1. This Book Will Be Published By Apress
2. What are Regexes and Grammars?
- 2.1 Use Cases
- 2.2 Regexes or Regular Expressions?
- 2.3 What’s So Special about Perl 6 Regexes?
3. Getting Started with Perl 6
- 3.1 Installing Rakudo Perl 6
- 3.2 Using Rakudo Perl 6
- 3.3 Summary
4. Building Blocks of Regexes
- 4.1 Literals
- 4.2 Meta Characters vs. Literals
- 4.3 Anchors
- 4.4 Pre-Defined Character Classes
- 4.5 Quantifiers
- 4.6 Disjunction
- 4.7 Conjunction
- 4.8 Zero-Width Assertions
- 4.9 Summary
5. Regexes and Perl 6 Code
- 5.1 Smart-Matching
- 5.2 Modifiers and Quote Forms
- 5.3 Comb and Split
- 5.4 Substitution
- 5.5 Crossing the Code and Regex Boundary
- 5.6 Summary
6. Extracting Data from Regex Matches
- 6.1 Positional Captures
- 6.2 The Match Object
- 6.3 Named Captures
- 6.4 Backreferences
- 6.5 Match Objects Revisited
- 6.6 Summary
7. Regex Mechanics
- 7.1 Matching with State Machines
- 7.2 Regex Control Flow
- 7.3 Backtracking
- 7.4 Why Would You Want to Avoid Backtracking?
- 7.5 Frugal Quantifiers and Backtracking
- 7.6 Longest Token Matching
- 7.7 Summary
8. Regex Techniques
- 8.1 Know your Data Format
- 8.2 Think about Invalid Inputs
- 8.3 Use Anchors
- 8.4 Matching Quoted Strings
- 8.5 Testing Regexes
- 8.6 Summary
9. Reusing and Composing Regexes
- 9.1 Named Regexes
- 9.2 Whitespace
- 9.3 Grammars
- 9.4 Code Reuse with Grammars
- 9.5 Proto Regexes
- 9.6 Summary
10. Parsing With Grammars
- 10.1 Understanding Grammars
- 10.2 Starting Simple
- 10.3 Assembling Complete Grammars
- 10.4 Parsing Whitespace and Comments
- 10.5 Keeping State
- 10.6 Summary
11. Extracting Data From Matches
- 11.1 Action Objects
- 11.2 Building ASTs with Action Objects
- 11.3 Keeping State in Action Objects
- 11.4 Summary
12. Generating Good Error Messages
13. Acknowledgements