Parsing with Perl 6 Regexes and Grammars
This book is 65% complete
Last updated on 2018-12-29
About the Book
As humans, we are incredibly good at finding patterns. So we assume that as programmers, we must be good at using patterns to parse text. But it's a skill that needs to be learned. This book aims to teach you how to write good regexes and parsers with Perl 6.
It starts from the very basics of regular expressions, and then explores how they integrate with regular Perl 6 code. The result of a successful regex match is a Match object, which contains all the useful information for extracting data, so that is the next topic.
Then we discuss how regexes work under the hood, and how Perl 6 uses a mixture of finite state machines and backtracking to do its magic. With this understanding, we can explore common techniques for constructing regexes and examining the data under scrutiny.
Up to that point, the regexes can only match relatively simple formats. But with reusable named regexes and grammars, the sky is the limit. Figuratively, of course. We discuss techniques for code reuse in grammars, and how to write parsers for more involved data formats.
One of my favorite topics is the generation of good error messages for when the input can't be parsed by a grammar, so there will be a separate chapter on that.
1. This Book Will Be Published By Apress
2. What are Regexes and Grammars?
- 2.1 Use Cases
- 2.2 Regexes or Regular Expressions?
- 2.3 What’s So Special about Perl 6 Regexes?
3. Getting Started with Perl 6
- 3.1 Installing Rakudo Perl 6
- 3.2 Using Rakudo Perl 6
- 3.3 Summary
4. Building Blocks of Regexes
- 4.1 Literals
- 4.2 Meta Characters vs. Literals
- 4.3 Anchors
- 4.4 Pre-Defined Character Classes
- 4.5 Quantifiers
- 4.6 Disjunction
- 4.7 Conjunction
- 4.8 Zero-Width Assertions
- 4.9 Summary
5. Regexes and Perl 6 Code
- 5.1 Smart-Matching
- 5.2 Modifiers and Quote Forms
- 5.3 Comb and Split
- 5.4 Substitution
- 5.5 Crossing the Code and Regex Boundary
- 5.6 Summary
6. Extracting Data from Regex Matches
- 6.1 Positional Captures
- 6.2 The Match Object
- 6.3 Named Captures
- 6.4 Backreferences
- 6.5 Match Objects Revisited
- 6.6 Summary
7. Regex Mechanics
- 7.1 Matching with State Machines
- 7.2 Regex Control Flow
- 7.3 Backtracking
- 7.4 Why Would You Want to Avoid Backtracking?
- 7.5 Frugal Quantifiers and Backtracking
- 7.6 Longest Token Matching
- 7.7 Summary
8. Regex Techniques
- 8.1 Know your Data Format
- 8.2 Think about Invalid Inputs
- 8.3 Use Anchors
- 8.4 Matching Quoted Strings
- 8.5 Testing Regexes
- 8.6 Summary
9. Reusing and Composing Regexes
- 9.1 Named Regexes
- 9.2 Whitespace
- 9.3 Grammars
- 9.4 Code Reuse with Grammars
- 9.5 Proto Regexes
- 9.6 Summary
10. Parsing With Grammars
- 10.1 Understanding Grammars
- 10.2 Starting Simple
- 10.3 Assembling Complete Grammars
- 10.4 Parsing Whitespace and Comments
- 10.5 Keeping State
- 10.6 Summary
11. Extracting Data From Matches
- 11.1 Action Objects
- 11.2 Building ASTs with Action Objects
- 11.3 Keeping State in Action Objects
- 11.4 Summary
12. Generating Good Error Messages
13. Acknowledgements