Parsing with Perl 6 Regexes and Grammars


This book is no longer available for sale.

A Recursive Descent into Parsing

About the Book

As humans, we are incredibly good at finding patterns. So we assume that as programmers, we must be good at using patterns to parse text. But it's a skill that needs to be learned. This book aims to teach you how to write good regexes and parsers with Perl 6.

It starts from the very basics of regular expressions, and then explores how they integrate with regular Perl 6 code. The result of a successful regex match is a Match object, which contains all the useful information for extracting data, so that is the next topic.
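As a rough illustration of what a Match object offers, here is a minimal sketch in Perl 6 (since renamed Raku). The input string and the capture names `year`, `month`, and `day` are made up for this example and are not taken from the book.

```raku
# Hypothetical input: a sentence containing an ISO-style date.
my $input = 'Released on 2017-12-24.';

if $input ~~ / $<year>=(\d ** 4) '-' $<month>=(\d\d) '-' $<day>=(\d\d) / {
    # After a successful smart-match, $/ holds the Match object.
    say $/.Str;      # the text that was matched
    say $/.from;     # offset where the match starts
    say $/.to;       # offset just past the end of the match
    say ~$<year>;    # a named capture, as a string
    say +$<month>;   # a named capture, as a number
}
```

Each named capture is itself a Match object, so the same methods (`.Str`, `.from`, `.to`) are available on `$<year>` and its siblings.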

Then we discuss how regexes work under the hood, and how Perl 6 combines finite state machines with backtracking to do its magic. With this understanding, we can explore common techniques for constructing regexes and for examining the data under scrutiny.
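To make the role of backtracking concrete, here is a small sketch (an illustration chosen for this description, not an excerpt from the book). It matches the same pattern twice, once with backtracking allowed and once with the `:ratchet` modifier, which forbids it.

```raku
my $text = 'aaab';

# With backtracking: a+ first grabs 'aaa', then gives one 'a' back
# so that the literal 'ab' can still match.
my $with-backtracking = so $text ~~ / a+ ab /;

# With :ratchet, a+ never gives characters back, so 'ab' cannot match
# after a+ has consumed every available 'a'.
my $without-backtracking = so $text ~~ / :ratchet a+ ab /;

say $with-backtracking;
say $without-backtracking;
```

The first match succeeds and the second fails, which is why regexes declared with `token` (where ratcheting is the default) need to be written so that they do not rely on giving characters back.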

Up to this point, the regexes can only match relatively simple formats. But with reusable named regexes and grammars, the sky is the limit. Figuratively, of course. We discuss techniques for code reuse in grammars, and how to write parsers for more involved data formats.
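The following sketch gives a flavor of how named regexes combine into a grammar. The `KeyValue` grammar, its rule names, and the sample input are hypothetical, invented here for illustration.

```raku
# A tiny grammar for lines of the form key=value.
grammar KeyValue {
    token TOP   { <pair>+ %% \n }       # pairs separated by newlines, trailing newline allowed
    token pair  { <key> '=' <value> }
    token key   { \w+ }
    token value { \N* }                 # everything up to the end of the line
}

my $match = KeyValue.parse("name=Camelia\nkind=butterfly\n");

# The parse tree mirrors the grammar: each <pair> holds a <key> and a <value>.
for $match<pair>.list -> $pair {
    say ~$pair<key>, ' => ', ~$pair<value>;
}
```

Each `token` is a named regex that other tokens can call, so a larger format is described by composing small, individually testable pieces.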

One of my favorite topics is the generation of good error messages for when the input can't be parsed by a grammar, so there will be a separate chapter on that.

About the Author

Moritz Lenz

Moritz Lenz is a software engineer and architect.

In the Perl community, he is well known for his contributions to the Perl 6 programming language, the Rakudo Perl 6 compiler, and the related test suite, infrastructure, and tools.

At his employer, noris network AG, he introduced Continuous Delivery for many in-house developed applications, and now wants to share his experience with the wider world.

Table of Contents

  • 1. This Book Will Be Published By Apress
  • 2. What are Regexes and Grammars?
    • 2.1 Use Cases
    • 2.2 Regexes or Regular Expressions?
    • 2.3 What’s So Special about Perl 6 Regexes?
  • 3. Getting Started with Perl 6
    • 3.1 Installing Rakudo Perl 6
    • 3.2 Using Rakudo Perl 6
    • 3.3 Summary
  • 4. Building Blocks of Regexes
    • 4.1 Literals
    • 4.2 Meta Characters vs. Literals
    • 4.3 Anchors
    • 4.4 Pre-Defined Character Classes
    • 4.5 Quantifiers
    • 4.6 Disjunction
    • 4.7 Conjunction
    • 4.8 Zero-Width Assertions
    • 4.9 Summary
  • 5. Regexes and Perl 6 Code
    • 5.1 Smart-Matching
    • 5.2 Modifiers and Quote Forms
    • 5.3 Comb and Split
    • 5.4 Substitution
    • 5.5 Crossing the Code and Regex Boundary
    • 5.6 Summary
  • 6. Extracting Data from Regex Matches
    • 6.1 Positional Captures
    • 6.2 The Match Object
    • 6.3 Named Captures
    • 6.4 Backreferences
    • 6.5 Match Objects Revisited
    • 6.6 Summary
  • 7. Regex Mechanics
    • 7.1 Matching with State Machines
    • 7.2 Regex Control Flow
    • 7.3 Backtracking
    • 7.4 Why Would You Want to Avoid Backtracking?
    • 7.5 Frugal Quantifiers and Backtracking
    • 7.6 Longest Token Matching
    • 7.7 Summary
  • 8. Regex Techniques
    • 8.1 Know your Data Format
    • 8.2 Think about Invalid Inputs
    • 8.3 Use Anchors
    • 8.4 Matching Quoted Strings
    • 8.5 Testing Regexes
    • 8.6 Summary
  • 9. Reusing and Composing Regexes
    • 9.1 Named Regexes
    • 9.2 Whitespace
    • 9.3 Grammars
    • 9.4 Code Reuse with Grammars
    • 9.5 Proto Regexes
    • 9.6 Summary
  • 10. Parsing With Grammars
    • 10.1 Understanding Grammars
    • 10.2 Starting Simple
    • 10.3 Assembling Complete Grammars
    • 10.4 Parsing Whitespace and Comments
    • 10.5 Keeping State
    • 10.6 Summary
  • 11. Extracting Data From Matches
    • 11.1 Action Objects
    • 11.2 Building ASTs with Action Objects
    • 11.3 Keeping State in Action Objects
    • 11.4 Summary
  • 12. Generating Good Error Messages
  • 13. Acknowledgements
  • Notes
