Regular Expressions Cookbook, 2nd Edition – Seek, find, fix – in 8 programming languages – #bookreview

Regular Expressions Cookbook, 2nd Edition
Jan Goyvaerts and Steven Levithan
(O’Reilly, paperback, Kindle)

Recently revised and updated, this new edition of the Regular Expressions Cookbook offers “detailed solutions” for using regular expressions in eight popular programming languages: C#, Java, JavaScript, Perl, PHP, Python, Ruby, and VB.NET.

At its simplest, a regular expression is a pattern that describes a certain amount of text. “Regular expressions are a powerful tool,” the two authors note. “If your job involves manipulating or extracting text on a computer, a firm grasp of regular expressions will save you plenty of overtime.”

Programmers and non-programmers alike can use regular expressions “for information retrieval and alteration tasks,” and no prior experience is required to use the first two chapters of this book. Chapters 1 and 2 explain major concepts, basic skills, and many of the available regex software tools that non-programmers can use to work with regular expressions.

For programmers, Chapters 3 through 9 focus on implementing and working with regular expressions in the book’s eight supported programming languages, with numerous code examples.

The Regular Expressions Cookbook is well-written and well-illustrated, and it delivers more than 140 “recipes” that show how to apply regular expression concepts and tools to real-world problems.

One Cookbook example: “You want to catch addresses that contain a P.O. box, and warn users that their shipping information must contain a street address.”
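
To give a feel for what such a recipe involves, here is a minimal Python sketch of a P.O. box check. It is not the book's solution; the pattern, the `PO_BOX` name, and the `mentions_po_box` helper are assumptions made up for illustration, and real-world address data would likely need more spellings covered.

```python
import re

# Hypothetical pattern (not taken from the book): matches spellings such as
# "P.O. Box", "PO Box", "P O Box", and "Post Office Box", ignoring case.
PO_BOX = re.compile(r"\b(?:p\.?\s*o\.?|post\s+office)\s*box\b", re.IGNORECASE)

def mentions_po_box(address: str) -> bool:
    """Return True if the address text appears to contain a P.O. box."""
    return PO_BOX.search(address) is not None

# Example: decide whether to warn the user about a shipping address.
for address in ["P.O. Box 123, Austin, TX", "123 Main Street"]:
    if mentions_po_box(address):
        print("Please enter a street address:", address)
```

A form handler could call a check like this before accepting a shipping address and show the warning when it returns True.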

Another example: “You want to find URLs in a large body of text. URLs may or may not be enclosed in punctuation that is part of the larger body of text rather than part of the URL. You want to correctly match URLs that include pairs of parentheses as part of the URL, without matching parentheses placed around the entire URL.”
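
The general idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the book's recipe: it handles only `http` and `https` URLs, allows one level of parentheses, and uses made-up names (`BODY`, `TAIL`, `PAREN`, `find_urls`). The trick is that a URL may contain complete pairs of parentheses but may not end with stray punctuation.

```python
import re

# Characters allowed inside a URL in this sketch (an assumption, not a full URL grammar).
BODY = r"[-\w+&@#/%=~|$?!:,.]"
# Characters allowed to END a URL: no hyphen and no trailing punctuation such as . , ? ! :
TAIL = r"[\w+&@#/%=~|$]"
# One complete pair of parentheses with URL characters inside (no nesting).
PAREN = r"\(" + BODY + r"*\)"

# A URL is the scheme, then any mix of parenthesized pairs and body characters,
# and it must finish with either a complete pair or a "safe" tail character.
URL = re.compile(
    r"\bhttps?://(?:" + PAREN + "|" + BODY + ")*(?:" + PAREN + "|" + TAIL + ")"
)

def find_urls(text: str) -> list:
    """Return all URLs found in text, without surrounding punctuation."""
    return URL.findall(text)

sample = ("See (http://en.wikipedia.org/wiki/PC_Tools_(Central_Point_Software)) "
          "and http://example.com/page.")
print(find_urls(sample))
```

In the sample, the parentheses that belong to the first URL are kept, while the parenthesis wrapped around the whole URL and the sentence-ending period after the second URL are left out of the matches.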

The Regular Expressions Cookbook also explains why there are now many different “flavors” of regular expressions. While some programming languages “have their own, built-in regular expression flavor…. [other] programming languages rely on libraries for regex support.” The authors emphasize: “For this book, we selected the most popular regex flavors in use today.”

Si Dunn

Introducing Regular Expressions – Finding your perfect match…in strings – #bookreview

Introducing Regular Expressions
Michael Fitzgerald
(O’Reilly, paperback, Kindle)

“Regular expressions are specially encoded text strings used as patterns for matching sets of strings,” Michael Fitzgerald writes in this example-rich new book that focuses on learning by doing.
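
As a small illustration of a pattern "matching sets of strings," here is a Python sketch. The pattern and the sample strings are invented for this review and do not come from the book; the pattern describes every string made of three digits, a hyphen, and four digits.

```python
import re

# Hypothetical pattern: three digits, a hyphen, then four digits.
PHONE = re.compile(r"\d{3}-\d{4}")

candidates = ["555-0199", "5550199", "555-01990", "abc-defg"]

# fullmatch() succeeds only when the whole string fits the pattern.
matches = [s for s in candidates if PHONE.fullmatch(s)]
print(matches)
```

Only the first candidate belongs to the set of strings the pattern describes; the others are missing the hyphen, have an extra digit, or contain letters.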

Veteran programmers who work with Perl, Java, JavaScript, C#, and a number of Unix utilities often consider regular expressions to be an important part of their toolkit. Ruby 1.9 and Python 3 also support regular expressions.

“Regular expressions have a reputation for being gnarly,” Fitzgerald notes. However, using the online Regexpal JavaScript regular expression tester, he shows you how to dive right into the very basics and start working your way up.

He introduces several other applications that let you work with regular expressions. And his chapters smoothly take you from matching single digits to matching text strings, number strings, boundaries such as the beginnings or endings of words, character classes, and beyond, including whitespace patterns and Unicode. He also shows how to perform some fairly esoteric operations such as “negative lookaheads,” where a match succeeds only if a certain pattern of text or digits does not come immediately after the current position in the string.
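
A negative lookahead is easier to see than to describe. The following Python sketch is an illustration written for this review, not an example from the book; the `COUNT` pattern and `plain_numbers` helper are hypothetical names. It finds whole numbers that are not followed by a percent sign.

```python
import re

# \b\d+\b   matches a whole run of digits.
# (?!\s*%)  is a negative lookahead: the match is rejected if optional
#           whitespace and a percent sign come next. It consumes no text.
COUNT = re.compile(r"\b\d+\b(?!\s*%)")

def plain_numbers(text: str) -> list:
    """Return whole numbers in text that are not written as percentages."""
    return COUNT.findall(text)

print(plain_numbers("5 apples, 20% off, 7 days"))
```

Here "5" and "7" are returned, while "20" is skipped because the lookahead sees the percent sign right after it.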

The 136-page book has ten chapters:

  1. What Is a Regular Expression?
  2. Simple Pattern Matching
  3. Boundaries
  4. Alternation, Groups, and Backreferences
  5. Character Classes
  6. Matching Unicode and Other Characters
  7. Quantifiers
  8. Lookarounds
  9. Marking Up a Document with HTML
  10. The End of the Beginning

An appendix provides a regular expression reference, listing such items as control characters, Unicode whitespace characters, metacharacters, and others. There is also a glossary of regular expression terms, such as “greedy match” and “zero-width assertions.”
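
Two of those glossary terms can be shown in a few lines of Python. This is a sketch written for this review under simple assumptions (the sample strings are invented), not material from the book.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy match: .+ grabs as much as it can, so one match spans the whole string.
greedy = re.findall(r"<.+>", html)

# Lazy (non-greedy) match: .+? grabs as little as it can, so each tag matches alone.
lazy = re.findall(r"<.+?>", html)

# Zero-width assertion: \b matches a position (a word boundary), not a character,
# so every match is empty and only its location is of interest.
boundaries = [m.start() for m in re.finditer(r"\b", "two words")]

print(greedy)
print(lazy)
print(boundaries)
```

The greedy pattern returns a single long match, the lazy pattern returns the four tags separately, and the word-boundary assertion reports positions 0, 3, 4, and 9 without consuming any text.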

Fitzgerald recommends his book for those who are “new to regular expressions or programming…the reader who has heard of regular expressions and is interested in them but who really doesn’t understand them yet.”

Those who are a bit beyond the beginner level, however, likewise can benefit from Introducing Regular Expressions and its handy examples and how-to summaries.

Si Dunn