Amazon.co.uk Review
The simple name hides an incredible amount of power when handling textual data with scripting languages such as Perl, Python, awk and more, and the programmer who can master regular expressions can master just about anything.
From the off it's necessary to congratulate author Jeffrey Friedl on doing a superb job of taking what can be a very complex subject and breaking it down into digestible chunks that almost anyone can understand.
From the basics of character and pattern matching through to the recognition of complex string patterns and multiple character replacements to "greedy" metacharacters and how to curb their appetite, this is about as comprehensive as it gets.
With a handful of later chapters devoted to the differences between scripting languages and the way in which they deal with regular expressions, and so many examples it'll make your eyes water, there's something here for everyone.
So, if you can examine a string like this "(\\.|[^"\\])*" and know what it does and how it does it, there's plenty of reference material in here for those odd moments when you need a refresher. If, however, you've no idea what the above means, and you need the ability to handle textual data, buy this book. Now!
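For readers in the second camp, here is a minimal Python sketch of what that string does when used as a regular expression: it matches a double-quoted string, allowing backslash-escaped characters (including escaped quotes) inside it. The function name `find_quoted` and the sample text are made up for illustration and do not come from the book.

```python
import re

# The pattern quoted in the review, written as a raw string so the
# backslashes reach the regex engine unchanged:
#   "            an opening double quote
#   ( \\.        a backslash followed by any one character (an escape)
#   | [^"\\] )*  or any character that is neither a quote nor a backslash
#   "            the closing double quote
QUOTED = re.compile(r'"(\\.|[^"\\])*"')


def find_quoted(text):
    """Return every double-quoted string found in text (hypothetical helper)."""
    return [m.group(0) for m in QUOTED.finditer(text)]


sample = r'say "hi" and "a \"nested\" quote" now'
for hit in find_quoted(sample):
    print(hit)
```

The escape alternative is what lets the match run past the escaped quotes in the second string instead of stopping at the first one.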
David Wller, Java Developers Journal Jan 2002
you need to know about the mysterious regex
Major Keary, Book News, 2002 No 5
James Lance, Provo Linux User Group, May 2002
Leo LaPorte, TechTV, July 16, 2002
Kevin Taylor, Northants Linux Users Group, August 2002
Huw Collingbourne, PC Plus, March 2003
Jim Secan, The Journal of the Tucson Computer Society, Jan 2003
Jason Menard, javaranch.com, March 2003
Product Description
Regular expressions are a powerful tool for manipulating text and data. If you don't use them yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. If you think you know all you need to know about regular expressions, this book is a stunning eye-opener.
With regular expressions, you can save yourself time and aggravation while dealing with documents, mail messages, log files -- you name it -- any type of text or data. For example, regular expressions can play a vital role in constructing a World Wide Web CGI script, which can involve text and data of all sorts.
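As a rough illustration of the log-file case, the following Python sketch pulls fields out of log lines with a single regular expression. The log format (date, time, level, message) and the names `LINE`, `parse_line`, and `errors` are assumptions invented for this example, not something the book prescribes.

```python
import re

# Assumed (hypothetical) log format: "<date> <time> <LEVEL> <message>"
LINE = re.compile(r'^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$')


def parse_line(line):
    """Split one log line into its fields, or return None if it does not fit."""
    m = LINE.match(line)
    if m is None:
        return None
    date, time, level, message = m.groups()
    return {"date": date, "time": time, "level": level, "message": message}


def errors(lines):
    """Return the messages of all lines whose level is ERROR."""
    parsed = (parse_line(line) for line in lines)
    return [p["message"] for p in parsed if p and p["level"] == "ERROR"]


log = [
    "2002-07-16 09:30:00 INFO service started",
    "2002-07-16 09:31:12 ERROR disk full",
    "not a log line",
]
print(errors(log))
```

Lines that do not fit the assumed format are skipped rather than raising an error, which is usually what you want when scanning messy real-world files.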
Regular expressions are not a tool in and of themselves, but are included as part of a larger utility. The classic example is grep. These days, regular expressions can be found everywhere, such as in:
- Scripting languages (including Perl, Tcl, awk, and Python)
- Editors (including Emacs, vi, and Nisus Writer)
- Programming environments (including Delphi and Visual C++)
While many of these tools originated on UNIX, they are now available for a wide variety of platforms, including DOS/Windows and MacOS, so you can use them in your home environment. Additionally, many favorite programming languages offer regular-expression libraries, so you can include support for them in your own programs, and yes, even applets.
There can be certain subtle, but valuable, ways to think when you're using regular expressions, and these can be taught. Jeffrey Friedl has spent years helping people on the Net understand and use regular expressions. In this book he leads you through the steps of knowing exactly how to craft a regular expression to get the job done.
Regular expressions are not used in a vacuum. In this book, a variety of tools are examined and used in an extensive array of examples, with a major focus on Perl. Perl is extremely well endowed with rich and expressive regular expressions. Yet what is power in the hands of an expert can be fraught with peril for the unwary. This book will help you navigate the minefield to becoming an expert.
From the Publisher
From the Author
My book is all about using regular expressions to access and modify text and data. If you use Perl, Python, Emacs, awk, vi, Tcl, grep, etc., you'll find immediate benefit. If you have access to these or other programs that provide regular expression support, you'll probably benefit even more, as the book will open up a whole new world of power to you.
The book's home page is:
http://enterprise.ic.gc.ca/~jfriedl/regex/
You'll find the introduction, table of contents, and index online, among other things (the errata is also there, but as of yet there are no major boo-boos found).
The response from readers so far has been extremely gratifying. If you get a chance to read it, I'd love to hear your thoughts!
About the Author
Excerpted from Mastering Regular Expressions by Jeffrey E.F. Friedl. Copyright © 1997. Reprinted by permission. All rights reserved.
Now that we have some background under our belt, let's delve into the mechanics of how a regex engine really goes about its work. Here we don't care much about the Shine and Finish of the previous chapter; this chapter is all about the engine and the drive train, the stuff that grease monkeys talk about in bars. We'll spend a fair amount of time under the hood, so expect to get a bit dirty with some practical hands-on experience.
Start Your Engines!
Let's see how much I can milk this engine analogy for. The whole point of having an engine is so that you can get from Point A to Point B without doing much work. The engine does the work for you so you can relax and enjoy the Rich Corinthian Leather. The engine's primary task is to turn the wheels, and how it does that isn't really a concern of yours. Or is it?
Two Kinds of Engines
Well, what if you had an electric car? They've been around for a long time, but they aren't as common as gas cars because they're hard to design well. If you had one, though, you would have to remember not to put gas in it. If you had a gasoline engine, well, watch out for sparks! An electric engine more or less just runs, but a gas engine might need some babysitting. You can get much better performance just by changing little things like your spark plug gaps, air filter, or brand of gas. Do it wrong and the engine's performance deteriorates, or, worse yet, it stalls.
Each engine might do its work differently, but the end result is that the wheels turn. You still have to steer properly if you want to get anywhere, but that's an entirely different issue.
New Standards
Let's stoke the fire by adding another variable: the California Emissions Standards.1 Some engines adhere to California's strict pollution standards, and some engines don't. These aren't really different kinds of engines, just new variations on what's already around. The standard regulates a result of the engine's work, the emissions, but doesn't say one thing or the other about how the engine should go about achieving those cleaner results. So, our two classes of engine are divided into four types: electric (adhering and non-adhering) and gasoline (adhering and non-adhering).
1 California has rather strict standards regulating the amount of pollution a car can produce. Because of this, many cars sold in America come in "for California" and "non-California" models.
Come to think of it, I bet that an electric engine can qualify for the standard without much change, so it's not really impacted very much -- the standard just "blesses" the clean results that are already par for the course. The gas engine, on the other hand, needs some major tweaking and a bit of re-tooling before it can qualify. Owners of this kind of engine need to take particular care with what they feed it -- use the wrong kind of gas and you're in big trouble in more ways than one.
The impact of standards
Better pollution standards are a good thing, but they require that the driver exercise more thought and foresight (well, at least for gas engines, as I noted in the previous paragraph). Frankly, however, the standard doesn't impact most people since all the other states still do their own thing and don't follow California's standard... yet. It's probably just a matter of time.
Okay, so you realize that these four types of engines can be classified into three groups (the two kinds for gas, and electric in general). You know about the differences, and that in the end they all still turn the wheels. What you don't know is what the heck this has to do with regular expressions!
More than you might imagine.
Regex Engine Types
There are two fundamentally different types of regex engines: one called "DFA" (the electric engine of our story) and one called "NFA" (the gas engine). The details follow shortly (=>101), but for now just consider them names, like Bill and Ted. Or electric and gas.
Both engine types have been around for a long time, but like its gasoline counterpart, the NFA type seems to be used more often. Tools that use an NFA engine include Tcl, Perl, Python, GNU Emacs, ed, sed, vi, most versions of grep, and even a few versions of egrep and awk. On the other hand, a DFA engine is found in almost all versions of egrep and awk, as well as lex and flex. Table 4-1 lists a few common programs available for a wide variety of platforms and the regex engine that most versions use. A generic version means that it's an old tool with many clones -- I have listed notably different clones that I'm aware of. Where I could find them, I used comments in the source code to identify the author (or, for the generic tools, the original author). I relied heavily on Libes and Ressler's Life With Unix (Prentice Hall, 1989) to fill in the gaps.
As Chapter 3 illustrated, 20 years of development with both DFAs and NFAs resulted in a lot of needless variety. Things were dirty. The POSIX standard came in to clean things up by specifying clearly which metacharacters an engine should support, and exactly the results you could expect from them. Superficial details aside, the DFAs (our electric engines) were already well suited to adhere to the standard, but the kind of results an NFA traditionally provided were quite different from the new standard, so changes were needed. As a result, broadly speaking, we have three types of regex engines:
· DFA (POSIX or not -- similar either way)
· Traditional NFA
· POSIX NFA
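To make the difference in results a little more concrete, here is a small Python sketch. It rests on two assumptions that this excerpt does not spell out: that Python's `re` module behaves like a Traditional NFA (it tries alternatives in order and keeps the first that works), and that the POSIX-style result is the longest match starting at the leftmost position. The function `leftmost_longest` is a hypothetical stand-in that only handles literal alternatives at the start of the text; it is not a DFA, just an emulation of the result a DFA or POSIX NFA would be expected to report.

```python
import re


def traditional_nfa(alternatives, text):
    """Match at the start of text using Python's backtracking engine."""
    pattern = "|".join(re.escape(a) for a in alternatives)
    m = re.match(pattern, text)
    return m.group(0) if m else None


def leftmost_longest(alternatives, text):
    """Emulate the longest-match rule for literal alternatives (hypothetical)."""
    hits = [a for a in alternatives if text.startswith(a)]
    return max(hits, key=len) if hits else None


alts = ["tour", "to", "tournament"]
print(traditional_nfa(alts, "tournament"))   # first alternative that matches
print(leftmost_longest(alts, "tournament"))  # longest alternative that matches
```

With these inputs the first call reports `tour` and the second reports `tournament`: the same expression and the same text, yet a different answer depending on the kind of engine under the hood.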