Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools (Nutshell Handbook)
 
 

Mastering Regular Expressions: Powerful Techniques for Perl and Other Tools (Nutshell Handbook) [Paperback]

Jeffrey E.F. Friedl
4.8 out of 5 stars (31 customer reviews)


Product Description

Amazon.co.uk Review

Regular expressions--it sounds fairly ordinary in a regular sort of way, so therefore it must be very simple and very straightforward, right? Not quite.

The simple name hides an incredible amount of power when handling textual data with scripting languages such as Perl, Python, awk and more, and the programmer who can master regular expressions can master just about anything.

From the off, it's necessary to congratulate author Jeffrey Friedl on doing a superb job of taking what can be a very complex subject and breaking it down into digestible chunks that almost anyone can understand.

From the basics of character and pattern matching through to the recognition of complex string patterns and multiple character replacements to "greedy" metacharacters and how to curb their appetite, this is about as comprehensive as it gets.

With a handful of later chapters devoted to the differences between scripting languages and the way in which they deal with regular expressions, and so many examples it'll make your eyes water, there's something here for everyone.

So, if you can examine a string like this "(\\.|[^"\\])*" and know what it does and how it does it there's plenty of reference material in here for those odd moments when you need a refresher. If, however, you've no idea what the above means, and you need the ability to handle textual data, buy this book. Now!
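For readers in the second camp, here is a minimal sketch of what the quoted pattern does, using Python's re module. The sample text and variable names are assumptions made for illustration and are not from the book. The pattern matches a double-quoted string in which a backslash escapes the character that follows it.

```python
import re

# The pattern quoted in the review: an opening quote, then any number of
# "escaped character" or "ordinary character" units, then a closing quote.
pattern = re.compile(r'"(\\.|[^"\\])*"')

# Hypothetical sample text containing two quoted strings, the first of
# which has escaped quotes inside it.
sample = r'print "a \"quoted\" word", then "second"'

# finditer is used instead of findall so that the whole match is returned
# rather than only the contents of the parenthesised group.
matches = [m.group(0) for m in pattern.finditer(sample)]

for found in matches:
    print(found)
```

The first alternative, `\\.`, consumes a backslash together with whatever follows it, which is why an escaped quote does not end the match early.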

David Wller, Java Developers Journal Jan 2002

This book contains everything you need to know about the mysterious regex.

Major Keary, Book News, 2002 No 5

As the only reference to the art, it should be on the bookshelf of every programmer and anyone who works with large text files.

James Lance, Provo Linux User Group, May 2002

If you have ever used a regular expression or ever wanted to use one, this is the book for you!

Leo LaPorte, TechTV, July 16, 2002

There's no better way to learn how to use regular expressions than with Jeffrey Friedl's fine book. --This text refers to an out of print or unavailable edition of this title.

Kevin Taylor, Northants Linux Users Group, August 2002

The author really knows his subject and presents what could have been quite a dry subject in a very readable way.

Huw Collingbourne, PC Plus, March 2003

This is a valuable book for anyone who needs to master regular expressions.

Jim Secan, The Journal of the Tucson Computer Society, Jan 2003

If you use tools like grep, perl, & procmail daily, this book is a good investment both in the money spent and the time working your way through it.

Jason Menard, javaranch.com, March 2003

The author does an outstanding job leading the reader from regex novice to master.

Product Description

Regular expressions are a powerful tool for manipulating text and data. If you don't use them yet, you will discover in this book a whole new world of mastery over your data. If you already use them, you'll appreciate this book's unprecedented detail and breadth of coverage. If you think you know all you need to know about regular expressions, this book is a stunning eye-opener.

With regular expressions, you can save yourself time and aggravation while dealing with documents, mail messages, log files -- you name it -- any type of text or data. For example, regular expressions can play a vital role in constructing a World Wide Web CGI script, which can involve text and data of all sorts.

Regular expressions are not a tool in and of themselves, but are included as part of a larger utility. The classic example is grep. These days, regular expressions can be found everywhere, such as in:

  • Scripting languages (including Perl, Tcl, awk, and Python)
  • Editors (including Emacs, vi, and Nisus Writer)
  • Programming environments (including Delphi and Visual C++)

While many of these tools originated on UNIX, they are now available for a wide variety of platforms, including DOS/Windows and MacOS, so you can use them in your home environment. Additionally, many favorite programming languages offer regular-expression libraries, so you can include support for them in your own programs, and yes, even applets.

There can be certain subtle, but valuable, ways to think when you're using regular expressions, and these can be taught. Jeffrey Friedl has spent years helping people on the Net understand and use regular expressions. In this book he leads you through the steps of knowing exactly how to craft a regular expression to get the job done.

Regular expressions are not used in a vacuum. In this book, a variety of tools are examined and used in an extensive array of examples, with a major focus on Perl. Perl is extremely well endowed with rich and expressive regular expressions. Yet what is power in the hands of an expert can be fraught with peril for the unwary. This book will help you navigate the minefield to becoming an expert.

From the Publisher

Regular expressions, a powerful tool for manipulating text and data, are found in scripting languages, editors, programming environments, and specialized tools. In this book, author Jeffrey Friedl leads you through the steps of crafting a regular expression that gets the job done. He examines a variety of tools and uses them in an extensive array of examples, with a major focus on Perl.

From the Author

For everyone from newbie to expert: control your data
My book is all about using regular expressions to access and modify text and data. If you use Perl, Python, Emacs, awk, vi, Tcl, grep, etc., you'll find immediate benefit. If you have access to these or other programs that provide regular expression support, you'll probably benefit even more, as the book will open up a whole new world of power to you.

The book's home page is:
http://enterprise.ic.gc.ca/~jfriedl/regex/

You'll find the introduction, table of contents, and index online, among other things (the errata are also there, but as yet no major boo-boos have been found).

The response from readers so far has been extremely gratifying. If you get a chance to read it, I'd love to hear your thoughts!

About the Author

Jeffrey Friedl was raised in the countryside of Rootstown, Ohio, and had aspirations of being an astronomer until one day noticing a TRS-80 Model I sitting unused in the corner of the chem lab (bristling with a full 16k RAM, no less). He eventually began using UNIX (and regular expressions) in 1980. With degrees in computer science from Kent (B.S.) and the University of New Hampshire (M.S.), he is now an engineer with Omron Corporation, Kyoto, Japan. He lives in Nagaokakyou-city with Tubby, his long-time friend and Teddy Bear, in a tiny apartment designed for a (Japanese) family of three. Jeffrey applies his regular-expression know-how to make the world a safer place for those not bilingual in English and Japanese. He built and maintains the World Wide Web Japanese-English dictionary server, http://www.itc.omron.com/cgi-bin/j-e, and is active in a variety of language-related projects, both in print and on the Web. When faced with the daunting task of filling his copious free time, Jeffrey enjoys riding through the mountainous countryside of Japan on his Honda CB-1. At the age of 30, he finally decided to put his 6'4" height to some use, and joined the Omron company basketball team. While finalizing the manuscript for Mastering Regular Expressions, he took time out to appear in his first game, scoring five points in nine minutes of play, which he feels is pretty darn good for a geek. When visiting his family in The States, Jeffrey enjoys dancing a two-step with his mom, binking old coins with his dad, and playing schoffkopf with his brothers and sisters.

Excerpted from Mastering Regular Expressions by Jeffrey E.F. Friedl. Copyright © 1997. Reprinted by permission. All rights reserved.

Chapter 4 - The Mechanics of Expression Processing

Now that we have some background under our belt, let's delve into the mechanics of how a regex engine really goes about its work. Here we don't care much about the Shine and Finish of the previous chapter; this chapter is all about the engine and the drive train, the stuff that grease monkeys talk about in bars. We'll spend a fair amount of time under the hood, so expect to get a bit dirty with some practical hands-on experience.

Start Your Engines!
Let's see how much I can milk this engine analogy for. The whole point of having an engine is so that you can get from Point A to Point B without doing much work. The engine does the work for you so you can relax and enjoy the Rich Corinthian Leather. The engine's primary task is to turn the wheels, and how it does that isn't really a concern of yours. Or is it?

Two Kinds of Engines
Well, what if you had an electric car? They've been around for a long time, but they aren't as common as gas cars because they're hard to design well. If you had one, though, you would have to remember not to put gas in it. If you had a gasoline engine, well, watch out for sparks! An electric engine more or less just runs, but a gas engine might need some babysitting. You can get much better performance just by changing little things like your spark plug gaps, air filter, or brand of gas. Do it wrong and the engine's performance deteriorates, or, worse yet, it stalls.

Each engine might do its work differently, but the end result is that the wheels turn. You still have to steer properly if you want to get anywhere, but that's an entirely different issue.

New Standards
Let's stoke the fire by adding another variable: the California Emissions Standards [1]. Some engines adhere to California's strict pollution standards, and some engines don't. These aren't really different kinds of engines, just new variations on what's already around. The standard regulates a result of the engine's work, the emissions, but doesn't say one thing or the other about how the engine should go about achieving those cleaner results. So, our two classes of engine are divided into four types: electric (adhering and non-adhering) and gasoline (adhering and non-adhering).

[1] California has rather strict standards regulating the amount of pollution a car can produce. Because of this, many cars sold in America come in "for California" and "non-California" models.

Come to think of it, I bet that an electric engine can qualify for the standard without much change, so it's not really impacted very much -- the standard just "blesses" the clean results that are already par for the course. The gas engine, on the other hand, needs some major tweaking and a bit of re-tooling before it can qualify. Owners of this kind of engine need to pay particular care to what they feed it -- use the wrong kind of gas and you're in big trouble in more ways than one.

The impact of standards
Better pollution standards are a good thing, but they require that the driver exercise more thought and foresight (well, at least for gas engines, as I noted in the previous paragraph). Frankly, however, the standard doesn't impact most people since all the other states still do their own thing and don't follow California's standard... yet. It's probably just a matter of time.

Okay, so you realize that these four types of engines can be classified into three groups (the two kinds for gas, and electric in general). You know about the differences, and that in the end they all still turn the wheels. What you don't know is what the heck this has to do with regular expressions!

More than you might imagine.

Regex Engine Types
There are two fundamentally different types of regex engines: one called "DFA" (the electric engine of our story) and one called "NFA" (the gas engine). The details follow shortly, but for now just consider them names, like Bill and Ted. Or electric and gas.

Both engine types have been around for a long time, but like its gasoline counterpart, the NFA type seems to be used more often. Tools that use an NFA engine include Tcl, Perl, Python, GNU Emacs, ed, sed, vi, most versions of grep, and even a few versions of egrep and awk. On the other hand, a DFA engine is found in almost all versions of egrep and awk, as well as lex and flex. Table 4-1 lists a few common programs available for a wide variety of platforms and the regex engine that most versions use. A generic version means that it's an old tool with many clones -- I have listed notably different clones that I'm aware of. Where I could find them, I used comments in the source code to identify the author (or, for the generic tools, the original author). I relied heavily on Libes and Ressler's Life With Unix (Prentice Hall, 1989) to fill in the gaps.

As Chapter 3 illustrated, 20 years of development with both DFAs and NFAs resulted in a lot of needless variety. Things were dirty. The POSIX standard came in to clean things up by specifying clearly which metacharacters an engine should support, and exactly the results you could expect from them. Superficial details aside, the DFAs (our electric engines) were already well suited to adhere to the standard, but the kind of results an NFA traditionally provided were quite different from the new standard, so changes were needed. As a result, broadly speaking, we have three types of regex engines:

  • DFA (POSIX or not -- similar either way)
  • Traditional NFA
  • POSIX NFA
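
One visible consequence of these engine types is which text gets matched when several alternatives could apply. The sketch below is only an illustration, not part of the excerpt. It uses Python's re module, a backtracking engine (the text above lists Python among the NFA tools), to show the Traditional NFA result. The helper `leftmost_longest` is hypothetical: it merely emulates the "longest match" result that a DFA or POSIX NFA is expected to report by trying each alternative separately, and it is not how such engines are actually built.

```python
import re

text = "tournament"
alternatives = ["tour", "to", "tournament"]

# Traditional NFA behaviour: alternatives are tried in the order written,
# and the first one that lets the overall match succeed is reported.
nfa_result = re.match("|".join(alternatives), text).group(0)


def leftmost_longest(alts, subject):
    """Hypothetical helper: emulate a longest-match result by matching each
    alternative at the start of the subject and keeping the longest one."""
    candidates = []
    for alt in alts:
        m = re.match(alt, subject)
        if m:
            candidates.append(m.group(0))
    return max(candidates, key=len) if candidates else None


longest_result = leftmost_longest(alternatives, text)

print("Traditional NFA style:", nfa_result)
print("Longest-match style:  ", longest_result)
```

With a Traditional NFA the order in which alternatives are written affects the result, whereas under the longest-match rule it does not. This is the kind of "what you feed it" sensitivity that the gas engine analogy points to.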
