Perl & LWP Paperback – 30 Jun 2002
Perl and LWP explains how to write programs that browse the Web, using the excellent Library for WWW in Perl, or LWP. It is aimed at developers who already know both Perl and HTML, although you don't need to be an expert in either.
The fascination of this topic is that it makes you see the Web in a different way, not as a set of pages for users to browse, but as a huge database for your programs to explore. The most robust technique for querying Web sites programmatically is through XML Web Services, but this approach is in its infancy. LWP takes a different route, called screen-scraping. In essence, your Perl code pretends to be a browser and grabs HTML for processing. Using LWP you could write a command-line program to search your favourite auction site, fetch news headlines, or check multiple retail sites for the best prices. As the author acknowledges, the problem with screen-scraping is its brittleness: if the target Web site adopts a new look, it breaks your code. There are also interesting fair usage issues. Even so, it's a powerful technique with many possible applications. This clear and concise guide comes complete with typically terse Perl code examples. Topics include LWP basics, posting form data, processing results with regular expressions, using trees to process HTML, imitating different browser types, and supporting cookies programmatically. An appendix offers handy information like HTTP status codes, character tables, and MIME types. LWP is large, but while this title does not attempt to cover all the modules, it does provide all you need to start coding your own Web-mining programs.--Tim Anderson
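To make the screen-scraping idea concrete, here is a minimal sketch in Python rather than the book's Perl and LWP. It parses an inline HTML string instead of fetching a live page. The markup, the `headline` class name, and the `HeadlineParser` and `extract_headlines` names are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a downloaded news page.
SAMPLE_HTML = """
<html><body>
  <h2 class="headline">First story</h2>
  <p>Some text</p>
  <h2 class="headline">Second story</h2>
  <h2 class="other">Not a headline</h2>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Collect the text of every <h2 class="headline"> element."""

    def __init__(self):
        super().__init__()
        self.headlines = []
        self._in_headline = False

    def handle_starttag(self, tag, attrs):
        # Start a new headline only for the (assumed) class name.
        if tag == "h2" and dict(attrs).get("class") == "headline":
            self._in_headline = True
            self.headlines.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_headline = False

    def handle_data(self, data):
        # Accumulate text while inside a matching <h2>.
        if self._in_headline:
            self.headlines[-1] += data


def extract_headlines(html):
    """Return the stripped text of each matching headline."""
    parser = HeadlineParser()
    parser.feed(html)
    parser.close()
    return [h.strip() for h in parser.headlines]


print(extract_headlines(SAMPLE_HTML))
```

The brittleness the review mentions shows up directly: if the site renamed the `headline` class, this function would return an empty list and raise no error.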
Salted with plenty of examples, the book covers the whole process of navigating HTTP, downloading content, and parsing it into something usable. -- Rick Wayne, Software Development, September 2002
Solid, no-nonsense book that will teach you how to do screen-scraping using Perl. -- MIR, slashdot.org, August 19, 2002
The indispensable guide to learning LWP and using it effectively. -- Netsurfer Digest, Feb 14, 2003
This volume on Perl and LWP covers topics including: understanding LWP and its design; fetching and analyzing URLs; extracting information from HTML using regular expressions and tokens; working with the structure of HTML documents using trees; setting and inspecting HTTP headers and response codes; managing cookies; accessing information that requires authentication; extracting links; cooperating with proxy caches; and writing Web spiders (also known as robots) in a safe fashion. It also includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from Websites such as BBC News, Altavista, ABEBooks.com, and the Weather Underground, are explained in detail. The guide shows how to make Web requests, submit forms, and even provide authentication information, and it demonstrates using regular expressions, tokens, and trees to parse HTML.
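Two of the topics listed above, extracting links and writing spiders in a safe fashion, can be sketched together. The example below is in Python, not the book's Perl, and is not taken from the book. The `example.com` URL, the robots.txt rules, the page markup, the `ExampleBot` agent name, and the `LinkParser` and `allowed_links` names are all assumptions made for illustration. Nothing is fetched over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Hypothetical inputs standing in for a fetched page and its robots.txt.
BASE_URL = "http://example.com/"
ROBOTS_TXT = """User-agent: *
Disallow: /private/
"""
PAGE_HTML = (
    '<a href="/news/today.html">News</a> '
    '<a href="/private/admin.html">Admin</a> '
    '<a name="top">No href here</a>'
)


class LinkParser(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's base URL.
                self.links.append(urljoin(self.base_url, href))


def allowed_links(html, base_url, robots_txt, agent="ExampleBot"):
    """Return the links on the page that robots.txt lets `agent` fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    parser = LinkParser(base_url)
    parser.feed(html)
    parser.close()
    return [url for url in parser.links if rules.can_fetch(agent, url)]


print(allowed_links(PAGE_HTML, BASE_URL, ROBOTS_TXT))
```

A real spider would also need request throttling and error handling, which this sketch leaves out.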
From the Publisher
The LWP (Library for WWW in Perl) suite of modules lets your programs download and extract information from the Web. Perl & LWP shows how to make web requests, submit forms, and even provide authentication information, and it demonstrates using regular expressions, tokens, and trees to parse HTML. This book is a must-have for Perl programmers who want to automate and mine the Web.
About the Author
Sean Burke is an active member in the Perl community and one of CPAN's most prolific module authors. He has been a columnist for The Perl Journal since 1998, and is an authority on markup languages. Trained as a linguist, he also develops tools for software internationalization and Native language preservation.
4 customer reviews
The content starts excellently showing how to use LWP to retrieve internet content.
However, hardly any of the examples work anymore and there is no real update to this book, which can be very annoying.
Web development has moved on significantly, and while this book may once have been excellent, in today's world it's not all that.
I would steer clear. The content needs a major revamp, and then some. It has the potential to be a great book again; all I can say is that it's a shame.
There must be other books you could buy instead.
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
A particularly satisfying thing to me is the sane way of working that the author assumes. So many people seem to just bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent this.
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.
Of course, it's an O'Reilly title, so the attractive layout, typography and attention to detail go without saying. I would heartily recommend this book to anyone who wants to automate the extraction of data from the web - follow Sean's guidance and you'll be productive sooner than you thought possible.
Most helpful customer reviews on Amazon.com
It is worth noting that in 2007, the book's author, Sean Burke, published the text of the book on his personal website at [...]. If you're thinking of purchasing the Kindle edition of this book (like I ended up doing), you may be better off using his site. Clearly, if you want a physical copy of the book Amazon is still a great way to go.