Product details
The fascination of this topic is that it makes you see the Web in a different way, not as a set of pages for users to browse, but as a huge database for your programs to explore. The most robust technique for querying Web sites programmatically is through XML Web Services, but this approach is in its infancy. LWP takes a different route, called screen-scraping. In essence, your Perl code pretends to be a browser and grabs HTML for processing. Using LWP you could write a command-line program to search your favourite auction site, fetch news headlines, or check multiple retail sites for the best prices. As the author acknowledges, the problem with screen-scraping is its brittleness: if the target Web site adopts a new look, it breaks your code. There are also interesting fair usage issues. Even so, it's a powerful technique with many possible applications. This clear and concise guide comes complete with typically terse Perl code examples. Topics include LWP basics, posting form data, processing results with regular expressions, using trees to process HTML, imitating different browser types, and supporting cookies programmatically. An appendix offers handy information like HTTP status codes, character tables, and MIME types. LWP is large, but while this title does not attempt to cover all the modules, it does provide all you need to start coding your own Web-mining programs.--Tim Anderson
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
What I found particularly satisfying is the sane way of working that the author assumes. So many people seem to bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent that.
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.
Of course, it's an O'Reilly title, so the attractive layout, typography and attention to detail go without saying. I would heartily recommend this book to anyone who wants to automate the extraction of data from the web: follow Sean's guidance and you'll be productive sooner than you thought possible.