Perl & LWP [Paperback]

Sean M. Burke
5.0 out of 5 stars (3 customer reviews)
Price: 25.99

Book Description

Publication date: 30 Jun 2002 | ISBN-10: 0596001789 | ISBN-13: 978-0596001780 | Edition: 1

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages.

The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion
Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, Altavista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.


Product details

  • Paperback: 262 pages
  • Publisher: O'Reilly Media; 1 edition (30 Jun 2002)
  • Language: English
  • ISBN-10: 0596001789
  • ISBN-13: 978-0596001780
  • Product Dimensions: 23.4 x 18 x 1.7 cm
  • Average Customer Review: 5.0 out of 5 stars (3 customer reviews)
  • Amazon Bestsellers Rank: 593,192 in Books


Product Description

Amazon Review

Perl and LWP explains how to write programs that browse the Web, using the excellent Library for the World Wide Web or LWP. It is aimed at developers who already know both Perl and HTML, although you don't need to be an expert in either.

The fascination of this topic is that it makes you see the Web in a different way, not as a set of pages for users to browse, but as a huge database for your programs to explore. The most robust technique for querying Web sites programmatically is through XML Web Services, but this approach is in its infancy. LWP takes a different route, called screen-scraping. In essence, your Perl code pretends to be a browser and grabs HTML for processing. Using LWP you could write a command-line program to search your favourite auction site, fetch news headlines, or check multiple retail sites for the best prices. As the author acknowledges, the problem with screen-scraping is its brittleness: if the target Web site adopts a new look, it breaks your code. There are also interesting fair usage issues. Even so, it's a powerful technique with many possible applications.

This clear and concise guide comes complete with typically terse Perl code examples. Topics include LWP basics, posting form data, processing results with regular expressions, using trees to process HTML, imitating different browser types, and supporting cookies programmatically. An appendix offers handy information like HTTP status codes, character tables, and MIME types. LWP is large, but while this title does not attempt to cover all the modules, it does provide all you need to start coding your own Web-mining programs. --Tim Anderson
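
To make the screen-scraping idea in the review above concrete, here is a minimal sketch of the "grab HTML, then pull out the interesting bits" step. It is written in Python with only the standard library, not in Perl with LWP, so it is not code from the book. The sample page, the `LinkExtractor` class, the `extract_links` helper, and the example.com URLs are all hypothetical, and the HTML is held in a string instead of being fetched over HTTP.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page content; a real program would first fetch this over HTTP.
SAMPLE_HTML = """
<html><body>
  <h1>Headlines</h1>
  <a href="/news/1">First story</a>
  <a href="http://example.org/news/2">Second story</a>
  <a name="top">Not a link</a>
</body></html>
"""


class LinkExtractor(HTMLParser):
    """Collect (absolute URL, link text) pairs from <a href=...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # URL of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                # Resolve relative links against the page's base URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Parse an HTML string and return the links found in it."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for url, text in extract_links(SAMPLE_HTML, "http://example.com/"):
        print(url, text)
```

The brittleness the reviewer mentions shows up here too: the extractor depends on the page using plain `<a href>` tags, so a redesign that changes the markup would require changing the parser.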

Review

Salted with plenty of examples, the book covers the whole process of navigating HTTP, downloading content, and parsing it into something usable. -- Rick Wayne, Software Development, September 2002

Solid, no-nonsense book that will teach you how to do screen-scraping using Perl. -- MIR, slashdot.org, August 19, 2002

The indispensable guide to learning LWP and using it effectively. -- Netsurfer Digest, Feb 14, 2003

Most Helpful Customer Reviews
6 of 6 people found the following review helpful
5.0 out of 5 stars Fabulous book! 31 Aug 2002
Format:Paperback
This book is a comprehensive and authoritative guide to web automation. It reads as both a gentle tutorial and a well organized reference. Basic HTTP operation, regexp HTML parsing, tokenizing, cookie authentication, form handling, and robot spidering are covered extensively in numerous case studies and practical examples.
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
A particularly satisfying thing to me is the sane way of working, that the author assumes. So many people seem to just bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent this.
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.
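
The robots.txt check that this review singles out can be sketched in a few lines. This is an illustration in Python's standard library, not the book's Perl code (the Amazon.com review on this page names `WWW::RobotRules` as the Perl module the book covers). The rules text, the `allowed` helper, the `ExampleBot` agent name, and the URLs are all hypothetical, and the rules are parsed from a string so no network access is needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real spider would download this
# from the target site before requesting any other page.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given rules."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/private/data.html"):
        print(url, allowed(ROBOTS_TXT, "ExampleBot", url))
```

A polite spider would run a check like this before every request and skip any URL for which it returns false.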
8 of 9 people found the following review helpful
5.0 out of 5 stars Does exactly what it says on the tin! 11 Aug 2002
Format:Paperback
A satisfyingly short (242p) book that covers its subject perfectly. The examples are well written and explained and are ideal for using as a starting point for your own work. Within 15 minutes I had written a script to fetch pages of football results from a web site, process the data and produce files for uploading to my database. Previously I did this by downloading the HTML and editing it by hand - automating it will save me about 30 minutes a week.
Of course it's an O'Reilly title so the attractive layout, typography and attention to detail go without saying. I would heartily recommend this book to anyone who wants to automate the extraction of data from the web - follow Sean's guidance and you'll be productive sooner than you thought possible.
Most Helpful Customer Reviews on Amazon.com (beta)
Amazon.com: 4.6 out of 5 stars  12 reviews
18 of 18 people found the following review helpful
5.0 out of 5 stars Excellent coverage of LWP, packed full of useful examples 16 July 2002
By Amazon Customer - Published on Amazon.com
Format:Paperback
I was definitely interested when I first heard that O'Reilly were publishing a book on LWP. LWP is a definitive collection of perl modules covering everything you could think of doing with URIs, HTML, and HTTP. While 'web services' are the buzzword friendly technology of the day, sometimes you need to roll your sleeves up and get a bit dirty scraping screens and hacking at HTML. For such a deep subject, this book weighs in at a slim 242 pages. This is a very good thing. I'm far too busy to read these massive shelf-destroying tomes that seem to be churned out recently.
It covers everything you need to know with concise examples, which is what makes this book really shine. You start with the basics using LWP::Simple through to more advanced topics using LWP::UserAgent, HTTP::Cookies, and WWW::RobotRules. Sean shows finger saving tips and shortcuts that take you more than a couple notches above what you can learn from the lwpcook manpage, with enough depth to satisfy somebody who is an experienced LWP hacker.

This book is a great reference: just flick through and you'll find a relevant chapter with an example to save the day. Chapters include filling in forms and extracting data from HTML using regular expressions, then more advanced topics using HTML::TokeParser, and then my preferred tool, the author's own HTML::TreeBuilder. The book ends with a chapter on spidering, with excellent coverage of design and warnings to get you started on your web trawling.
15 of 15 people found the following review helpful
5.0 out of 5 stars This book can teach you expert-level web scraping/munging. 12 July 2003
By wickline - Published on Amazon.com
Format:Paperback
If you aren't yet comfortable using object-oriented Perl modules, the multitude of examples will at least allow you to see how it's done even if you're a bit fuzzy on what's happening 'underneath' when you call object methods. If you're comfortable learning how to do something without knowing exactly why it works, then the author's clear step-by-step explanations and numerous progressively more powerful examples should make this book accessible even to relatively inexperienced Perl programmers.
More experienced programmers will understand better why things work, but any Perl programmer will set this book down feeling empowered to turn the web into their own valet. No longer do you need to check multiple sites looking for interesting information. Instead, you can readily author code to do that for you and alert you when items of interest are found. You can use these tools to free up personal time, to harvest information to inform business decisions, to automate tedious web application testing, and a zillion other things.
The author's clear exploration of the relevant Perl modules leaves the reader with a good depth of understanding of what these modules do, when you might want to use which module, and how to use them for real world tasks. Before reading the book, I knew of these modules, but they were a rather intimidating pile. I'd used a few of them on occasion for rather limited projects, but was reluctant to invest the time required to read all of the documentation from the whole collection. Mountains of method-level documentation do not a tutorial make. This book takes all of that information, selects the most important parts, and ensures that those parts are covered in progressively more powerful and/or flexible examples.
If you know Perl and you're sick of 'working the web' to get information and you want the web to work for you instead, then you need this book. I had a personal project that was on the back burner for a couple of years because it just sounded too hard. The weekend after I finished this book, I wrote what I had previously thought to be the hard part of that project and it was both easy and fun. This book makes hard things not just possible, but actually easy.
-matt
14 of 14 people found the following review helpful
5.0 out of 5 stars Very Informative and useful 8 Aug 2002
By "sherzodr" - Published on Amazon.com
Format:Paperback
As a web programmer, I had worked on several projects involving web automation and simple crawlers even before I read "Perl & LWP". This was the first book I've read on the subject, and I'm by no means disappointed. The book is very well organized, very informative and hits the nail on the head. I am pleased.
I noticed some inaccuracies in the discussions, and some chopped-off paragraphs and sentences. But this doesn't affect the usability of the book much. Author Sean Burke does a great job of walking the reader through most aspects of web automation and data extraction on the web using Perl and LWP (libwww in Perl).
The code in the book is very well organized, well written and easily debuggable. The steps are pretty consistent across all the examples:
a) Inspect the HTML source code of the page;
b) Determine the tokens and patterns of interest;
c) Write the first code;
d) Fine tune the code;
As usual, I'll be commenting on individual chapters to give you an idea of the coverage of the book in more detail...
9 of 9 people found the following review helpful
5.0 out of 5 stars Fabulous book! 31 Aug 2002
By K. Boggs - Published on Amazon.com
Format:Paperback
This book is a comprehensive and authoritative guide to web automation. It reads as both a gentle tutorial and a well organized reference. Basic HTTP operation, regexp HTML parsing, tokenizing, cookie authentication, form handling, and robot spidering are covered extensively in numerous case studies and practical examples.
Naturally, I was impressed by the simple, consistent treatment of examples: inspect source and find the interesting bits, code things up and then enhance to suit. :-)
A particularly satisfying thing to me is the sane way of working, that the author assumes. So many people seem to just bungle their way through web programming while ignoring basics like the robots.txt file. This book helps to prevent this.
One would think that only a thick tome would be sufficient to cover such vast territory, but the author (who is an active LWP module developer) does a fabulous job covering this extensive subject matter.
I recommend this book both to anyone starting out on their way to working with the underside of the web and to accomplished professionals in need of a full reference manual.
8 of 8 people found the following review helpful
5.0 out of 5 stars A must-read for exploiting the web in a GOOD way. 11 July 2002
By Kevin Healy - Published on Amazon.com
Format:Paperback
A great book for anyone who wishes to automate daily tasks on the web. Sean does an outstanding job of showing how Perl can be used to extract and manipulate not just data but useful information efficiently from the web's vast data resources. I've already adapted an example from this book (link-checking spider) for sites I maintain. Yes, I've known of the LWP module prior to this book. But as a lazy programmer, I rely on others to show me the way. Sean does just that...