Customer Reviews

5.0 out of 5 stars


1 of 1 people found the following review helpful
This is a brilliant book.

The book has a few niggles in that I spotted some flaws in the code. The book and code download could do with some JUnit tests!

But I guess the author can be forgiven for this, because of the usefulness, clarity, commenting and detailed coverage of the code that he develops.

When all is said and done, this is a recipe book of examples.

There is a complete implementation of an HTML parser. I was a bit surprised to see the book wasn't using JTidy or NekoHTML here...

By the time you're done you'll have a great appreciation of HTTP and of tools like WebShark that help you create a "bot" (i.e. a bespoke screen scraper designed to extract data from specific sites), as well as the more generic "spider".

You'll find this book a great resource for learning about concurrent programming in Java as you work through the code that makes up the highly configurable Heaton Research Spider. There are various implementations here: an in-memory version for a single-host site, and a couple of SQL-based ones for MySQL/Oracle.

The book also shows how to call into the Google search APIs to create what it calls a "hybrid bot". You could then use this to set up the seed data for your spider. (Setting up this seed data is where you are left to your own devices, and perhaps where the book could have been expanded slightly.)

You'll also get a bit of exposure to Axis web services and RSS feeds along the way.

I'd thoroughly recommend this to anyone wanting to learn more about harvesting information from the web and to expand their knowledge of Java, HTTP, and multi-threading/concurrency.