Hadoop: The Definitive Guide Paperback – 15 Oct 2010


There is a newer edition of this item:

Hadoop: The Definitive Guide
This title has not yet been released.

Product details

  • Paperback: 628 pages
  • Publisher: Yahoo Press; 2 edition (15 Oct. 2010)
  • Language: English
  • ISBN-10: 1449389732
  • ISBN-13: 978-1449389734
  • Product Dimensions: 17.8 x 3.8 x 23.3 cm
  • Average Customer Review: 4.0 out of 5 stars (4 customer reviews)
  • Amazon Bestsellers Rank: 625,480 in Books

Product Description

About the Author

Tom White has been an Apache Hadoop committer since February 2007, and is a member of the Apache Software Foundation. He works for Cloudera, a company set up to offer Hadoop support and training. Previously he was an independent Hadoop consultant, working with companies to set up, use, and extend Hadoop. He has written numerous articles for O'Reilly and IBM's developerWorks, and has spoken at several conferences, including at ApacheCon 2008 on Hadoop. Tom has a Bachelor's degree in Mathematics from the University of Cambridge and a Master's in Philosophy of Science from the University of Leeds, UK.

Customer Reviews

4.0 out of 5 stars

Most Helpful Customer Reviews

2 of 2 people found the following review helpful By MrZipf on 24 Oct. 2011
Format: Paperback Verified Purchase
This book is well organized and spans the Hadoop stack. As a relative newcomer to Hadoop, I'd already read many of the online docs and surveyed some of the source code, but having this book clarified some issues and questions that had arisen. It's a useful reference if you are working with Hadoop or interested in the Hadoop stack.
2 of 2 people found the following review helpful By rlaenen on 12 Feb. 2012
Format: Paperback
The book gives a decent overview of the Hadoop software.
Unfortunately, all example code is based on an old version of the Hadoop API (even in this second edition).
1 of 1 people found the following review helpful By Mr. Yasin Mustafa on 23 Feb. 2012
Format: Paperback
If, like me, you are a developer and want a book that focuses largely on the coding aspects of Hadoop (e.g. MapReduce, interfacing with HDFS, and the old and new APIs), then this is the book for you.

I found it to be an easy, simple read with many examples. I was able to get through half the book in a day whilst doing some practical work alongside it.
0 of 1 people found the following review helpful By S W SCOTT on 17 Feb. 2013
Format: Paperback Verified Purchase
As a newcomer to Hadoop, I found this book really useful in helping me get to grips with this new way of thinking about non-relational databases. (I've now developed several commercial applications with Hadoop, all of which started from this book.)

Most Helpful Customer Reviews on (beta) 17 reviews
25 of 27 people found the following review helpful
The canonical reference of all things Hadoop 14 Jun. 2011
By Eric Sammer - Published on
Format: Paperback
The second edition of the already fantastic Hadoop: The Definitive Guide adds the last few missing bits to the best Hadoop reference out there.

For those not familiar with the first edition, Hadoop: The Definitive Guide is exactly what it claims to be. If you're not already familiar with Hadoop, the first and second chapters (Meet Hadoop and MapReduce, respectively) take you through the basics in both concept and code. For those used to writing data processing applications, the rationale behind Hadoop and why it's useful are immediately apparent. If you've already been exposed to Hadoop, these chapters may be redundant, but they're worth reading anyway the first time through.

The chapter on HDFS does a great job of explaining the underbelly of Hadoop's distributed file system, including the Java APIs. The section on Hadoop IO is probably introduced a bit too early - Hadoop newbies probably don't care about compression and serialization prior to reading about map reduce - but is excellent nonetheless in its detail. That said, you'll *really* want to go back and read it to understand the details of how compression codecs work after you learn more about map reduce. The "Writing a Map Reduce Application" chapter is probably the one existing users of Hadoop will skip. First timers will definitely get a lot out of a step-by-step walkthrough of a Java MR job from beginning to end.

The chapters on how map reduce works, types and formats (including input / output format details), and the advanced features (counters, sorting, the distributed cache, join libraries) are the ones you'll reread and reference constantly. The explanation, for instance, on how input splits are calculated demystifies the border between HDFS and the map reduce layer (and finally answers the question of "how does Hadoop know not to split in the middle of a record?"). Buy this book for these chapters, if not for the others.
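As a rough illustration of the record-boundary question raised above, here is a minimal Python sketch of one common convention for line-oriented input: a reader whose split does not start at byte 0 discards the line in progress, and every reader keeps reading until it has finished any line that begins at or before the end of its split. The function names `read_split` and `split_ranges` are hypothetical, and this is a simplified model under those assumptions, not code from the book or from Hadoop itself.

```python
import io


def split_ranges(total_bytes, split_size):
    """Cut a file of total_bytes into (start, length) byte ranges (hypothetical helper)."""
    return [(off, min(split_size, total_bytes - off))
            for off in range(0, total_bytes, split_size)]


def read_split(data, start, length):
    """Return the complete lines owned by the byte range [start, start + length)."""
    end = start + length
    stream = io.BytesIO(data)
    stream.seek(start)
    if start != 0:
        # Not the first split: the line in progress (or the line starting
        # exactly here) is read by the previous split's reader, so skip it.
        stream.readline()
    records = []
    # A line belongs to this split if it begins at or before `end`;
    # the reader may run past `end` to finish that line.
    while stream.tell() <= end:
        line = stream.readline()
        if not line:  # end of data
            break
        records.append(line.rstrip(b"\n"))
    return records


data = b"alpha\nbravo\ncharlie\ndelta\n"
for start, length in split_ranges(len(data), 10):
    print(start, length, read_split(data, start, length))
```

With a split size of 10 bytes, the second split begins in the middle of "bravo", yet every line is returned exactly once, because the first reader finishes "bravo" and the second reader skips the remainder of it.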

The chapters on HBase, Pig, ZooKeeper, and Sqoop are excellent and, in some cases, the best reference on the topic to date.

There are enough corrections, updates, and new chapters that it's worth buying the second edition if you already have the first. For anyone new to Hadoop this is a must have. If you already use Hadoop the later chapters are what you're looking for; a deep explanation of not just "how," but "why."

Some reviewers have noted the discussion of deprecated APIs. This really isn't a flaw of the book, but of premature deprecation within Hadoop itself. The newer APIs didn't have all the features of the old and anyone writing production map reduce jobs would wind up needing a lot of those features. I think the author does a great job with a tough situation while still alerting the reader that newer APIs are on the horizon. Besides, the differences are so few that it's almost not worth mentioning. While APIs may change, the core design, execution model, and architecture of Hadoop haven't changed and this is the best book on the subject.
27 of 31 people found the following review helpful
Sadly, already outdated 23 May 2011
By L. Wickland - Published on
Format: Paperback
Hadoop's MapReduce and HBase went through a major API change right around the time this book was finishing up. Consequently, if you try to use the examples in the book as a guide while developing against either the Apache Hadoop latest release or against Cloudera's CDH3, you'll find a mountain of frustration in the form of deprecated or entirely deleted classes.
11 of 12 people found the following review helpful
Excellent Hadoop Overview 21 July 2011
By David Mark Schramm - Published on
Format: Paperback Verified Purchase
This book provides an excellent in-depth overview of all aspects of Hadoop with how-to examples that are easy to follow. It is well written, thorough and exactly what I needed to architect and build a Hadoop-based solution. Related technologies such as Hive, HBase, Sqoop, Pig and Zookeeper are also covered in decent depth.

Other reviewers gave poor reviews due to the APIs being not up to date, which I think is unfair. Those new APIs are still only available in early unstable Hadoop versions, so current developers are best served to use the earlier APIs. The book gives samples with new APIs and shows very clearly the API changes which are minor. The concepts are identical, but a few classes have been combined into a more cohesive "Context" class in the new APIs.

So, for example, to write a data record you call "context.write(...);" rather than "output.collect(...);" with identical parameters. The structure of applications and the concepts are not changed. The changes to the syntax of Java calls are trivial and covered in the book very clearly. What is the big deal? Understanding the concepts is the most important thing and this book provides this very nicely.

I would recommend this book to anyone who is new to Hadoop and needs to learn it in depth.
27 of 37 people found the following review helpful
Outdated by the Time it hit the shelf 18 Nov. 2010
By Peter Harrington - Published on
Format: Paperback Verified Purchase
The APIs in this book were all outdated by the time the book hit the shelf. The authors did recognize this and mention it in the book; however, you don't need 400 pages to understand the map-reduce concepts.
I think it's a bad idea trying to publish a book on a rapidly changing community project like Hadoop. I found the Cloudera (free) training materials much more helpful.
1 of 1 people found the following review helpful
In-depth with lots of examples 8 Jun. 2012
By JUAN JOSE DE LEON - Published on
Format: Paperback
The book has lots of examples and footnote resources that enrich the content. Some people recommend watching Cloudera training videos first and then reading this book if you are a beginner, and I agree.