on 19 February 2013
I read this book while preparing for the CCDH exam. It's a solid all-around book covering several different areas of the Hadoop ecosystem. However, it shouldn't be the first book someone reads about Hadoop. For that purpose, Hadoop in Action may be a better fit. Once you have the high-level picture of what Hadoop is all about, you can proceed with this book, keeping in mind that for every page in the book there is 5-10x as much information available online.
It's a good book for covering the breadth, and it gets into considerable depth on each topic if you are willing to extend the examples provided. It goes deepest on the aspects that matter, like the inner workings of Mapper/Shuffle+Sort/Reducer.
It misses the fifth star for not covering Hadoop 2.0, or even the annotations behind the API evolution, which is a big point for newcomers to the ecosystem.
on 22 February 2015
Excellent book. I was new to Java and Unix so needed some help from other books along the way to be able to run the examples here, but got there in the end. Great that it covers Java MapReduce, Pig and Hive as well as showing how to get HDFS up and running to try it for yourself. Not an easy subject, but the book was a huge help.
on 9 March 2014
This book sets out to cover the entire Hadoop environment. It's a big book, but that's a massive subject and it would be a major challenge to cover in one book. As a result, the majority of the book is on the core of Hadoop: HDFS and classic MapReduce. The sections on Pig, Hive and HBase feel tacked on and don't go into anywhere near as much depth as the initial section of the book. Because it's a 2012 book, it also ignores some of the newer technologies like Spark and Impala.
So it's a good introduction to Hadoop, but a long way short of being "The Definitive Guide".
on 19 April 2014
I was looking for a book that explained Hadoop and HDFS with enough technical depth to understand the practical implications of building solutions on Hadoop, starting from Unix and data-warehousing skills but zero knowledge of Hadoop or MapReduce.
This book definitely fits the bill. It provides clear explanations of HDFS, MapReduce, Hive, HBase and more, both in terms of how they work and what they are good for, and it also provides some technical Java-based examples (which I have largely skipped through). The book also covers real-world implementations, showing various patterns used by major Hadoop consumers that make use of the various toolsets, which for me helped to cement the ideas and strengths of the various elements.
I have already recommended this book to others.
The only reason for not rating it higher is that I cannot yet testify to the quality of the code samples.
on 12 November 2012
I've found this book to be the perfect kickstarter for a novice in Hadoop (though previous technical skills are a prerequisite).
It features balanced coverage of concepts, architecture, planning, programming, deployment, administration and, to a small extent, tuning.
I think the book does the technology a good service, covering it from a multitude of angles to show its integration capabilities and demonstrate its versatility.
I like the author's ability to lead readers of various thinking habits (relational, OOAD, dimensional) into discovering Hadoop by starting from one's own comfortable perspective.
If there is one thing I'd improve, it is this: make clearer the importance of writing a smart MapReduce job, provide a formal definition of what makes a function suitable for use as a Combiner, and provide tips and tricks on writing effective MapReduce jobs.
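On the Combiner point, the usual rule of thumb is that a function is safe to use as a Combiner when it is commutative and associative, so that applying it to each mapper's partial output and then again to the partial results gives the same answer as applying it once to all the values. Here is a minimal sketch of that check in plain Python. It does not use the Hadoop API, and the helper names (`reduce_direct`, `reduce_with_combiner`, `mean`) and the sample data are my own, for illustration only.

```python
# Sketch only: plain Python standing in for a MapReduce job with a combiner.
# All names and data here are hypothetical, not part of any Hadoop API.

def reduce_direct(values, fn):
    """Apply the reduce function once to all values for a key."""
    return fn(values)

def reduce_with_combiner(partitions, fn):
    """Apply fn per partition (the combiner step), then to the partial results."""
    partials = [fn(part) for part in partitions]
    return fn(partials)

def mean(values):
    return sum(values) / len(values)

# Each inner list stands in for one mapper's output for a single key.
partitions = [[1, 2, 3], [4], [5, 6]]
flat = [v for part in partitions for v in part]

# sum and max are commutative and associative: both paths agree.
sum_ok = reduce_direct(flat, sum) == reduce_with_combiner(partitions, sum)
max_ok = reduce_direct(flat, max) == reduce_with_combiner(partitions, max)

# mean is not: the mean of per-partition means differs from the overall mean.
mean_direct = reduce_direct(flat, mean)               # 3.5
mean_combined = reduce_with_combiner(partitions, mean)  # about 3.83
mean_ok = mean_direct == mean_combined

print(sum_ok, max_ok, mean_ok)
```

With this sample data the sum and max checks pass and the mean check fails, which is why a mean is normally computed by combining (sum, count) pairs and dividing only in the final reduce step.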