Thursday, April 17, 2014

Hadoop definitive guide Harini 9505140022

'Hadoop - The Definitive Guide' Book Review

For those who are serious about getting into Hadoop, 'Hadoop - The Definitive Guide' is a must-have book alongside the tons of articles and tutorials on the Internet. Most tutorials stop at the 'Word Count' example, but this book goes to the next level, explaining the nuts and bolts of the Hadoop framework with plenty of examples and references. Most interesting and important of all, the book also explains why certain design decisions were made in Hadoop.

Not only does the book cover HDFS and MapReduce, it also gives an overview of the layers that sit on top of Hadoop, such as Pig, Hive, HBase, ZooKeeper and Sqoop.

The book could definitely be improved on the following points:
  • MapReduce is covered in detail, but HDFS internals and fine-tuning are covered only at a high level.
  • Also, to stay in sync with Hadoop development and new features, it's absolutely necessary to get the source from trunk or from another branch, then build, package and try it out.
