Monday, September 22, 2014

Review of a new Cloudera book by Packt

The Cloudera Administration Handbook, written by Rohit Menon, is a fantastic resource for anybody wanting to understand and manage a Cloudera platform.

I have to admit that I’m a rookie, and this book was exactly what I was dreaming of: all the information in one place, with code examples for both Linux and Windows.

The book is mainly targeted at big data experts and system administrators. The first three chapters give the minimum background needed to understand MapReduce, Hadoop and YARN, as well as Cloudera's Distribution Including Apache Hadoop (all services are listed and explained).

Then you enter the “hard part”. Chapter 4, which discusses HDFS Federation and its High Availability in detail, and chapter 7, “Managing an Apache Hadoop Cluster”, were particularly valuable for me. Chapter 5 presents Cloudera Manager, a web-browser-based administration tool for Apache Hadoop clusters, and shows you how to manage clusters with point-and-click instead of the command line. Chapter 6 is about configuring access and rights using the Kerberos services. It shows you how to implement the security services, but not how to manage user rights, which is a step requiring some planning. Monitoring and backup (using the Hadoop utility DistCp and Cloudera Manager) are also presented in two distinct parts.

What I like about this book is that it goes straight to the point, assuming you already know the basics of system administration and distributed architecture. It then shares many “tips” that only an experienced professional would know, and it helped the rookie I was to avoid mistakes. With this book, you will save time. For example, the author tells you where a SPOF (single point of failure) exists and how to avoid it.

The only thing missing from the book for me was cloud deployment. I would have liked a chapter explaining how to set up Cloudera in the cloud, along with the code (Puppet or Chef) to automate the install.

This book is clearly worth buying for anyone who wants to set up and deploy a Cloudera platform correctly. I also like the fact that for the same price you can download the PDF, Mobi, EPUB and Kindle versions.

