They explain how to install Apache Nutch as well as Apache Solr. But instead of just pointing to the projects’ websites, they give a list of steps that collects all the commands to run and the files you have to modify to get a working installation. I think this walkthrough is the best feature of the book. The authors clearly wanted to keep readers from getting stuck halfway and having to look for help on the web… So don’t worry if you are just starting to study this field.
However, there is something I didn’t like. The book mentions some tools that require installing an earlier version of Nutch. I know it isn’t the authors’ fault, but it’s a bit confusing.
Are you an expert in web crawling? The book also covers how to use Nutch with Apache Hadoop to run applications in a cluster environment.
In my opinion, if you are interested in this field, I would recommend this book. You could save a lot of time and focus on your data instead of on installation problems.