Machine Learning for Hackers Paperback – 25 Feb 2012
Case Studies and Algorithms to Get You Started
About the Author
Drew Conway is a PhD candidate in Politics at NYU. He studies international relations, conflict, and terrorism using the tools of mathematics, statistics, and computer science in an attempt to gain a deeper understanding of these phenomena. His academic curiosity is informed by his years as an analyst in the U.S. intelligence and defense communities.
John Myles White is a PhD candidate in Psychology at Princeton. He studies pattern recognition, decision-making, and economic behavior using behavioral methods and fMRI. He is particularly interested in anomalies of value assessment.
Top customer reviews
Also, when the algorithms are presented there are sometimes some serious errors (see the review on Amazon's US site "Erroneous but entertaining" for more details). The single most shocking example was when a series of numbers was said to show the percentages of variation explained by an analysis, but the series added up to much more than 100%. This was by no means the only error, however. The cumulative effect for me was that as I got further and further through the book, I began to have less and less trust in what was being presented to me.
I would characterise the book as being for hackers in the sense that you are encouraged to try a technique and see if it works. One good point is that the book emphasises having a separate test set from your training set.
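To make the point about a separate test set concrete, here is a minimal sketch in Python rather than the book's R. The function name `train_test_split`, the 25% test fraction, and the stand-in data are assumptions chosen for illustration; none of them come from the book.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)                  # copy, so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is reproducible
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))  # hypothetical stand-in for 20 labelled examples
train, test = train_test_split(rows)

# A model is fitted on `train` only; `test` is held back to estimate
# how well it performs on examples it has never seen.
print(len(train), len(test))
```

Shuffling before splitting matters when the rows arrive in some meaningful order, for example sorted by date or by label.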
Trying techniques until you find one that works is probably a good place to start, especially if your interest is in starting to learn the broader field of data science -- getting the data in, analysing it, visualising it -- rather than specialising in the selection and choice of machine learning algorithms themselves (for which Andrew Ng's online Coursera course is a far better choice).
So much of the content is about those details that I got extremely bored and didn't read more than one chapter. Not recommended.
Most helpful customer reviews on Amazon.com
There is way too much time spent on R, dedicated to such things as parsing email messages, and spidering webpages, etc. These are things that no-one with other tools available would do in R. And it's not that it's easier to do it in R, it's actually harder than using an appropriate library, like JavaMail. And yet, while much time is spent in details, like regexes to extract dates (ick!), more interesting R functions are given short shrift.
There's some good material in here, but it's buried under the weight of doing everything in R. If you are a non-programmer, and want to use only one hammer for everything, then R is not a bad choice. But it's not a good choice for developers that are already comfortable with a wider variety of tools.
I'd recommend Programming Collective Intelligence by Segaran, if you would describe yourself as a "Hacker".
Pros: The book is affordable and nicely written. The authors take great care in making the book useful and entertaining, and one can immediately start putting things into practice. Also, the R examples are interesting and motivating in themselves.
Cons: The book has a couple of very grievous errors that make me wonder whether the authors understand the subject matter. This is especially striking in the chapters on PCA and Multidimensional Scaling (which I covered in some depth in the class), but also to a lesser degree in other parts of the book that I have read more thoroughly (like optimization and linear and nonlinear regression). Many errors are not typos or simple mistakes but seem to be proof of a profound misunderstanding of concepts by the authors. I am sorry to be so blunt, but one should not write a book about topics that one is not intimate with. Given that the book is probably quite successful, it propagates error into a community whose members may not have the statistical background to spot the errors immediately. Some methods used in the book are quite hard to understand even for graduate students, and to be so nonchalant about the underlying theory can be dangerous. I realize that the book is intended to be superficial with regard to mathematical or conceptual depth, but this combined with some of the presented high-level techniques can easily backfire when people are given the tools, but not the understanding. Especially when the explanations on interpretation are plainly wrong (I am talking about using standard deviations instead of variances, substantive interpretation of methodological artifacts, wrong explanation of R output, etc.). Additionally, certain parts of the book became outdated as soon as the book came out, such as the Google example.
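As a small illustration of the standard-deviation-versus-variance point, the sketch below (in Python, not the book's R) computes the proportion of variance explained by each principal component from per-component standard deviations. The numbers in `sdevs` are made up for the example and the function name is hypothetical; nothing here reproduces the book's data or code.

```python
def variance_proportions(sdevs):
    """Proportion of total variance per component, given standard deviations."""
    variances = [s * s for s in sdevs]  # variance is the squared standard deviation
    total = sum(variances)
    return [v / total for v in variances]


sdevs = [2.0, 1.0, 1.0, 0.5]  # hypothetical per-component standard deviations
proportions = variance_proportions(sdevs)

# For contrast only: normalising the raw standard deviations gives
# different numbers, which are not proportions of variance.
sd_shares = [s / sum(sdevs) for s in sdevs]

print(proportions)
print(sd_shares)
```

Because each proportion is a variance divided by the total variance, a correctly computed set of proportions sums to 1, that is, to 100%.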
Overall, I do not recommend the book. I now only use it as a collection of nice examples and sometimes borrow bits of their R code.
Much of the text is taken up explaining how to parse strings, change dates, and otherwise munge data into shape to be operated on by statistical functions provided by R. In fact, there is so much of the book in that fashion that I end up skipping through large portions to get back to something that is worth spending time reading about. I can't understand why a programmer would need significant education in string parsing. I was also put off by the vast amount of text explaining basic statistics. Maybe a recent computer science graduate is simply the wrong reader for this book?
I think it is certainly possible to learn the basic principles of machine learning from this book, and even to put them to good use with R in the same manner displayed in the examples. Indeed, the code and data available for this book would be very useful as prep for an introductory course at an academic institution. To make the best use of the text, you really should be sitting at your computer, reading the text side by side with the code, and operating on the data with R as instructed.
Personally, I found that wading through this text wasn't enjoyable due to the lack of density of material at the depth I was looking for. Other readers may find it is just right for them, but I suspect those readers would not be hackers, contrary to the implication of the title. As best as I can figure, this book would best serve a student scientific researcher who wanted to understand what machine learning was about, and did not have significant prior experience in programming or statistics. Alternatively, if you are significantly distant in years from your time in statistics, or consider learning R one of your goals, this book could work well for you.
I received this book for free as part of the O'Reilly Blogger Review program, which is neat.
I should note that I read this book on the iPhone as an ePub. There were some formatting problems with tables that were distracting, but otherwise it was readable.