on 21 October 2009
Natural Language Processing with Python has to have one of the most intimidating preambles of any book I've picked up. Not only does it set out to cover Natural Language Processing, using the authors' own Natural Language Toolkit (NLTK) as the teaching tool, but it also aims to teach the basics of Python and good programming techniques. After reading this I put the book aside for a day while I lay down in a darkened room to gather my strength!
Having now worked my way through the book, let's take a look at how well it stands up to its claims. Bad news first. The coverage of Python really didn't work for me, though I admit that this may be due to my background as a procedural rather than object-oriented programmer. Without additional Python resources I was seriously struggling, so if you are a complete Python and OO neophyte like me then I would strongly suggest working through either a good Python book or one of the many online tutorials before you attempt to tackle Natural Language Processing with Python. The other main problem was the amount of information that the book tries to cover. I found it helpful to skim the book first to get it into some sort of order in my mind before reading it in any depth.
If I do have any other criticism of Natural Language Processing with Python, it's that it could be more accurately titled something like "Natural Language Toolkit: The Missing Manual".
Right, that's the bad news out of the way, now for the good. The actual coverage of NLTK and, to a slightly lesser degree, natural language processing is excellent. The approach of the book is very pragmatic and task-centred, so if you have a specific problem in mind which you feel needs a natural language approach then this book could well be the answer to your prayers. On the other hand, if you are looking for a more theoretical overview of the subject you may be slightly disappointed. Natural Language Processing with Python certainly covers pretty much all the bases, from comparatively simple statistical analysis, through context-free grammar parsing and text classification, all the way to discourse analysis. OK, some may complain that it's a bit code-heavy and theory-light but, when you consider that the subject of pretty much every chapter in the book has had several large tomes dedicated to it, you can see what an achievement this book is.
In summary: if you have a particular problem that you want to use NLTK for but can't get your head round either the problem or the software, buy Natural Language Processing with Python now - your frontal lobes will thank you forever. If you are interested in the field, and especially if you come from a pragmatic viewpoint or are already a Python hacker, then you certainly won't be wasting your time or money. If you are terrified by the concept of programming and want an overview of the theory of linguistic analysis, then there are probably better books for you out there.
on 29 September 2013
I'm using this book as one of several references for a lecture course in Information Extraction & Big Data that I teach at the University of Zurich. Python is easy and concise and has a lot of libraries, three qualities that make it a good choice for teaching, and the authors' NLTK is the library of choice for education in computational linguistics and natural language processing. The book covers a wide range of topics at beginner level, and the tandem approach of book + open source software gives the students a "hands on" feeling that they cannot get from introductory textbook alternatives (e.g. Manning/Schütze or Jurafsky/Martin), which therefore supplement rather than replace "NLP with Python" by Bird et al. Also invaluable from a teaching perspective is that the NLTK software package installs standard datasets upon download, which are useful to the student even without NLTK.
The chapters on information extraction, parsing, semantics and managing linguistic data go beyond typical text mining books, which teach only bag-of-words approaches and statistical sequence tagging. Logical/propositional semantics and discourse are covered as well: from context-free grammars for parsing sentences, to Discourse Representation Theory with lambda calculus for composing sentence semantics into discourse units, to dealing with the scope of quantifiers. The application of meaning analysis is shown in a chapter on a toy database, which can be queried in natural language.
Two areas that would be nice to cover in future editions are statistical parsing and statistical machine translation.
For a future second edition, I'd also suggest the authors include an appendix "Hacking NLTK" about the internals of NLTK and how to extend it, to promote development of their tool further.
To sum up, the book can be highly recommended to the student or teacher of natural language processing who would like to get practical experience rather than just study dry pseudocode.
on 17 March 2014
I have been using this book to help me with my final year project on text mining in a Computer Science course, and I love it! It was overwhelming at first because I was brand new to Python and natural language processing, but after I learnt a bit more about the topics the book became very helpful for me and I use it almost every day at the moment.
on 3 February 2015
This book is wonderfully written. It's full of interesting examples, and gradually and effortlessly introduces the reader to quite advanced topics in Natural Language Processing, Python, and machine learning. One of the few technical books that are worth reading cover-to-cover, just for the pleasure of it.