on 26 May 1999
This is the best book I've ever read on computational linguistics. It should be ideal for both linguists who want to learn about statistical language processing and those building language applications who want to learn about linguistics. This book isn't even published and it's now my most highly used reference book, joining gems such as Cormen, Leiserson and Rivest's algorithm book, Quirk et al.'s English Grammar, and Andrew Gelman's Bayesian statistics book (three excellent companions to this book, by the way).
The book is written more like a computer science or math book in that it starts absolutely from scratch, but moves quickly and assumes a sophisticated reader. The first one hundred or so pages provide background in probability, information theory and linguistics.
This book covers (almost) every current trend in NLP from a statistical perspective: syntactic tagging, sense disambiguation, parsing, lexical subcategorization, Hidden Markov Models, and probabilistic context-free grammars. It also covers machine translation and information retrieval in later chapters.
It covers all the statistical techniques used in NLP, from Bayes' law through to maximum entropy modeling, clustering, nearest neighbors, decision trees, and much more.
What you won't find is information on applications to higher-level discourse and dialogue phenomena like pronoun resolution or speech act classification.
on 21 May 2008
I don't think this book is good enough to recommend. It has a lot of irrelevant writing (i.e. many paragraphs that provide no new information or interpretive value), its explanations are typically "circular" instead of simple, intuitive and to the point, and it has many errors. One such error is absolutely terrifying:
"PCA can only be applied to a square [data] matrix" (page 556 in the second edition from 2000)
In other words, you can only apply PCA if you have as many data points as dimensions of the input space! Everyone I showed this to laughed, and it actually convinced my NLP professor to look for another course book.
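To make the objection concrete, here is a minimal sketch of PCA on a data matrix that is not square. It assumes NumPy; the toy data (6 samples, 3 features) and all variable names are illustrative assumptions, not taken from the book. It computes the principal components from the SVD of the centered data matrix.

```python
import numpy as np

# Hypothetical toy data: 6 samples (rows) by 3 features (columns), so not square.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# Center each feature, then take the SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Rows of Vt are the principal directions; the squared singular values,
# divided by (n - 1), are the variances along those directions.
explained_variance = s**2 / (X.shape[0] - 1)

# Project the data onto the first two principal components.
scores = Xc @ Vt[:2].T

print(X.shape, scores.shape)
```

The only square matrix involved is the 3-by-3 covariance matrix of the features, which is square whatever the shape of the data matrix; the variances computed above are its eigenvalues.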
I got the book from the university library, and I usually consider buying the books I use. I am SO glad I didn't buy this one. I wouldn't pay even 1 euro for it, let alone the almost 75 euros that is its typical price.
Bottom line: don't get it.