Tika in Action Paperback – 11 Dec 2011
About the Author
Chris Mattmann has a wealth of experience in software design and in the construction of large-scale data-intensive systems. His work has infected a broad set of communities, ranging from helping NASA unlock data from its next generation of earth science system satellites, to assisting graduate students at the University of Southern California (his alma mater) in the study of software architecture, all the way to helping industry and open source as a member of the Apache Software Foundation. When he's not busy being busy, he's spending time with his lovely wife and son braving the mean streets of Southern California.
Jukka Zitting is a core Tika developer with over a decade of experience in open source content management. Jukka works as a Senior Developer for the Swiss content management company Day Software, and is a member of the JCP expert group for the Content Repository for Java Technology API. He is a member of the Apache Software Foundation and the chairman of the Apache Jackrabbit project.
Most Helpful Customer Reviews on Amazon.com (beta)
It's page after page of generalized talk and talk and talk and talk and -- LOOK! A diagram with a smiley face! -- and talk and talk and then one tiny snippet of code completely isolated from any other code that might be needed to make something actually happen. It's like ordering a book titled "Hot Models in Bikinis" and getting a book that talked endlessly about the history of the development of the bikini entirely in text, then talked about the history of textiles used in the manufacturing of bathing suits, and then a timeline of a day in the life of a model, etc., and that was it.
Thumb through your favorite "In Action" series book and you'll find something very different: brief targeted discussion, code that shows what was just discussed -- wait for it -- *IN ACTION!*, brief targeted discussion, code that shows it in action, lather-rinse-repeat, index, back cover. For a good example of how "Tika In Action" should have been structured, look at "Lucene In Action, Second Edition."
All this being said, even if I had been able to hold this book in my hands and leaf through it before making my purchase decision, I WOULD have bought it because of the extremely valuable background discussions, about 10% to 20% of which I would have read to get a better understanding of the subject that is Tika. But I would have then immediately gone looking for the book I really needed, which would have shown Tika actually "In Action."
All things considered, a very readable book and a great resource for anyone using Tika.
The book provides a comprehensive description of the framework itself, how to use it for different tasks (file format and language detection, text/metadata extraction, etc.), and how to extend it to support new file formats (both detection and data extraction). Besides this, there are several chapters dedicated to real-world use cases showing how Apache Tika is used in different projects.
I would recommend this book to everybody who needs to perform media type detection and/or text/metadata extraction, especially those working with indexing and searching of heterogeneous documents.
P.S. I gave 4 stars only because I would have liked a more detailed description of how to create complex signatures for file formats (although this information can be found on the project's pages).
Look for similar items by category
- Books > Computing & Internet > Computer Science > Information Systems
- Books > Computing & Internet > Databases > Applications
- Books > Computing & Internet > Databases > Data Storage & Management > Data Mining
- Books > Computing & Internet > Databases > Data Storage & Management > Database Management Systems
- Books > Computing & Internet > Programming > Languages