Data Mining: Practical Machine Learning Tools and Techniques (The Morgan Kaufmann Series in Data Management Systems) [Paperback]

Ian H. Witten, Eibe Frank, Mark A. Hall
4.5 out of 5 stars (2 customer reviews)
RRP: £42.99
Price: £35.91
You Save: £7.08 (16%)

Formats

Format          Amazon Price
Kindle Edition  £26.93
Paperback       £35.91

Book Description

Publication date: 3 Feb 2011 | ISBN-10: 0123748569 | ISBN-13: 978-0123748560 | Edition: 3

Data Mining: Practical Machine Learning Tools and Techniques offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today as well as methods at the leading edge of contemporary research.



* Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects
* Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
* Includes downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. Algorithms in the toolkit cover data pre-processing, classification, regression, clustering, association rules, and visualization


Frequently Bought Together

Data Mining: Practical Machine Learning Tools and Techniques (The Morgan Kaufmann Series in Data Management Systems) + Data Mining: Concepts and Techniques (The Morgan Kaufmann Series in Data Management Systems)
Price For Both: £74.86



Product details

  • Paperback: 664 pages
  • Publisher: Morgan Kaufmann; 3 edition (3 Feb 2011)
  • Language: English
  • ISBN-10: 0123748569
  • ISBN-13: 978-0123748560
  • Product Dimensions: 19.1 x 4.3 x 23.5 cm
  • Average Customer Review: 4.5 out of 5 stars (2 customer reviews)
  • Amazon Bestsellers Rank: 67,589 in Books


Product Description

Review

"The authors provide enough theory to enable practical application, and it is this practical focus that separates this book from most, if not all, other books on this subject."- Dorian Pyle, Director of Modeling at Numetrics and an internationally known author of Data Preparation for Data Mining (Morgan Kaufmann, 1999) and Business Modeling for Data Mining (Morgan Kaufmann, 2003)

"This book would be a strong contender for a technical data mining course. It is one of the best of its kind."- Herb Edelstein, Principal, Data Mining Consultant, Two Crows Consulting.

"It is certainly one of my favorite data mining books in my library."- Tom Breur, Principal, XLNT Consulting, Tilburg, The Netherlands

About the Author

Ian H. Witten is a professor of computer science at the University of Waikato in New Zealand. He directs the New Zealand Digital Library research project. His research interests include information retrieval, machine learning, text compression, and programming by demonstration. He received an MA in Mathematics from Cambridge University, England; an MSc in Computer Science from the University of Calgary, Canada; and a PhD in Electrical Engineering from Essex University, England. He is a fellow of the ACM and of the Royal Society of New Zealand. He has published widely on digital libraries, machine learning, text compression, hypertext, speech synthesis and signal processing, and computer typography. He has written several books, the latest being Managing Gigabytes (1999) and Data Mining (2000), both from Morgan Kaufmann.

Eibe Frank lives in New Zealand with his Samoan spouse and two lovely boys, but originally hails from Germany, where he received his first degree in computer science from the University of Karlsruhe. He moved to New Zealand to pursue his Ph.D. in machine learning under the supervision of Ian H. Witten, and joined the Department of Computer Science at the University of Waikato as a lecturer on completion of his studies. He is now an associate professor at the same institution. As an early adopter of the Java programming language, he laid the groundwork for the Weka software described in this book. He has contributed a number of publications on machine learning and data mining to the literature and has refereed for many conferences and journals in these areas.

Mark A. Hall was born in England but moved to New Zealand with his parents as a young boy. He now lives with his wife and four young children in a small town situated within an hour's drive of the University of Waikato. He holds a bachelor's degree in computing and mathematical sciences and a Ph.D. in computer science, both from the University of Waikato. Throughout his time at Waikato, as a student and lecturer in computer science and more recently as a software developer and data mining consultant for Pentaho, an open-source business intelligence software company, Mark has been a core contributor to the Weka software described in this book. He has published a number of articles on machine learning and data mining and has refereed for conferences and journals in these areas.



Customer Reviews

4.5 out of 5 stars
3 star: 0
2 star: 0
1 star: 0
Most Helpful Customer Reviews
1 of 1 people found the following review helpful
5.0 out of 5 stars: A great classic updated (8 July 2011)
By mboaj
Format: Paperback
I have been using Witten's book since its first edition for my course on Data Mining. The first edition was probably too limited (few topics, not discussed in depth). The second and this third edition have improved considerably on the original project. My students seem to prefer it to Han's textbook.
4.0 out of 5 stars: Two Paths to Prediction (28 Feb 2013)
By John M. Ford (Top 500 Reviewer)
Format: Kindle Edition
This is a good text on machine learning techniques from both the statistics and the machine learning perspectives. The authors note that these fields have developed in parallel with many researchers and practitioners working in each, but few familiar with the full range of techniques in both disciplines. Some procedures, such as tree induction and nearest neighbor clustering techniques, have been developed independently in both fields. However, for the most part statistics has focused on hypothesis testing and machine learning has tried to optimize search through the space of possible hypotheses. This book presents techniques from both traditions.

The organizational structure of the book supports its use as either a comprehensive text or a modular reference. The first section's five chapters introduce the foundations of data mining. In addition to concepts and definitions, there are simple example data sets and accessible descriptions of how both raw data and final analyses are used in this field. A particularly well-written fifth chapter discusses how to evaluate data mining models. It discusses the rationale for holdout samples, the use of cross-validation procedures, and how to avoid over-fitting models. Machine learning texts frequently lack depth on this topic while statistics texts often fail to communicate the consequences of poorly-fitted models. This integration of perspectives is a good one.

Chapters in the second section build on this foundation. Chapter 6 describes how to use ten different techniques to detect and describe patterns in large data sets. This section also describes how to prepare data for data mining, how to combine and transform variables to increase model accuracy, and how to improve prediction by combining different model types using bagging, boosting, and other aggregation techniques. A final chapter outlines directions of current and future research expanding our toolbox of techniques. The eight chapters of the third and final section are a detailed tutorial covering the Weka workbench of machine learning algorithms and data transformation tools.

This book has several communication strengths. The scope is broad for an introductory text. The Further Readings collections at each chapter's end are reasonably brief and point to current and in-depth sources. The text itself contains numerous example analyses and follows the useful strategy of analyzing the same data with several techniques. Its review of algorithms and formulas focuses on explaining how they work rather than on deriving them from general principles. A key strength is the book's close integration with Weka. This ensures that readers can step through analysis procedures, experiment with variations from default paths, and compare the performance of different formulations of the same research problems.

I recommend the book for readers introducing themselves to machine learning. It will take some of your time to learn the techniques and practice using them in Weka, but it will be time well spent. Don't skip the Weka section!
Most Helpful Customer Reviews on Amazon.com (beta)
Amazon.com: 4.0 out of 5 stars  29 reviews
70 of 71 people found the following review helpful
5.0 out of 5 stars: Worthwhile Update to an Excellent Text (6 Mar 2011)
By William B. Dwinnell IV - Published on Amazon.com
Format: Paperback | Amazon Vine™ Review
Context for this review: I am a data miner with 20 years' experience, and own the first edition of this book.

Good:
- Accessible writing style
- Broad coverage of algorithms and data mining issues, with an eye toward practical issues
- Needless technical trivia (derivations and the like) are avoided
- Algorithms are completely spelled out: A competent programmer should be able to turn these descriptions into functioning code.
- Third edition makes meaningful improvements on previous editions

Bad(ish):
- Approximately one-third of this book is now devoted to the WEKA data mining software. I have nothing against WEKA, and it is a good choice for a text such as this, since WEKA is free. In my opinion, though, this coverage consumes too many pages of this book.
- Data mining draws from a number of fields with separate roots (statistics, machine learning, pattern recognition, engineering, etc.), and many techniques go by multiple names. As with many other data mining books, this one does not always point out the aliases by which data mining methods are known.

The bottom line: This is still the best data mining text on the market.
24 of 25 people found the following review helpful
4.0 out of 5 stars: Applying Machine Learning to Data Mining problems (1 April 2011)
By owookiee - Published on Amazon.com
Format: Paperback | Amazon Vine™ Review
The subtitle of the book should really be emphasized more: Practical Machine Learning Tools and Techniques. This isn't a book about ad hoc SQL queries and database statistics; it is about tools to discover relationships you didn't know you were looking for. Much of the book shows how to handle knowledge formation and representation, statistical modeling and projections. The one critique I have in this regard is that many of the algorithm breakdowns are done in prose rather than true pseudocode.

I would like to echo other reviews that point out the text focuses on WEKA, and the authors indicate this is by intent. Though they do give much generic information, at some point you have to pick a horse to hitch your carriage to, and an established open-source project in Java is probably most widely accessible. Their coverage of WEKA claims 50% more features than the 2nd ed. and indeed it consumes half the book. I feel this is a good thing, as it lends great practicality to the book, allowing you to dig right in and get something actually done.

There are some additions to the 3rd ed. that modernize the book a bit. Showing how data can be reidentified (and the ethical implications) is pertinent to today's HIPAA-regulated medical environments. They also touch on web and ubiquitous mining, reflecting our growing foray into non-traditional cloud sources of information.
21 of 23 people found the following review helpful
4.0 out of 5 stars: Mixed Opinion (28 April 2011)
By GX - Published on Amazon.com
Format: Paperback | Amazon Vine™ Review
Fantastic book if you need to use WEKA; probably the best recommendation available.

If, however, you're not going to be using WEKA then the book is still valuable, but I challenge the true 'practicality' of it. The content is thorough but perhaps more academically oriented, and less industry focused, than I would have liked. The author keeps it very accessible, particularly as far as mathematics and statistics go. While this might make the book a little more long-winded, in my view it makes it far easier to get into the groove and allows you to read it like a book.

* Highly recommended for WEKA users
* For other users, I suggest you look through it to see if it will really be helpful before plunking down the cash