

Data Mining: Concepts and Techniques, Second Edition

 

It is very easy to collect huge volumes of data - social statistics, bank records, biological data, and more - but very hard to pull useful facts out of the heap. This book is about processing large volumes of data in ways that let simple descriptions emerge.

This is an introductory-level book, aimed at someone with reasonably good programming skills. A little facility with statistics might help, but certainly isn't necessary. The book starts gently, with some very basic questions: what exactly is data mining, when there seem to be so many definitions for the term? What is a data warehouse, and how does it differ from a database? Next, the authors address the data itself in terms of quality, usability, and organization for efficient access. The central chapters, 4 through 8, address various kinds of query specification, kinds of relationships to extract, correlations, clustering, and classification. None of the discussions is especially deep. All, however, are presented in pseudocode or simple math that can easily be translated into working code. The careful reader learns a few basic principles that work well in many contexts: entropy maximization, Bayesian analysis, and simple stats. It may be surprising to see how little of normal statistical analysis is used. I suspect the authors assume that stats-savvy readers will already know how to apply significance testing, and that stats-naive readers don't need the distraction. The last chapters discuss complex data, where neither the best structure for the data nor the questions to ask of it are at all obvious, and then survey the tools and applications used in data mining.
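To give a sense of how directly this kind of material translates into code, here is a minimal Python sketch of an entropy-based measure of the sort used when choosing an attribute to split on in classification. It is not taken from the book: the function names `entropy` and `information_gain`, and the dictionary-per-row data layout, are my own illustrative assumptions. Note that it computes information gain (the reduction in entropy after a split), which is one common way entropy gets used, and is not necessarily the exact formulation the authors present.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute.

    rows   -- list of dicts mapping attribute name to value (assumed layout)
    labels -- class label for each row, in the same order
    """
    total = len(labels)
    # Group the class labels by the value each row has for the attribute.
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

An attribute that separates the classes perfectly yields a gain equal to the entropy of the unsplit labels; an attribute unrelated to the classes yields a gain near zero.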

The book is nicely laid out as a textbook, with an orderly summary, problem set, and bibliography at the end of each chapter. The bibliography is more than just a list of titles and authors - it actually helps the reader decide which references will give the best description of each of the chapter's topics.

This is a clear, usable introduction to data mining: the data it uses, the questions it answers, and the techniques for connecting them. It gives codable detail for lots of techniques, and prepares the reader for more advanced discussions. I recommend it very highly.


