Handbook of Statistical Analysis and Data Mining Applications, Second Edition, is a comprehensive professional reference book that guides business analysts, scientists, engineers and researchers, both academic and industrial, through all stages of data analysis, model building and implementation. The handbook helps users discern technical and business problems, understand the strengths and weaknesses of modern data mining algorithms and employ the right statistical methods for practical application.
This book is an ideal reference for users who want to address massive and complex datasets with novel statistical approaches and to evaluate analyses and solutions objectively. It offers clear, intuitive explanations of the principles and tools for solving problems with modern analytic techniques, and discusses their application to real problems in ways accessible and beneficial to practitioners across several areas, from science and engineering to medicine, academia and commerce.
Key Features
- Includes input by practitioners for practitioners
- Includes tutorials in numerous fields of study that provide step-by-step instruction on how to use supplied tools to build models
- Contains practical advice from successful real-world implementations
- Brings together, in a single resource, all the information a beginner needs to understand the tools and issues in data mining and to build successful data mining solutions
- Features clear, intuitive explanations of novel analytical tools and techniques, and their practical applications
Part 1: History of Phases of Data Analysis, Basic Theory, and the Data Mining Process
1. The Background for Data Mining Practice
2. Theoretical Considerations for Data Mining
3. The Data Mining and Predictive Analytic Process
4. Data Understanding and Preparation
5. Feature Selection
6. Accessory Tools for Doing Data Mining
Part 2: The Algorithms and Methods in Data Mining and Predictive Analytics and Some Domain Areas
7. Basic Algorithms for Data Mining: A Brief Overview
8. Advanced Algorithms for Data Mining
9. Classification
10. Numerical Prediction
11. Model Evaluation and Enhancement
12. Predictive Analytics for Population Health and Care
13. Big Data in Education: New Efficiencies for Recruitment, Learning, and Retention of Students and Donors
14. Customer Response Modeling
15. Fraud Detection
Part 3: Tutorials and Case Studies
Tutorial A Example of Data Mining Recipes Using Windows 10 and Statistica 13
Tutorial B Using the Statistica Data Mining Workspace Method for Analysis of Hurricane Data (Hurrdata.sta)
Tutorial C Case Study—Using SPSS Modeler and STATISTICA to Predict Student Success at High-Stakes Nursing Examinations (NCLEX)
Tutorial D Constructing a Histogram in KNIME Using MidWest Company Personality Data
Tutorial E Feature Selection in KNIME
Tutorial F Medical/Business Tutorial
Tutorial G A KNIME Exercise, Using Alzheimer’s Training Data of Tutorial F
Tutorial H Data Prep 1-1: Merging Data Sources
Tutorial I Data Prep 1-2: Data Description
Tutorial J Data Prep 2-1: Data Cleaning and Recoding
Tutorial K Data Prep 2-2: Dummy Coding Category Variables
Tutorial L Data Prep 2-3: Outlier Handling
Tutorial M Data Prep 3-1: Filling Missing Values With Constants
Tutorial N Data Prep 3-2: Filling Missing Values With Formulas
Tutorial O Data Prep 3-3: Filling Missing Values With a Model
Tutorial P City of Chicago Crime Map: A Case Study Predicting Certain Kinds of Crime Using Statistica Data Miner and Text Miner
Tutorial Q Using Customer Churn Data to Develop and Select a Best Predictive Model for Client Defection Using STATISTICA Data Miner 13 64-bit for Windows 10
Tutorial R Example With C&RT to Predict and Display Possible Structural Relationships
Tutorial S Clinical Psychology: Making Decisions About Best Therapy for a Client
Part 4: Model Ensembles, Model Complexity; Using the Right Model for the Right Use, Significance, Ethics, and the Future, and Advanced Processes
16. The Apparent Paradox of Complexity in Ensemble Modeling
17. The "Right Model" for the "Right Purpose": When Less Is Good Enough
18. A Data Preparation Cookbook
19. Deep Learning
20. Significance versus Luck in the Age of Mining: The Issues of P-Value "Significance" and "Ways to Test Significance of Our Predictive Analytic Models"
21. Ethics and Data Analytics
22. IBM Watson