
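| A classic best-selling textbook in machine learning and data mining. A seamless combination of foundational theory and practice. A tightly argued, richly detailed book suitable as a reference for all technical practitioners in the field. |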
| Data Mining: Practical Machine Learning Tools and Techniques (English edition, 3rd edition)
Preface
  Updated and revised content; Second edition; Third edition
Acknowledgments
About the authors

Part I Introduction to data mining

Chapter 1 What's it all about?
1.1 Data mining and machine learning
  Describing structural patterns; Machine learning; Data mining
1.2 Simple examples: the weather problem and others
  The weather problem; Contact lenses: an idealized problem; Irises: a classic numeric dataset; CPU performance: introducing numeric prediction; Labor negotiations: a more realistic example; Soybean classification: a classic machine learning success
1.3 Fielded applications
  Web mining; Decisions involving judgment; Screening images; Load forecasting; Diagnosis; Marketing and sales; Other applications
1.4 Machine learning and statistics
1.5 Generalization as search
1.6 Data mining and ethics
  Reidentification; Using personal information; Wider issues
1.7 Further reading

Chapter 2 Input: concepts, instances, and attributes
2.1 What's a concept?
2.2 What's in an example?
  Relations; Other example types
2.3 What's in an attribute?
2.4 Preparing the input
  Gathering the data together; ARFF format; Sparse data; Attribute types; Missing values; Inaccurate values; Getting to know your data
2.5 Further reading

Chapter 3 Output: knowledge representation
3.1 Tables
3.2 Linear models
3.3 Trees
3.4 Rules
  Classification rules; Association rules; Rules with exceptions; More expressive rules
3.5 Instance-based representation
3.6 Clusters
3.7 Further reading

Chapter 4 Algorithms: the basic methods
4.1 Inferring rudimentary rules
  Missing values and numeric attributes; Discussion
4.2 Statistical modeling
  Missing values and numeric attributes; Naive Bayes for document classification; Discussion
4.3 Divide-and-conquer: constructing decision trees
  Calculating information; Highly branching attributes
4.4 Covering algorithms: constructing rules
  Rules versus trees; A simple covering algorithm; Rules versus decision lists
4.5 Mining association rules
  Item sets; Association rules; Generating rules efficiently; Discussion
4.6 Linear models
  Numeric prediction: linear regression; Linear classification: logistic regression; Linear classification using the perceptron; Linear classification using Winnow
4.7 Instance-based learning
  Distance function; Finding nearest neighbors efficiently; Discussion
4.8 Clustering
  Iterative distance-based clustering; Faster distance calculations; Discussion
4.9 Multi-instance learning
  Aggregating the input; Aggregating the output; Discussion
4.10 Further reading
4.11 Weka implementations

Chapter 5 Credibility: evaluating what's been learned
5.1 Training and testing
5.2 Predicting performance
5.3 Cross-validation
5.4 Other estimates
  Leave-one-out cross-validation; The bootstrap
5.5 Comparing data mining schemes
5.6 Predicting probabilities
  Quadratic loss function; Informational loss function; Discussion
5.7 Counting the cost
  Cost-sensitive classification; Cost-sensitive learning; Lift charts; ROC curves; Recall-precision curves; Discussion; Cost curves
5.8 Evaluating numeric prediction
5.9 Minimum description length principle
5.10 Applying the MDL principle to clustering
5.11 Further reading

Part II Advanced data mining

Chapter 6 Implementations: real machine learning schemes
6.1 Decision trees
  Numeric attributes; Missing values; Pruning; Estimating error rates; Complexity of decision tree induction; From trees to rules; C4.5: choices and options; Cost-complexity pruning; Discussion
6.2 Classification rules
  Criteria for choosing tests; Missing values, numeric attributes; Generating good rules; Using global optimization; Obtaining rules from partial decision trees; Rules with exceptions; Discussion
6.3 Association rules
  Building a frequent-pattern tree; Finding large item sets; Discussion
6.4 Extending linear models
  Maximum-margin hyperplane; Nonlinear class boundaries; Support vector regression; Kernel ridge regression; Kernel perceptron; Multilayer perceptrons; Radial basis function networks; Stochastic gradient descent; Discussion
6.5 Instance-based learning
  Reducing the number of exemplars; Pruning noisy exemplars; Weighting attributes; Generalizing exemplars; Distance functions for generalized exemplars; Generalized distance functions; Discussion
6.6 Numeric prediction with local linear models
  Model trees; Building the tree; Pruning the tree; Nominal attributes; Missing values; Pseudocode for model tree induction; Rules from model trees; Locally weighted linear regression; Discussion
6.7 Bayesian networks
  Making predictions; Learning Bayesian networks; Specific algorithms; Data structures for fast learning; Discussion
6.8 Clustering
  Choosing the number of clusters; Hierarchical clustering; Example of hierarchical clustering; Incremental clustering; Category utility; Probability-based clustering; The EM algorithm; Extending the mixture model; Bayesian clustering; Discussion
6.9 Semisupervised learning
  Clustering for classification; Co-training; EM and co-training; Discussion
6.10 Multi-instance learning
  Converting to single-instance learning; Upgrading learning algorithms; Dedicated multi-instance methods; Discussion
6.11 Weka implementations

Chapter 7 Data transformations
7.1 Attribute selection
  Scheme-independent selection; Searching the attribute space; Scheme-specific selection
7.2 Discretizing numeric attributes
  Unsupervised discretization; Entropy-based discretization; Other discretization methods; Entropy-based versus error-based discretization; Converting discrete attributes to numeric attributes
7.3 Projections
  Principal components analysis; Random projections; Partial least-squares regression; Text to attribute vectors; Time series
7.4 Sampling
  Reservoir sampling
7.5 Cleansing
  Improving decision trees; Robust regression; Detecting anomalies; One-class learning
7.6 Transforming multiple classes to binary ones
  Simple methods; Error-correcting output codes; Ensembles of nested dichotomies
7.7 Calibrating class probabilities
7.8 Further reading
7.9 Weka implementations

Chapter 8 Ensemble learning
8.1 Combining multiple models
8.2 Bagging
  Bias-variance decomposition; Bagging with costs
8.3 Randomization
  Randomization versus bagging; Rotation forests
8.4 Boosting
  AdaBoost; The power of boosting
8.5 Additive regression
  Numeric prediction; Additive logistic regression
8.6 Interpretable ensembles
  Option trees; Logistic model trees
8.7 Stacking
8.8 Further reading
8.9 Weka implementations

Chapter 9 Moving on: applications and beyond
9.1 Applying data mining
9.2 Learning from massive datasets
9.3 Data stream learning
9.4 Incorporating domain knowledge
9.5 Text mining
9.6 Web mining
9.7 Adversarial situations
9.8 Ubiquitous data mining
9.9 Further reading

Part III The Weka data mining workbench

Chapter 10 Introduction to Weka
10.1 What's in Weka?
10.2 How do you use it?
10.3 What else can you do?
10.4 How do you get it?

Chapter 11 The Explorer
11.1 Getting started
  Preparing the data; Loading the data into the Explorer; Building a decision tree; Examining the output; Doing it again; Working with models; When things go wrong
11.2 Exploring the Explorer
  Loading and filtering files; Training and testing learning schemes; Do it yourself: the user classifier; Using a metalearner; Clustering and association rules; Attribute selection; Visualization
11.3 Filtering algorithms
  Unsupervised attribute filters; Unsupervised instance filters; Supervised filters
11.4 Learning algorithms
  Bayesian classifiers; Trees; Rules; Functions; Neural networks; Lazy classifiers; Multi-instance classifiers; Miscellaneous classifiers
11.5 Metalearning algorithms
  Bagging and randomization; Boosting; Combining classifiers; Cost-sensitive learning; Optimizing performance; Retargeting classifiers for different tasks
11.6 Clustering algorithms
11.7 Association-rule learners
11.8 Attribute selection
  Attribute subset evaluators; Single-attribute evaluators; Search methods

Chapter 12 The Knowledge Flow interface
12.1 Getting started
12.2 Components
12.3 Configuring and connecting the components
12.4 Incremental learning

Chapter 13 The Experimenter
13.1 Getting started
  Running an experiment; Analyzing the results
13.2 Simple setup
13.3 Advanced setup
13.4 The Analyze panel
13.5 Distributing processing over several machines

Chapter 14 The command-line interface
14.1 Getting started
14.2 The structure of Weka
  Classes, instances, and packages; The weka.core package; The weka.classifiers package; Other packages; Javadoc indexes
14.3 Command-line options
  Generic options; Scheme-specific options

Chapter 15 Embedded machine learning
15.1 A simple data mining application
  MessageClassifier(); updateData(); classifyMessage()

Chapter 16 Writing new learning schemes
16.1 An example classifier
  buildClassifier(); makeTree(); computeInfoGain(); classifyInstance(); toSource(); main()
16.2 Conventions for implementing classifiers
  Capabilities

Chapter 17 Tutorial exercises for the Weka Explorer
17.1 Introduction to the Explorer interface
  Loading a dataset; The dataset editor; Applying a filter; The Visualize panel; The Classify panel
17.2 Nearest-neighbor learning and decision trees
  The glass dataset; Attribute selection; Class noise and nearest-neighbor learning; Varying the amount of training data; Interactive decision tree construction
17.3 Classification boundaries
  Visualizing 1R; Visualizing nearest-neighbor learning; Visualizing naive Bayes; Visualizing decision trees and rule sets; Messing with the data
17.4 Preprocessing and parameter tuning
  Discretization; More on discretization; Automatic attribute selection; More on automatic attribute selection; Automatic parameter tuning
17.5 Document classification
  Data with string attributes; Classifying actual documents; Exploring the StringToWordVector filter
17.6 Mining association rules
  Association-rule mining; Mining a real-world dataset; Market basket analysis

References
Index |