Machine learning with R : learn how to use R to apply powerful machine learning methods and gain an insight into real-world applications /
Written as a tutorial to explore and understand the power of R for machine learning. This practical guide covers all of the need-to-know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating...
Main Author: | |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Birmingham, UK : Packt Publishing, 2013. |
Series: | Community experience distilled. |
Subjects: | |
Online Access: | Full text (Emmanuel users only) |
Table of Contents:
- Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Introducing Machine Learning; The origins of machine learning; Uses and abuses of machine learning; Ethical considerations; How do machines learn?; Abstraction and knowledge representation; Generalization; Assessing the success of learning; Steps to apply machine learning to your data; Choosing a machine learning algorithm; Thinking about the input data; Thinking about types of machine learning algorithms; Matching your data to an appropriate algorithm.
- Using R for machine learning; Installing and loading R packages; Installing an R package; Installing a package using the point-and-click interface; Loading an R package; Summary; Chapter 2: Managing and Understanding Data; R data structures; Vectors; Factors; Lists; Data frames; Matrixes and arrays; Managing data with R; Saving and loading R data structures; Importing and saving data from CSV files; Importing data from SQL databases; Exploring and understanding data; Exploring the structure of data; Exploring numeric variables; Measuring the central tendency - mean and median.
- Measuring spread - quartiles and the five-number summary; Visualizing numeric variables - boxplots; Visualizing numeric variables - histograms; Understanding numeric data - uniform and normal distributions; Measuring spread - variance and standard deviation; Exploring categorical variables; Measuring the central tendency - the mode; Exploring relationships between variables; Visualizing relationships - scatterplots; Examining relationships - two-way cross-tabulations; Summary; Chapter 3: Lazy Learning - Classification using Nearest Neighbors; Understanding classification using nearest neighbors.
- The kNN algorithm; Calculating distance; Choosing an appropriate k; Preparing data for use with kNN; Why is the kNN algorithm lazy?; Diagnosing breast cancer with the kNN algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Transformation - normalizing numeric data; Data preparation - creating training and test datasets; Step 3 - training a model on the data; Step 4 - evaluating model performance; Step 5 - improving model performance; Transformation - z-score standardization; Testing alternative values of k; Summary.
- Chapter 4: Probabilistic Learning - Classification using Naive Bayes; Understanding naive Bayes; Basic concepts of Bayesian methods; Probability; Joint probability; Conditional probability with Bayes' theorem; The naive Bayes algorithm; The naive Bayes classification; The Laplace estimator; Using numeric features with naive Bayes; Example - filtering mobile phone spam with the naive Bayes algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Data preparation - processing text data for analysis; Data preparation - creating training and test datasets.
- Visualizing text data - word clouds.