Machine learning with R: learn how to use R to apply powerful machine learning methods and gain an insight into real-world applications

Written as a tutorial to explore and understand the power of R for machine learning. This practical guide covers all of the need-to-know topics in a very systematic way. For each machine learning approach, each step in the process is detailed, from preparing the data for analysis to evaluating...

Bibliographic Details
Main Author: Lantz, Brett (Author)
Format: Electronic eBook
Language: English
Published: Birmingham, UK : Packt Publishing, 2013.
Series: Community experience distilled.
Online Access: Full text (Emmanuel users only)
Table of Contents:
  • Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Introducing Machine Learning; The origins of machine learning; Uses and abuses of machine learning; Ethical considerations; How do machines learn?; Abstraction and knowledge representation; Generalization; Assessing the success of learning; Steps to apply machine learning to your data; Choosing a machine learning algorithm; Thinking about the input data; Thinking about types of machine learning algorithms; Matching your data to an appropriate algorithm.
  • Using R for machine learning; Installing and loading R packages; Installing an R package; Installing a package using the point-and-click interface; Loading an R package; Summary; Chapter 2: Managing and Understanding Data; R data structures; Vectors; Factors; Lists; Data frames; Matrixes and arrays; Managing data with R; Saving and loading R data structures; Importing and saving data from CSV files; Importing data from SQL databases; Exploring and understanding data; Exploring the structure of data; Exploring numeric variables; Measuring the central tendency - mean and median.
  • Measuring spread - quartiles and the five-number summary; Visualizing numeric variables - boxplots; Visualizing numeric variables - histograms; Understanding numeric data - uniform and normal distributions; Measuring spread - variance and standard deviation; Exploring categorical variables; Measuring the central tendency - the mode; Exploring relationships between variables; Visualizing relationships - scatterplots; Examining relationships - two-way cross-tabulations; Summary; Chapter 3: Lazy Learning - Classification using Nearest Neighbors; Understanding classification using nearest neighbors.
  • The kNN algorithm; Calculating distance; Choosing an appropriate k; Preparing data for use with kNN; Why is the kNN algorithm lazy?; Diagnosing breast cancer with the kNN algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Transformation - normalizing numeric data; Data preparation - creating training and test datasets; Step 3 - training a model on the data; Step 4 - evaluating model performance; Step 5 - improving model performance; Transformation - z-score standardization; Testing alternative values of k; Summary.
  • Chapter 4: Probabilistic Learning - Classification using Naive Bayes; Understanding naive Bayes; Basic concepts of Bayesian methods; Probability; Joint probability; Conditional probability with Bayes' theorem; The naive Bayes algorithm; The naive Bayes classification; The Laplace estimator; Using numeric features with naive Bayes; Example - filtering mobile phone spam with the naive Bayes algorithm; Step 1 - collecting data; Step 2 - exploring and preparing the data; Data preparation - processing text data for analysis; Data preparation - creating training and test datasets.
  • Visualizing text data - word clouds.