The Art of Machine Learning

A Hands-On Guide to Machine Learning with R

About

Learn to expertly apply a range of machine learning methods to real data with this practical guide.

Packed with real datasets and practical examples, The Art of Machine Learning will help you develop an intuitive understanding of how and why ML methods work, without the need for advanced math.

As you work through the book, you’ll learn how to implement a range of powerful ML techniques, starting with the k-Nearest Neighbors (k-NN) method and random forests, and moving on to gradient boosting, support vector machines (SVMs), neural networks, and more.

Working with real datasets, you’ll delve into regression models using bike-sharing data, explore decision trees with New York City taxi data, and dissect parametric methods with baseball player stats. You’ll also find expert tips for handling common problems, such as “dirty” or unbalanced data, and for troubleshooting pitfalls.

You’ll also explore:

  • How to deal with large datasets and techniques for dimension reduction
  • Details on how the Bias-Variance Trade-off plays out in specific ML methods
  • Models based on linear relationships, including ridge and LASSO regression
  • Real-world image and text classification and how to handle time series data

Machine learning is an art that requires careful tuning and tweaking. With The Art of Machine Learning as your guide, you’ll master the underlying principles of ML that will empower you to effectively use these models, rather than simply provide a few stock actions with limited practical use.

Requirements: A basic understanding of graphs and charts and familiarity with the R programming language

Reviews

"In contrast to other books about machine learning, there is a bigger emphasis on programming and usage in practice. In particular, there is an excellent explanation of how to avoid over/under-fitting, and how to use cross-validation. This book is sure to be helpful for students who are interested to understand the core concepts, as well as their practical implementations in R."
—Toby Dylan Hocking, Assistant Professor, Northern Arizona University

"The Art of Machine Learning by Norman Matloff is a welcome addition to a growing body of books about machine learning. Matloff, whose career spans both computer science and statistics, addresses the new and exciting field with a fresh approach."
—Dirk Eddelbuettel, Department of Statistics, University of Illinois

Author

Norman Matloff is an award-winning professor at the University of California, Davis. Matloff has a PhD in mathematics from UCLA and is the author of The Art of Debugging with GDB, DDD, and Eclipse and The Art of R Programming (both from No Starch Press).

Table of Contents

Acknowledgments
Introduction

PART I: PROLOGUE, AND NEIGHBORHOOD-BASED METHODS
Chapter 1: Regression Models
Chapter 2: Classification Models
Chapter 3: Bias, Variance, Overfitting, and Cross-Validation
Chapter 4: Dealing with Large Numbers of Features
PART II: TREE-BASED METHODS
Chapter 5: A Step Beyond k-NN: Decision Trees
Chapter 6: Tweaking the Trees
Chapter 7: Finding a Good Set of Hyperparameters
PART III: METHODS BASED ON LINEAR RELATIONSHIPS
Chapter 8: Parametric Methods
Chapter 9: Cutting Things Down to Size: Regularization
PART IV: METHODS BASED ON SEPARATING LINES AND PLANES
Chapter 10: A Boundary Approach: Support Vector Machines
Chapter 11: Linear Models on Steroids: Neural Networks
PART V: APPLICATIONS
Chapter 12: Image Classification
Chapter 13: Handling Time Series and Text Data
Appendix A: List of Acronyms and Symbols
Appendix B: Statistics and ML Terminology Correspondence
Appendix C: Matrices, Data Frames, and Factor Conversions
Appendix D: Pitfall: Beware of “p-Hacking”!
