Practical Statistics for Data Scientists

Evaluating classification models

Accuracy

The percent/proportion of cases classified correctly
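Written as a formula, with TP, TN, FP, and FN denoting the counts of true positives, true negatives, false positives, and false negatives (this notation is an assumption introduced here for illustration):

```latex
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
```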


Confusion matrix

A tabular display of the record counts by their predicted and actual classification status
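A minimal sketch of building such a table in R. The label and prediction vectors are made up for illustration. The layout (actual classes in rows, predicted classes in columns, the 1 class first) is an assumption chosen to match the indexing used in the R snippets elsewhere in these notes.

```r
# Hypothetical labels and predictions (1 = positive class, 0 = negative class)
true_y <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
pred_y <- c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0)

# Count each of the four possible outcomes
true_pos  <- sum(true_y == 1 & pred_y == 1)
false_neg <- sum(true_y == 1 & pred_y == 0)
false_pos <- sum(true_y == 0 & pred_y == 1)
true_neg  <- sum(true_y == 0 & pred_y == 0)

# matrix() fills column by column, so the first column holds predicted 1s
# and the first row holds actual 1s
conf_mat <- matrix(c(true_pos, false_pos, false_neg, true_neg), nrow = 2,
                   dimnames = list(c("Y = 1", "Y = 0"),
                                   c("Yhat = 1", "Yhat = 0")))
print(conf_mat)

# Accuracy: correct predictions (the diagonal) over all records
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

With this layout, the diagonal holds the correct predictions and the off-diagonal cells hold the two kinds of error.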


The rare class problem

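When the 1s are rare, accuracy alone can be misleading: a model that predicts 0 for every case still scores high. Depending on the relative cost of each error type, a trade-off has to be made between false positives and false negatives.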
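A small sketch of why the class balance and the error costs matter. All numbers are hypothetical, including the assumed 50-to-1 cost ratio between a missed 1 and a false alarm.

```r
# Hypothetical data: 1,000 cases, of which only 10 are 1s
true_y <- c(rep(1, 10), rep(0, 990))

# A trivial "model" that predicts 0 for every case
pred_y <- rep(0, length(true_y))

# Share of cases classified correctly
accuracy <- mean(pred_y == true_y)

# Share of the actual 1s that the model finds
recall <- sum(pred_y == 1 & true_y == 1) / sum(true_y == 1)

# Assumed costs: missing a 1 is taken to be 50 times worse than a false alarm
cost_fn <- 50
cost_fp <- 1
total_cost <- cost_fn * sum(pred_y == 0 & true_y == 1) +
  cost_fp * sum(pred_y == 1 & true_y == 0)

print(c(accuracy = accuracy, recall = recall, total_cost = total_cost))
```

The trivial model is 99% accurate yet finds none of the 1s, and under the assumed costs every missed 1 is expensive. A model that accepts some false positives in exchange for fewer false negatives could have a lower total cost despite lower accuracy.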

Precision, recall, and specificity

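| Term | Description | Interpretation | Formula | R code |
|---|---|---|---|---|
| Specificity | The percent/proportion of 0s correctly classified | Measures a model's ability to predict a negative outcome | TN / (TN + FP) | conf_mat[2,2]/sum(conf_mat[2,]) |
| Precision | The percent/proportion of predicted 1s that are actually 1s | Measures the accuracy of a predicted positive outcome | TP / (TP + FP) | conf_mat[1,1]/sum(conf_mat[,1]) |
| Sensitivity/Recall | The percent/proportion of 1s correctly classified | Measures the strength of the model to predict a positive outcome | TP / (TP + FN) | conf_mat[1,1]/sum(conf_mat[1,]) |

Here TP, FP, TN, and FN are the counts of true positives, false positives, true negatives, and false negatives. The R code assumes conf_mat has the actual classes in rows and the predicted classes in columns, with the 1 class first.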
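The three R expressions above can be tried on a small confusion matrix. The counts below are hypothetical, and the layout (rows = actual, columns = predicted, 1 class first) is the one the indexing implies.

```r
# Hypothetical counts: TP = 20, FP = 30, FN = 10, TN = 940
# matrix() fills by column: column 1 = predicted 1s, column 2 = predicted 0s
conf_mat <- matrix(c(20, 30, 10, 940), nrow = 2,
                   dimnames = list(c("Y = 1", "Y = 0"),
                                   c("Yhat = 1", "Yhat = 0")))

# Precision: of the cases predicted as 1, the share that really are 1
precision <- conf_mat[1, 1] / sum(conf_mat[, 1])

# Recall (sensitivity): of the actual 1s, the share predicted as 1
recall <- conf_mat[1, 1] / sum(conf_mat[1, ])

# Specificity: of the actual 0s, the share predicted as 0
specificity <- conf_mat[2, 2] / sum(conf_mat[2, ])

print(round(c(precision = precision, recall = recall,
              specificity = specificity), 3))
```

Note that precision divides by a column total (everything predicted as 1), while recall and specificity divide by row totals (everything that actually is 1 or 0).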

Receiver Operating Characteristics (ROC) curve

A plot of the trade-off between sensitivity and specificity as the classification cutoff varies
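A minimal sketch of how the points on a ROC curve can be computed by hand, assuming a vector of predicted probabilities and the matching true labels. Both vectors are hypothetical, and the rule "classify as 1 when the probability is at or above the cutoff" is an assumption of this sketch.

```r
# Hypothetical predicted probabilities and true labels
prob   <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2)
true_y <- c(1,   1,   0,   1,   0,   0,   1,   0)

# Sweep the cutoff from strict (nothing predicted as 1) to lenient
cutoffs <- c(Inf, sort(unique(prob), decreasing = TRUE))

# Sensitivity: share of actual 1s with prob >= cutoff
sensitivity <- sapply(cutoffs, function(k) {
  sum(prob >= k & true_y == 1) / sum(true_y == 1)
})

# Specificity: share of actual 0s with prob < cutoff
specificity <- sapply(cutoffs, function(k) {
  sum(prob < k & true_y == 0) / sum(true_y == 0)
})

roc <- data.frame(cutoff = cutoffs, sensitivity, specificity)
print(roc)
```

Lowering the cutoff raises sensitivity and lowers specificity. One common way to draw the curve is `plot(1 - roc$specificity, roc$sensitivity, type = "l")`, with a diagonal line marking a classifier that does no better than chance.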

Precision-recall curve


Area under the curve (AUC)
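AUC condenses the ROC curve into a single number: 1 corresponds to a perfect classifier and 0.5 to one that does no better than chance. The sketch below approximates it with the trapezoidal rule on hypothetical data. The variable names and the cutoff rule (predict 1 when the probability is at or above the cutoff) are assumptions made for illustration.

```r
# Hypothetical predicted probabilities and true labels
prob   <- c(0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2)
true_y <- c(1,   1,   0,   1,   0,   0,   1,   0)

# ROC points from the strictest to the most lenient cutoff
cutoffs <- c(Inf, sort(unique(prob), decreasing = TRUE))
tpr <- sapply(cutoffs, function(k) {
  sum(prob >= k & true_y == 1) / sum(true_y == 1)  # sensitivity
})
fpr <- sapply(cutoffs, function(k) {
  sum(prob >= k & true_y == 0) / sum(true_y == 0)  # 1 - specificity
})

# Trapezoidal rule: width of each step times the average height
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
print(auc)
```

The result equals the share of (1, 0) pairs in which the 1 received the higher predicted probability, which gives a useful hand check on small examples.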


Lift

A measure of how effective the model is at identifying (comparatively rare) 1s at different probability cutoffs
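One way to make this concrete, as a sketch on hypothetical data: at each cutoff, compare the rate of 1s among the cases the model selects with the overall rate of 1s. The function name `lift_at` and the chosen cutoffs are illustrative assumptions.

```r
# Hypothetical predicted probabilities and true labels (3 of 10 cases are 1s)
prob   <- c(0.95, 0.90, 0.85, 0.70, 0.60, 0.45, 0.40, 0.30, 0.20, 0.10)
true_y <- c(1,    1,    0,    1,    0,    0,    0,    0,    0,    0)

# Rate of 1s if cases were picked at random
base_rate <- mean(true_y)

# Lift at a cutoff: rate of 1s among selected cases, relative to the base rate
lift_at <- function(cutoff) {
  selected <- prob >= cutoff
  mean(true_y[selected]) / base_rate
}

cutoffs <- c(0.9, 0.7, 0.4, 0.1)
lift <- sapply(cutoffs, lift_at)
print(data.frame(cutoff = cutoffs, lift = round(lift, 2)))
```

A lift above 1 means the selected cases are richer in 1s than a random selection would be. At the lowest cutoff every case is selected, so the lift falls back to 1.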

 