PROBLEM 3: Car Evaluation

Code

# Load the data and convert the coded predictors to factors
df_car <- read.csv("https://raw.githubusercontent.com/jcbonilla/BusinessAnalytics/master/BAData/car-data.csv", stringsAsFactors = TRUE)
head(df_car)
df_car$Boot.Space <- as.factor(df_car$Boot.Space)
df_car$Safety <- as.factor(df_car$Safety)

# Fit a logistic regression on the full data set and inspect the coefficients
car_model <- glm(Car.Acceptability ~ Car.Price + Main..Price + Doors + Persons + Boot.Space + Safety, data = df_car, family = binomial(link = "logit"))
summary(car_model)

# Split the rows 80/20 into training and test sets
set.seed(100)
train_fraction <- 0.8
trainingRowIndex <- sample(seq_len(nrow(df_car)), floor(train_fraction * nrow(df_car)))
trainingData <- df_car[trainingRowIndex, ]
testData <- df_car[-trainingRowIndex, ]

# Fit a classification tree on the training data and plot it
library(rpart)
car_formula <- Car.Acceptability ~ Car.Price + Main..Price + Doors + Persons + Boot.Space + Safety
car.rpart <- rpart(car_formula, data = trainingData, control = rpart.control(minsplit = 2))
# Install rattle only if it is missing, then load it for fancyRpartPlot()
if (!requireNamespace("rattle", quietly = TRUE)) install.packages("rattle")
library(rattle)
fancyRpartPlot(car.rpart, tweak = 1.2)

# Select the subtree with the minimum cross-validated error (xerror)
opt <- which.min(car.rpart$cptable[, "xerror"])
cp <- car.rpart$cptable[opt, "CP"]
# Prune the tree at that complexity parameter
car.prune <- prune(car.rpart, cp = cp)
fancyRpartPlot(car.prune, tweak = 1)

# Predict on the test data and compute the share of correct predictions
car.predict <- predict(car.prune, testData, type = "class")
accuracy <- mean(testData$Car.Acceptability == car.predict)
accuracy
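
A single accuracy figure hides which classes are being confused. As a minimal sketch, the snippet below builds a confusion matrix with `table()` and recovers the accuracy from its diagonal. The vectors `actual` and `predicted` and the class labels "acc" and "unacc" are made-up stand-ins for `testData$Car.Acceptability` and `car.predict`; they are not values taken from the car data.

```r
# Toy stand-ins (hypothetical labels) for testData$Car.Acceptability and car.predict
class_levels <- c("acc", "unacc")
actual    <- factor(c("acc", "acc", "unacc", "unacc", "unacc", "acc"), levels = class_levels)
predicted <- factor(c("acc", "unacc", "unacc", "unacc", "unacc", "acc"), levels = class_levels)

# Rows are the true classes, columns are the predicted classes
confusion <- table(Actual = actual, Predicted = predicted)
print(confusion)

# Accuracy is the share of observations that fall on the diagonal
toy_accuracy <- sum(diag(confusion)) / sum(confusion)
toy_accuracy
```

With the real objects, `table(Actual = testData$Car.Acceptability, Predicted = car.predict)` should produce the same kind of table.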

Results

Decision tree after pruning (figure: image.png)
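
The pruned tree is the subtree whose complexity parameter gives the lowest cross-validated error. The sketch below walks through that selection on the `kyphosis` data bundled with rpart, used here only as a stand-in that needs no download; `n_splits` is a hypothetical helper added for illustration.

```r
library(rpart)

# Stand-in data (assumption): kyphosis ships with rpart
set.seed(100)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", control = rpart.control(minsplit = 2))

# One row per candidate subtree: CP, nsplit, rel error, xerror, xstd
print(fit$cptable)

# Pick the row with the lowest cross-validated error and prune to its CP
best_row <- which.min(fit$cptable[, "xerror"])
best_cp  <- fit$cptable[best_row, "CP"]
pruned   <- prune(fit, cp = best_cp)

# Hypothetical helper: count the internal (non-leaf) nodes of a tree
n_splits <- function(tree) sum(tree$frame$var != "<leaf>")
c(full = n_splits(fit), pruned = n_splits(pruned))
```

Because `xerror` comes from random cross-validation folds, the chosen row can change between runs unless the seed is fixed before fitting.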

Conclusion

The pruned classification tree has an accuracy of 97.98% on the test data.
The bestseller is a car with a Safety value of 0 or 1, seating for at least 3 people, a low or medium price, and a Boot.Space value of 0 or 1. Car sellers should stock more cars that meet these requirements.
Safety is a very important criterion when people decide whether to buy a car, so car sellers should pay closer attention to safety and make sure that a car's safety indicator is at least 0.
Customers prefer cars with at least 3 seats, so sedans and SUVs are more acceptable than roadsters.
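
The claim about which criteria matter most can be checked against the fitted tree rather than read off the plot alone. As a hedged sketch, a fitted rpart object that has at least one split stores a `variable.importance` vector; the example below uses the `kyphosis` data bundled with rpart as a stand-in, so its numbers say nothing about the car data.

```r
library(rpart)

# Stand-in data (assumption): kyphosis ships with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Named numeric vector; sort so the most important predictor comes first
importance <- sort(fit$variable.importance, decreasing = TRUE)
print(importance)

# Express each predictor's importance as a percentage of the total
importance_pct <- round(100 * importance / sum(importance), 1)
importance_pct
```

On the car model, the equivalent object would be `car.prune$variable.importance`.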