Project 2: Prediction Challenge

DAPI Project 2: Predictions for Highway Operations

DAPI Analytics Project

Project and presentation due on


You are part of an analytics team that is trying to inform amenities design for an interstate highway by evaluating drivers’ preferences. In other words, what types of amenities should be built along the highway?
The data comes from an experiment in which a promotional coupon is offered to a driver. The promotion is based on a type of amenity such as Bar, Carry out & Take away, Coffee House, or Restaurant. If the driver accepts the promotion, it can be inferred that they prefer the same type of amenity along the highway.
A survey is conducted along with the coupon. The survey describes different driving scenarios, including the destination, current time, weather, passenger, etc., and then asks the driver whether they will accept the promotion.


Run a predictive model to assess the likelihood of a driver accepting the promotional coupon. Then, interpret your model to recommend amenities for an upcoming highway. Recommendations and insights must be linked to the feature importance of your selected model.
In addition, we will evaluate the performance of your model using the team submission file. This file contains a prediction for each of 2684 drivers: 1 if they accept the coupon and 0 if they do not. The file will be evaluated using accuracy and F-score.
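
For reference, both evaluation metrics can be computed from a vector of true labels and a vector of predicted labels. The following is a minimal R sketch, assuming that "f-score" means the F1 score with 1 (coupon accepted) as the positive class. The function name `score_predictions` and the toy vectors are illustrative only and are not part of the project files.

```r
# Compute accuracy, precision, recall, and F1 for binary 0/1 labels.
# Assumption: the positive class is 1 (coupon accepted).
score_predictions <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  tp <- sum(predicted == 1 & actual == 1)  # true positives
  fp <- sum(predicted == 1 & actual == 0)  # false positives
  fn <- sum(predicted == 0 & actual == 1)  # false negatives
  accuracy  <- mean(predicted == actual)
  precision <- if (tp + fp > 0) tp / (tp + fp) else 0
  recall    <- if (tp + fn > 0) tp / (tp + fn) else 0
  f1 <- if (precision + recall > 0) {
    2 * precision * recall / (precision + recall)
  } else {
    0
  }
  c(accuracy = accuracy, precision = precision, recall = recall, f1 = f1)
}

# Toy example (made-up labels, not project data)
actual    <- c(1, 0, 1, 1, 0, 0, 1, 0)
predicted <- c(1, 0, 0, 1, 0, 1, 1, 0)
scores <- score_predictions(actual, predicted)
print(round(scores, 3))
```

Reporting precision and recall alongside F1 makes it easier to see whether a model is over-predicting or under-predicting coupon acceptance.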

Submission Requirements

This project is competition style: teams build a predictive model, tune it, and score a set of 2685 records. In addition, teams need to interpret the model and provide operations analytics recommendations that emerge from the data and the results of the predictive modeling.

At the end, teams should complete and submit the following:
Data Review, EDA, and Feature Engineering: your approach for understanding, preparing, and transforming the data before modeling
A 15-minute Presentation: summary of the business problem, work completed, results, and recommendations
Model Generation in R: development of models, tuning, and selection of a best model
Predictions: predictions on a set of new users. Visualize results as a histogram. Use this file name: team#_submission.csv (TIP: Don’t change the layout of the submission file; only add your predictions)
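
The prediction deliverable can be sketched in R as follows. This is only an illustration under stated assumptions: the submission template is assumed to have an `id` column and a `Y` column to fill in, `pred_prob` stands in for probabilities from whatever model the team selects, the 0.5 cutoff is arbitrary, and `team1` is a placeholder team number. The file is written to a temporary directory so the sketch runs anywhere.

```r
# Hypothetical stand-in for the provided submission template.
# Assumption: it has an `id` column and an empty `Y` column to fill in.
submission <- data.frame(id = 1:6, Y = NA_integer_)

# Hypothetical predicted acceptance probabilities from a fitted model.
pred_prob <- c(0.81, 0.35, 0.62, 0.10, 0.50, 0.47)

# Convert probabilities to 0/1 labels; the 0.5 cutoff is an assumption
# and can be tuned on a validation set.
submission$Y <- as.integer(pred_prob >= 0.5)

# Visualize the distribution of predicted probabilities as a histogram.
hist(pred_prob, breaks = seq(0, 1, by = 0.1),
     main = "Predicted acceptance probability", xlab = "Probability")

# Keep the original column layout; only Y has been filled in.
out_file <- file.path(tempdir(), "team1_submission.csv")
write.csv(submission, out_file, row.names = FALSE)
```

In practice the template would be loaded with `read.csv()` from the provided file rather than built by hand, and the output would be written to the team's working directory.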


Presentation: 20%
Model performance and technical approach: 50%
Recommendations: 30%

Presentation Expectations

A 15-minute presentation that includes:
Business problem you are trying to solve
Your approach premodeling: (1) Data review (2) EDA and (3) Feature Engineering
Review of modeling attempts (please use a table to summarize attempts)
Selected model and feature importance
Model interpretation
Implications and recommendations


Data for scoring and submission process

id: unique number given to the driver taking the survey
destination: No Urgent Place, Home, Work
passanger: Alone, Friend(s), Kid(s), Partner (who are the passengers in the car)
weather: Sunny, Rainy, Snowy
temperature: 55, 80, 30
time: 2PM, 10AM, 6PM, 7AM, 10PM
coupon: Restaurant(<$20), Coffee House, Carry out & Take away, Bar, Restaurant($20-$50)
expiration: 1d, 2h (the coupon expires in 1 day or in 2 hours)
gender: Female, Male
age: 21, 46, 26, 31, 41, 50plus, 36, below21
maritalStatus: Unmarried partner, Single, Married partner, Divorced, Widowed
has_Children: 1, 0
education: Some college - no degree, Bachelors degree, Associates degree, High School Graduate, Graduate degree (Masters or Doctorate), Some High School
occupation: Unemployed, Architecture & Engineering, Student, Education&Training&Library, Healthcare Support, Healthcare Practitioners & Technical, Sales & Related, Management, Arts Design Entertainment Sports & Media, Computer & Mathematical, Life Physical Social Science, Personal Care & Service, Community & Social Services, Office & Administrative Support, Construction & Extraction, Legal, Retired, Installation Maintenance & Repair, Transportation & Material Moving, Business & Financial, Protective Service, Food Preparation & Serving Related, Production Occupations, Building & Grounds Cleaning & Maintenance, Farming Fishing & Forestry
income: $37500 - $49999, $62500 - $74999, $12500 - $24999, $75000 - $87499, $50000 - $62499, $25000 - $37499, $100000 or More, $87500 - $99999, Less than $12500
Bar: never, less1, 1~3, gt8, nan, 4~8 (feature meaning: how many times do you go to a bar every month?)
CoffeeHouse: never, less1, 4~8, 1~3, gt8, nan (feature meaning: how many times do you go to a coffeehouse every month?)
CarryAway: 4~8, 1~3, gt8, less1, never (feature meaning: how many times do you get take-away food every month?)
RestaurantLessThan20: 4~8, 1~3, less1, gt8, never (feature meaning: how many times do you go to a restaurant with an average expense per person of less than $20 every month?)
Restaurant20To50: 1~3, less1, never, gt8, 4~8, nan (feature meaning: how many times do you go to a restaurant with average expense per person of $20 - $50 every month?)
toCoupon_GEQ15min: 0, 1 (feature meaning: driving distance to the restaurant/bar for using the coupon is 15 minutes or more)
toCoupon_GEQ25min: 0, 1 (feature meaning: driving distance to the restaurant/bar for using the coupon is 25 minutes or more)
direction_same: 0, 1 (feature meaning: whether the restaurant/bar is in the same direction as your current destination)
direction_opp: 1, 0 (feature meaning: whether the restaurant/bar is in the opposite direction from your current destination)
Y: 1, 0 (whether the coupon is accepted)
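
The visit-frequency features (Bar, CoffeeHouse, CarryAway, RestaurantLessThan20, Restaurant20To50) have a natural order, which a plain categorical encoding discards. The R sketch below shows one possible way to encode them as ordered factors. It assumes the order never < less1 < 1~3 < 4~8 < gt8 and that the literal value `nan` marks a missing answer; the vector `bar_raw` is made-up toy data, not project data.

```r
# Assumed ordering of the visit-frequency levels, from least to most frequent.
freq_levels <- c("never", "less1", "1~3", "4~8", "gt8")

# Toy values in the style of the Bar column (not real project data).
bar_raw <- c("never", "1~3", "nan", "gt8", "less1", "4~8")

# Treat the literal string "nan" as missing, then build an ordered factor.
bar_raw[bar_raw == "nan"] <- NA
bar <- factor(bar_raw, levels = freq_levels, ordered = TRUE)

# Integer codes 1..5 preserve the order and can be used by models
# that expect numeric input; missing values stay NA.
bar_code <- as.integer(bar)
print(bar_code)
```

How the missing values are handled afterwards (imputation, a separate "unknown" level, or dropping rows) is a modeling choice left to each team.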


When building recommendations from your predictive model, you MUST consider the feature importance of your model.

