DA-HW4-Xinyu Wu
# Problem 2

Diamond Pricing

## Question 1: Linear Model

### 1. Fairness Judgement

ASSUMPTION:
To fit the model, all available factors were included as predictors (carat, color, clarity, cut, certification, polish, symmetry, and wholesaler).
Since the problem statement does not specify the wholesaler, a prediction was made for each of Wholesalers #1 to #3.
FOUND:
The price predicted by the linear model is \$2916.88 for Wholesaler #1.
The price predicted by the linear model is \$3031.52 for Wholesaler #2.
The price predicted by the linear model is \$1450.98 for Wholesaler #3.

### 2. Interpretation

Carat: Holding the other factors fixed, each additional carat increases the expected price by \$1872.08.
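
As a small worked example of reading the carat coefficient (the symbols $\Delta\text{price}$ and $\Delta\text{carat}$ are notation introduced here, not taken from the assignment), the change in expected price scales linearly with the change in carat when everything else is held fixed:

$$
\Delta\text{price} = 1872.08 \times \Delta\text{carat}, \qquad \text{e.g. } 1872.08 \times 0.10 \approx 187.21
$$

So, under this model, a diamond that is 0.10 carat heavier would be expected to cost roughly 187.21 more.
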
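Color: Every listed color grade has a negative coefficient relative to the baseline grade, and the discount grows steadily from E (-164.31) to L (-895.61).

Color Type Coefficient

| Name | Value |
| --- | --- |
| ColourE | -164.31 |
| ColourG | -229.79 |
| ColourH | -301.72 |
| ColourI | -339.41 |
| ColourJ | -436.10 |
| ColourK | -695.55 |
| ColourL | -895.61 |
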
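
To compare two color grades with each other rather than with the baseline, the difference of their coefficients can be used. The sketch below copies the color coefficients listed above; the helper name `colour_gap` is hypothetical and only illustrates the arithmetic.

```python
# Colour coefficients copied from the report (each is relative to the omitted baseline grade).
colour_coef = {
    "ColourE": -164.31,
    "ColourG": -229.79,
    "ColourH": -301.72,
    "ColourI": -339.41,
    "ColourJ": -436.10,
    "ColourK": -695.55,
    "ColourL": -895.61,
}


def colour_gap(grade_a, grade_b):
    """Expected price of grade_a minus grade_b, all other factors held equal.

    Hypothetical helper: the baseline cancels out when two coefficients are subtracted.
    """
    return round(colour_coef[grade_a] - colour_coef[grade_b], 2)


# An E-colour diamond versus an otherwise identical L-colour diamond.
print(colour_gap("ColourE", "ColourL"))
```

Under the fitted model, the E-color diamond is expected to cost about 731.30 more than the otherwise identical L-color one.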
Clarity: Relative to the baseline clarity grade, every listed grade except I2 has a positive coefficient. VVS1 adds the most (1015.05), while I2 lowers the price by 589.02.

Clarity Type Coefficient

| Name | Value |
| --- | --- |
| ClarityI2 | -589.02 |
| ClaritySI1 | 654.45 |
| ClaritySI2 | 560.80 |
| ClaritySI3 | 290.02 |
| ClarityVS1 | 744.73 |
| ClarityVS2 | 691.73 |
| ClarityVVS1 | 1015.05 |
| ClarityVVS2 | 732.79 |

Cut: Every listed cut grade has a positive coefficient relative to the baseline cut, ranging from 48.74 (Cut.G) to 93.80 (Cut.X).

Cut Type Coefficient

| Name | Value |
| --- | --- |
| Cut.G | 48.74 |
| Cut.I | 84.79 |
| Cut.V | 78.41 |
| Cut.X | 93.80 |

Certification: Relative to the baseline certification, DOW, EGL, and IGI lower the price, while GIA raises it slightly (10.77).

Certification Type Coefficient

| Name | Value |
| --- | --- |
| CertificationDOW | -271.03 |
| CertificationEGL | -308.77 |
| CertificationGIA | 10.77 |
| CertificationIGI | -131.11 |

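Polish: Every listed polish grade has a positive coefficient relative to the baseline grade, with PolishI the largest (244.09).

Polish Type Coefficient

| Name | Value |
| --- | --- |
| PolishG | 66.10 |
| PolishI | 244.09 |
| Polishv | 136.02 |
| PolishV | 77.79 |
| PolishX | 82.89 |
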
Symmetry: Every listed symmetry grade has a positive coefficient relative to the baseline grade, between 133.72 and 149.06.

Symmetry Type Coefficient

| Name | Value |
| --- | --- |
| SymmetryG | 133.72 |
| SymmetryV | 149.06 |
| SymmetryX | 136.45 |

Wholesaler: Relative to Wholesaler 1 (the baseline), Wholesaler 2 raises the expected price by 114.65, while Wholesaler 3 lowers it by 1465.89.

Wholesaler Type Coefficient

| Name | Value |
| --- | --- |
| Wholesaler2 | 114.65 |
| Wholesaler3 | -1465.89 |

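
As a consistency check, the wholesaler coefficients should reproduce the gaps between the three predictions reported in the Fairness Judgement section. The sketch below assumes Wholesaler 1 is the baseline level (it has no coefficient of its own); the variable names are illustrative only.

```python
# Prediction for Wholesaler #1 and wholesaler coefficients, copied from the report.
pred_w1 = 2916.88
coef = {"Wholesaler2": 114.65, "Wholesaler3": -1465.89}

# Assumption: Wholesaler 1 is the baseline, so switching wholesaler
# only adds the corresponding coefficient to the prediction.
pred_w2 = pred_w1 + coef["Wholesaler2"]
pred_w3 = pred_w1 + coef["Wholesaler3"]

# Reported values are 3031.52 and 1450.98.
print(round(pred_w2, 2), round(pred_w3, 2))
```

The reconstructed values agree with the reported predictions to within 0.01, which is consistent with the coefficients having been rounded to two decimals.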
The R-squared value is about 0.9861, meaning the model explains roughly 98.61% of the variation in price in the data, which indicates a good fit.
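
For reference, the usual definition of R-squared for a linear model with an intercept is sketched below. It is assumed here that the reported value is this standard (multiple) R-squared; the symbols $y_i$ (observed price), $\hat{y}_i$ (predicted price), and $\bar{y}$ (mean observed price) are notation introduced for this illustration.

$$
R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}
$$

A value of 0.9861 therefore means the residual sum of squares is about 1.39% of the total sum of squares.
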

## Question 2: Second Model (without wholesaler #3)

### 1. Impact and Comparison

FOUND:
Plotting price against carat shows that the prices of Wholesaler #3 are much lower than those of Wholesalers #1 and #2, which is why Wholesaler #3 should be dropped.
Also, the carat values of the Wholesaler #3 diamonds are smaller than those of the other wholesalers, so these low-value observations can be dropped.
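
The comparison and the filtering step could be sketched as follows. The rows are made-up toy values, not the assignment data, and the function name `median_price_by_wholesaler` is hypothetical; the sketch only shows one way to summarize price per wholesaler and then exclude Wholesaler #3 before refitting.

```python
from statistics import median

# Hypothetical toy rows (wholesaler, carat, price); NOT the homework data.
rows = [
    (1, 1.01, 5200.0),
    (1, 0.90, 4300.0),
    (2, 1.20, 6100.0),
    (2, 1.05, 5400.0),
    (3, 0.25, 450.0),
    (3, 0.30, 520.0),
]


def median_price_by_wholesaler(data):
    """Group prices by wholesaler and return the median price of each group."""
    groups = {}
    for wholesaler, _carat, price in data:
        groups.setdefault(wholesaler, []).append(price)
    return {w: median(prices) for w, prices in groups.items()}


# Summarize before filtering, then drop every Wholesaler #3 row.
summary = median_price_by_wholesaler(rows)
kept = [row for row in rows if row[0] != 3]

print(summary)
print(len(rows), "rows before,", len(kept), "rows after dropping Wholesaler #3")
```

The second model would then be fitted on `kept` only, so the wholesaler factor has two levels instead of three.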