DA-HW4-Xinyu Wu
Problem 2

Diamond Pricing

Question 1: Linear Model

1. Fairness Judgement

ASSUMPTION:
The model uses all available factors as predictors (carat, color, clarity, cut, certification, polish, symmetry, and wholesaler).
Because the given diamond does not specify a wholesaler, a prediction was made for each of Wholesalers 1 to 3.
FOUND:
The linear model predicts a price of $2916.88 for Wholesaler #1.
The linear model predicts a price of $3031.52 for Wholesaler #2.
The linear model predicts a price of $1450.98 for Wholesaler #3.

2. Interpretation

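Carat: Each one-unit increase in carat raises the expected price by $1872.08, with all other factors held fixed.
Color: Every listed color level has a negative coefficient, so each one lowers the expected price relative to the baseline color level.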
Color Type Coefficient
Name       Value
ColourE    -164.31
ColourG    -229.79
ColourH    -301.72
ColourI    -339.41
ColourJ    -436.10
ColourK    -695.55
ColourL    -895.61
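Because each color coefficient is an offset from the baseline color level (the level without its own coefficient), the expected price gap between two otherwise identical diamonds is the difference of their coefficients. A minimal Python sketch using the values from the table above; the helper name `price_gap` is hypothetical and not part of the original analysis, and treating unlisted levels as the baseline is an assumption about how the model was coded:

```python
# Color coefficients copied from the table above (offsets from the baseline level).
COLOUR_COEF = {
    "ColourE": -164.31,
    "ColourG": -229.79,
    "ColourH": -301.72,
    "ColourI": -339.41,
    "ColourJ": -436.10,
    "ColourK": -695.55,
    "ColourL": -895.61,
}


def price_gap(level_a, level_b, coef=COLOUR_COEF):
    """Expected price of level_b minus level_a, all other factors equal.

    A level missing from the table is treated as the baseline (offset 0);
    this is an assumption, not something the report states.
    """
    return round(coef.get(level_b, 0.0) - coef.get(level_a, 0.0), 2)


# Moving from color E to color G, everything else unchanged.
print(price_gap("ColourE", "ColourG"))
```

Under these assumptions, a color G diamond is expected to cost about $65.48 less than an otherwise identical color E diamond.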
Clarity: Most clarity levels raise the expected price relative to the baseline clarity level; only I2 has a negative coefficient.
Clarity Type Coefficient
Name           Value
ClarityI2      -589.02
ClaritySI1     654.45
ClaritySI2     560.80
ClaritySI3     290.02
ClarityVS1     744.73
ClarityVS2     691.73
ClarityVVS1    1015.05
ClarityVVS2    732.79
Cut: Every listed cut level has a positive coefficient, so each one raises the expected price relative to the baseline cut level.
Cut Type Coefficient
Name     Value
Cut.G    48.74
Cut.I    84.79
Cut.V    78.41
Cut.X    93.80
Certification: Most certification levels lower the expected price relative to the baseline certification; only GIA has a (small) positive coefficient.
Certification Type Coefficient
Name                Value
CertificationDOW    -271.03
CertificationEGL    -308.77
CertificationGIA    10.77
CertificationIGI    -131.11
Polish: Every listed polish level has a positive coefficient, so each one raises the expected price relative to the baseline polish level.
Polish Type Coefficient
Name       Value
PolishG    66.10
PolishI    244.09
Polishv    136.02
PolishV    77.79
PolishX    82.89
Symmetry: Every listed symmetry level has a positive coefficient, so each one raises the expected price relative to the baseline symmetry level.
Symmetry Type Coefficient
Name         Value
SymmetryG    133.72
SymmetryV    149.06
SymmetryX    136.45
Wholesaler: The effect depends on the wholesaler. Relative to Wholesaler 1, Wholesaler 2 raises the expected price by 114.65 and Wholesaler 3 lowers it by 1465.89.
Wholesaler Type Coefficient
Name           Value
Wholesaler2    114.65
Wholesaler3    -1465.89
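As a consistency check, the wholesaler coefficients can be compared with the three predictions reported under "1. Fairness Judgement". The sketch below assumes Wholesaler 1 is the baseline level (it has no coefficient of its own); the function name `implied_prediction` is hypothetical:

```python
# Values copied from the report.
PRED_W1 = 2916.88  # prediction for Wholesaler #1 (assumed baseline level)
WHOLESALER_COEF = {"Wholesaler2": 114.65, "Wholesaler3": -1465.89}
REPORTED = {"Wholesaler2": 3031.52, "Wholesaler3": 1450.98}


def implied_prediction(level):
    """Wholesaler #1 prediction shifted by the given wholesaler's coefficient."""
    return round(PRED_W1 + WHOLESALER_COEF[level], 2)


for level, reported in REPORTED.items():
    implied = implied_prediction(level)
    # Print the implied value, the reported value, and their gap.
    print(level, implied, reported, round(implied - reported, 2))
```

The implied and reported values agree to within one cent, which is consistent with the coefficients having been rounded to two decimals.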

3. Model Comments

The R-squared value is about 0.9861, which is very high, so the model can be considered a good fit to the sample data.
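For reference, R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of that calculation follows; the prices are made up for illustration and are not the homework data, and `r_squared` is a hypothetical helper:

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot for paired observations."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up prices, for illustration only.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3100.0, 3900.0]
print(r_squared(actual, predicted))
```

A value near 1 means the predictions leave little of the variation in the sample prices unexplained.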

Question 2: Second Model (without wholesaler #3)

1. Impact and Comparison

[Figure: relationship between carat and price]
FOUND:
Plotting price against carat shows that prices for Wholesaler #3 are much lower than those for the other wholesalers, which is why Wholesaler #3 should be dropped.
The carat values for Wholesaler #3 are also smaller than those of the other wholesalers, which further supports dropping it.

2. Model Comments

Model 2:
FOUND:
Overall, models 1 and 2 do not differ much.
Carat has a significant and large influence on price.
The R-squared value for model 2 (0.7798) is lower than that of model 1 (0.9861).
Model 2 is nevertheless judged to be more correct than model 1.
Goodness of fit only shows how accurately the model fits the sample data; it fails to show whether the model is suitable for explaining the real situation. Hence, over-fitting should be avoided in future analysis.
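One hedged way to check whether a high R-squared reflects over-fitting is to score the model on data that was not used for fitting. The sketch below fits a one-variable least-squares line and compares in-sample and held-out R-squared; the (carat, price) pairs are made up, and the one-predictor model is a simplification, not the model used in this report:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (closed form)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot for paired observations."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up (carat, price) pairs, for illustration only.
train = [(0.5, 1000.0), (1.0, 2000.0), (1.5, 3000.0), (2.0, 4200.0)]
held_out = [(0.75, 1400.0), (1.25, 2600.0), (1.75, 3500.0)]

# Fit on the training rows only.
intercept, slope = fit_line([c for c, _ in train], [p for _, p in train])


def score(rows):
    """R^2 of the fitted line on the given (carat, price) rows."""
    preds = [intercept + slope * c for c, _ in rows]
    return r_squared([p for _, p in rows], preds)


print("in-sample R^2:", score(train))
print("held-out R^2:", score(held_out))
```

A large drop from the in-sample score to the held-out score would suggest that the model has adapted to the sample rather than to the underlying relationship.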
