Clearer Decision-Making with PCA

If you don’t mind alcohol, this example suggests some drinks might be better than others... The following table summarises the drinking habits of a handful of countries, along with key health indicators.
Kudos: This analysis is adapted from an example given at .
Drinking Habits by Country
| Name | Spirits | Wine | Beer | Life Expectancy | Heart Disease Rate |
|---|---|---|---|---|---|
| Austria | 1.2 | 15.7 | 102.1 | 78 | 173.0 |
| Czech Republic | 1.0 | 1.7 | 140.0 | 73 | 283.7 |
| France | 2.5 | 63.5 | 40.1 | 78 | 61.1 |
| Italy | 0.9 | 58.0 | 25.1 | 78 | 94.1 |
| Japan | 2.1 | 1.0 | 55.0 | 79 | 34.7 |
| Mexico | 0.8 | 0.2 | 50.4 | 73 | 36.4 |
| Russia | 3.8 | 2.7 | 17.1 | 69 | 373.6 |
| Switzerland | 1.7 | 46.0 | 65.0 | 78 | 106.4 |
| UK | 1.5 | 12.2 | 100.0 | 77 | 199.7 |
| USA | 2.0 | 8.9 | 87.8 | 76 | 176.0 |

Using the PCA pack, we can easily compute the first two Principal Components and represent this dataset along them:
Drinking Habits along first two Principal Components
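For readers who want to reproduce the projection outside the PCA pack, here is a minimal sketch in Python with NumPy. It is not the pack's implementation, whose internals are not described here. It assumes each column is standardized (z-scored) before the decomposition, which is a common choice when variables have different units, and it uses a singular value decomposition to get the components. The variable names (`X`, `Z`, `loadings`, `scores`) are illustrative. Note that the sign of each component is arbitrary, so the resulting map may be mirrored compared with the one shown here.

```python
import numpy as np

countries = ["Austria", "Czech Republic", "France", "Italy", "Japan",
             "Mexico", "Russia", "Switzerland", "UK", "USA"]
features = ["Spirits", "Wine", "Beer", "Life Expectancy", "Heart Disease Rate"]

# One row per country, columns in the order of `features` (values from the table).
X = np.array([
    [1.2, 15.7, 102.1, 78, 173.0],
    [1.0,  1.7, 140.0, 73, 283.7],
    [2.5, 63.5,  40.1, 78,  61.1],
    [0.9, 58.0,  25.1, 78,  94.1],
    [2.1,  1.0,  55.0, 79,  34.7],
    [0.8,  0.2,  50.4, 73,  36.4],
    [3.8,  2.7,  17.1, 69, 373.6],
    [1.7, 46.0,  65.0, 78, 106.4],
    [1.5, 12.2, 100.0, 77, 199.7],
    [2.0,  8.9,  87.8, 76, 176.0],
])

# Assumption: standardize every column so that no variable dominates
# simply because of its unit or scale.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# SVD of the standardized data: the rows of Vt are the principal directions,
# ordered by decreasing singular value.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

loadings = Vt[:2].T      # shape (5, 2): weight of each variable on PC1 and PC2
scores = Z @ loadings    # shape (10, 2): coordinates of each country on the map

for name, (pc1, pc2) in zip(countries, scores):
    print(f"{name:15s} PC1={pc1:+.2f}  PC2={pc2:+.2f}")

for feature, (l1, l2) in zip(features, loadings):
    print(f"{feature:20s} PC1 loading={l1:+.2f}  PC2 loading={l2:+.2f}")
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a two-dimensional map of the countries, and the `loadings` columns are what the interpretation of each component relies on.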
There is clearly an outlier, but let’s interpret the Principal Components first:
Quite predictably, life expectancy and heart disease rate are negatively correlated. Thus when moving from right to left on the map above, life expectancy increases and heart disease rate decreases.
More surprising is the positive correlation between life expectancy and wine consumption for these countries: the more wine you drink, the longer you're expected to live! (Full disclosure: I'm French). On the other hand, there's a positive correlation between heart disease rate and spirits consumption.
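The correlations mentioned above can be checked directly on the table, independently of the PCA. The sketch below computes Pearson correlation coefficients with NumPy's `np.corrcoef`. The column index names (`SPIRITS`, `WINE`, and so on) are only illustrative. With just ten countries, and Russia sitting far from the rest, the coefficients describe this small sample and nothing more.

```python
import numpy as np

# Columns: spirits, wine, beer, life expectancy, heart disease rate.
SPIRITS, WINE, BEER, LIFE, HEART = range(5)

X = np.array([
    [1.2, 15.7, 102.1, 78, 173.0],   # Austria
    [1.0,  1.7, 140.0, 73, 283.7],   # Czech Republic
    [2.5, 63.5,  40.1, 78,  61.1],   # France
    [0.9, 58.0,  25.1, 78,  94.1],   # Italy
    [2.1,  1.0,  55.0, 79,  34.7],   # Japan
    [0.8,  0.2,  50.4, 73,  36.4],   # Mexico
    [3.8,  2.7,  17.1, 69, 373.6],   # Russia
    [1.7, 46.0,  65.0, 78, 106.4],   # Switzerland
    [1.5, 12.2, 100.0, 77, 199.7],   # UK
    [2.0,  8.9,  87.8, 76, 176.0],   # USA
])

# rowvar=False: each column is a variable, each row an observation.
corr = np.corrcoef(X, rowvar=False)

print(f"life expectancy vs heart disease: {corr[LIFE, HEART]:+.2f}")
print(f"life expectancy vs wine:          {corr[LIFE, WINE]:+.2f}")
print(f"heart disease vs spirits:         {corr[HEART, SPIRITS]:+.2f}")
```

On this data the first coefficient comes out negative and the other two positive, in line with the reading of the first principal component.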
The loadings of the second principal component show a contrast between beer on one side and wine and spirits on the other. Put differently, the countries in our dataset can be distinguished by their preference between beer and drinks with a higher alcohol content (wine or spirits).
These loadings explain the vertical dispersion of the countries in the map above: from top to bottom, consumption of beer decreases while consumption of higher-alcohol beverages increases.
🍷 or 🍺, anyone?

These explanations are to be taken carefully, though: the first principal component explains only 46.03% of the variance in our data, while the first two components together explain 78.14%.
Besides this caveat, life expectancy and heart disease rates of course depend on many more variables that are not part of our dataset, such as access to health care.
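The share of variance explained by each component can be computed as its eigenvalue divided by the sum of all eigenvalues. Here is a minimal sketch, assuming the PCA is run on the correlation matrix (that is, on standardized columns). Whether it reproduces the exact percentages quoted above depends on the preprocessing the PCA pack applies, which is not stated here.

```python
import numpy as np

# Columns: spirits, wine, beer, life expectancy, heart disease rate.
X = np.array([
    [1.2, 15.7, 102.1, 78, 173.0],
    [1.0,  1.7, 140.0, 73, 283.7],
    [2.5, 63.5,  40.1, 78,  61.1],
    [0.9, 58.0,  25.1, 78,  94.1],
    [2.1,  1.0,  55.0, 79,  34.7],
    [0.8,  0.2,  50.4, 73,  36.4],
    [3.8,  2.7,  17.1, 69, 373.6],
    [1.7, 46.0,  65.0, 78, 106.4],
    [1.5, 12.2, 100.0, 77, 199.7],
    [2.0,  8.9,  87.8, 76, 176.0],
])

# Assumption: PCA on the correlation matrix (equivalent to standardized data).
corr = np.corrcoef(X, rowvar=False)

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
# so reverse them to get the largest component first.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

ratio = eigenvalues / eigenvalues.sum()   # share of variance per component
cumulative = np.cumsum(ratio)             # share explained by the first k components

for k, (r, c) in enumerate(zip(ratio, cumulative), start=1):
    print(f"PC{k}: {r:6.2%} of variance, {c:6.2%} cumulative")
```

The cumulative column shows how much is lost by keeping only two components: whatever remains after the second line is variance that the two-dimensional map cannot display.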
🧭 Now you’re well equipped to use PCA and create a ranking or map for your multidimensional data. Knowing you have the best visual representation will greatly simplify your decision-making!
👉 To use Principal Component Analysis in your own docs, see .