Data Visualization: From Data to Insights

Last edited 10 minutes ago by Slava Melanko
39syng.png
Intro
Analyzing raw data is often difficult and boring, especially when dealing with complex data sets. Data visualization is a powerful and at the same time easy way to understand such data sets, because it turns tons of numbers into meaningful insights that can be used to make better decisions. In this blog post, I'll try to prove that with a small real-world example.

This post was made with the help of AI tools.
🔎 Problem Statement: Snooping My Neighbors
In February, I noticed a list of utility debts for my building:

debt-list.png
The list of utility debts.
At first glance, the list shows the debt per apartment, the total debt at the end, and nothing more. Being a software engineer with almost a decade of experience, I wanted to create a visual representation of the data for better understanding.
🛠️ Some Technical Details
I'll skip most of the technical details and only mention that I'll use Python with the following libraries:
pandas is built on top of NumPy and provides convenient data structures and functions for data manipulation, analysis, and visualization.
Matplotlib provides a range of customizable plotting options for creating static, animated, and interactive visualizations.
seaborn is built on top of Matplotlib and provides a high-level interface for creating informative and attractive statistical graphics, including heatmaps, distribution plots, time series visualizations, and many more.
The whole project will be in a Jupyter Notebook, an open-source web application that allows users to create and share interactive documents containing code, data, and visualizations.
Please look at for more detail.

📊 Data Visualization
The first and easiest thing to do is to find the “top 10” debtors, the min and max, or the average debt per apartment, e.g.:

image.png
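A minimal sketch of how such summary numbers could be computed with pandas. The column names (`apartment`, `debt`) and all values are made up for illustration; they are not the real debt list.

```python
import pandas as pd

# Hypothetical sample; the real list has one row per apartment.
debts = pd.DataFrame(
    {
        "apartment": [1, 2, 3, 4, 5, 6],
        "debt": [0.0, 1250.5, 310.0, 0.0, 4780.25, 95.75],
    }
)

# Basic summary numbers for the whole building.
summary = debts["debt"].agg(["min", "max", "mean", "sum"])

# The N largest debtors (N = 10 in the post; 3 here because the sample is tiny).
top_debtors = debts.nlargest(3, "debt")

print(summary)
print(top_debtors)
```

`nlargest` avoids sorting the whole table when only the top of the ranking is needed.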

Also, check out the distribution of debt, for instance:

debt_hist.png
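A histogram like this one could be drawn roughly as follows. This is only a sketch with invented values and an arbitrary bin count, not the code behind the picture above.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical debts per apartment.
debts = pd.Series(
    [0.0, 0.0, 95.75, 310.0, 420.0, 1250.5, 1800.0, 4780.25], name="debt"
)

fig, ax = plt.subplots(figsize=(6, 4))
# hist returns the count per bin and the bin edges, handy for a quick check.
counts, bin_edges, _ = ax.hist(debts, bins=5, edgecolor="white")
ax.set_xlabel("Debt per apartment")
ax.set_ylabel("Number of apartments")
ax.set_title("Distribution of debt")
fig.tight_layout()
```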

BTW, the Pareto principle (the 80/20 rule) doesn’t hold here, because 80% of the debt comes from about 41% of the people.
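A figure like "80% of the debt comes from X% of the people" can be checked with a cumulative sum. The sketch below uses invented numbers and assumes that "people" means apartments with a non-zero debt; counting all apartments instead would only change the denominator.

```python
import pandas as pd

# Hypothetical debts per apartment.
debts = pd.Series([0.0, 0.0, 100.0, 200.0, 300.0, 400.0, 1500.0, 3000.0])

# Keep only apartments that actually owe something, largest first.
debtors = debts[debts > 0].sort_values(ascending=False)

# Running share of the total debt as debtors are added one by one.
cumulative_share = debtors.cumsum() / debtors.sum()

# Number of debtors needed to reach 80% of the total debt.
needed = int((cumulative_share < 0.8).sum()) + 1
share_of_debtors = needed / len(debtors)

print(f"{needed} of {len(debtors)} debtors ({share_of_debtors:.0%}) hold 80% of the debt")
```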
Let’s create a better and more advanced chart that illustrates the distribution of debt across 3 sections in the building:

image.png
The distribution of debt across 3 sections in the building

Although the chart above has lots of numbers, I decided to keep all of them. First, it shows the distribution of debt across the 3 sections. It also includes the overall picture: the total debt and the average debt per apartment in the building. In general, the numbers are shameful, but the second section is slightly better than the others.
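The numbers behind a chart like this boil down to a group-by. Here is a sketch with made-up data, assuming a hypothetical `section` column that says which section an apartment belongs to:

```python
import pandas as pd

# Hypothetical data: section number and debt per apartment.
debts = pd.DataFrame(
    {
        "section": [1, 1, 2, 2, 3, 3],
        "debt": [500.0, 1500.0, 0.0, 400.0, 2500.0, 100.0],
    }
)

# Total and average debt per section.
per_section = debts.groupby("section")["debt"].agg(total="sum", mean="mean")

# Each section's share of the building's total debt, in percent.
per_section["share_pct"] = (
    per_section["total"] / per_section["total"].sum() * 100
).round(1)

# The overall picture for the whole building.
building_total = debts["debt"].sum()
building_mean = debts["debt"].mean()

print(per_section)
```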
Additionally, I like the idea of using separate donuts in the form of a progress bar to show the distribution of debt across 3 sections:

image.png

In general, pie charts are better suited for data that has few categories (usually no more than 5) and where the differences in proportions are large enough to be easily discernible.
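One way to get the "donut as a progress bar" look in Matplotlib is a two-wedge pie with a hole in the middle. The shares and colors below are invented, and this is just one possible approach, not necessarily how the charts above were made.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import matplotlib.pyplot as plt

# Hypothetical shares of the total debt per section, in percent.
shares = {"Section 1": 40.0, "Section 2": 8.0, "Section 3": 52.0}

fig, axes = plt.subplots(1, len(shares), figsize=(9, 3))
for ax, (label, pct) in zip(axes, shares.items()):
    # Two wedges: the filled part and the remainder.
    # A wedge width below 1 leaves the hole that turns the pie into a donut.
    ax.pie(
        [pct, 100 - pct],
        colors=["tab:red", "lightgray"],
        startangle=90,
        counterclock=False,
        wedgeprops={"width": 0.3},
    )
    # Put the number itself in the middle of the ring.
    ax.text(0, 0, f"{pct:.0f}%", ha="center", va="center", fontsize=14)
    ax.set_title(label)
```

Since each donut has only two slices, it stays readable even when the sections' shares are close to each other.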
In regard to the first section, I would like to mention that there are 21 apartments that are free of debt:

image.png
21 apts. without debt in the 1st section; 20 in the 2nd; 15 in the 3rd.
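Counting debt-free apartments per section is a short pandas expression. The data below is invented, and `section` and `debt` are again hypothetical column names:

```python
import pandas as pd

# Hypothetical data: section number and debt per apartment.
debts = pd.DataFrame(
    {
        "section": [1, 1, 1, 2, 2, 3, 3],
        "debt": [0.0, 0.0, 750.0, 0.0, 120.0, 980.0, 0.0],
    }
)

# A boolean column sums to the number of True values, so "sum" counts
# debt-free apartments per section and "mean" gives their share.
debt_free = (
    debts.assign(no_debt=debts["debt"].eq(0))
    .groupby("section")["no_debt"]
    .agg(count="sum", share="mean")
)

print(debt_free)
```

The `share` column matters if the sections don't have the same number of apartments; in that case raw counts alone can be misleading.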

So, which section has more good citizens? I don’t see a winner here. I think the result of the above analysis can be illustrated by the following image:

cOI9pOV.png

Last but not least, I made the final visualization that shows the distribution of debt per floor in each section:

image.png

Additionally, it contains mean numbers for debt per apartment and per floor in each section.
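The table behind a per-floor view can be built with a pivot. Here is a sketch on made-up data, assuming hypothetical `section` and `floor` columns:

```python
import pandas as pd

# Hypothetical data: two sections, two floors, two apartments per floor.
debts = pd.DataFrame(
    {
        "section": [1, 1, 1, 1, 2, 2, 2, 2],
        "floor": [1, 1, 2, 2, 1, 1, 2, 2],
        "debt": [0.0, 600.0, 200.0, 400.0, 100.0, 0.0, 0.0, 900.0],
    }
)

# Rows are floors, columns are sections, cells hold the total debt.
by_floor = debts.pivot_table(
    index="floor", columns="section", values="debt", aggfunc="sum"
)

# Mean debt per floor and per apartment in each section.
mean_per_floor = by_floor.mean()
mean_per_apartment = debts.groupby("section")["debt"].mean()

# Top floor first, so the table reads like the building itself.
print(by_floor.sort_index(ascending=False))
```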
One of the advantages of using Python for data visualization is the flexibility and customization it offers. For instance, I can apply gradients to represent changes in debt values, making it easier to spot big debtors:

image.png
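A gradient of this kind can be produced by mapping each value to a color from a colormap. The sketch below applies the idea to a simple bar chart with invented numbers; the `Reds` colormap is an arbitrary choice, and the chart above may have been built differently.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; not needed inside a notebook
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

# Hypothetical apartments and their debts.
apartments = ["1", "2", "3", "4", "5"]
debt = [0.0, 310.0, 1250.5, 95.75, 4780.25]

# Scale debts to the 0..1 range and look up a color for each one:
# the bigger the debt, the darker the red.
norm = Normalize(vmin=min(debt), vmax=max(debt))
cmap = plt.get_cmap("Reds")
colors = [cmap(norm(value)) for value in debt]

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(apartments, debt, color=colors, edgecolor="gray")
ax.set_xlabel("Apartment")
ax.set_ylabel("Debt")
fig.tight_layout()
```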
Final Word
People have to pay their utility bills. It's an essential payment. Please don't be like my neighbors.