Skip to content

Data cleaning i Excel

1) Get Rid of Extra Spaces:

Extra Spaces are difficult to spot & correct. Multiple spaces may be easy, but trailing spaces are pretty tough. Trailing spaces are blank spaces at the end of the statement or word which are not followed by any other character.
Here’s an easy way to spot & eliminate such errors:

Syntax: TRIM(text)

Steps:
Consider data with four cells with different spacing errors.
Now select a column & type “TRIM(
Now select the cell you want to correct (in matters of spaces).
The cell will be corrected. If there are other erroneous cells sequentially aligned, drag the fixed cell till the point, you want to check & correct.
This easy step can save you time!

2) Select & Treat all blank cells:

Blank cells are troublesome because they often create errors while creating reports. And, people usually want to replace such cells with 0, Not Available or something like that. But replacing each cell manually on a large data table would take hours. Luckily, there’s an easy way to tackle this problem.
Steps:
Select the entire Data (you want to treat)
Press F5 (on keyboard)
A dialogue box will appear > Select “Special
Select “Blanks” & click “OK
Now, all blank cells will be highlighted in pale grey color, out of which one cell would be white with a different border. That’s the active cell, type the statement you want to replace in blank cells.
Hit “Ctrl+Enter

NOTE:
At the last step, if “Enter” only is pressed, then the value will be inserted only in the active cell. So remember to press “Ctrl+Enter.”

3) Convert Numbers Stored as Text into Numbers:

When we import data from files, other sources, databases, text, etc. During transit, data might get affected. Also, some have a habit of using an apostrophe before numerical values, which is considered as text in Excel. Such minor data conversion can drastically affect calculations.
Suppose there are three values “70, ’70, 80”. When we compare 70 and 80 (70<80), the result is “TRUE.” But when we compare “apostrophe 70 & 80” (‘70<80), the problem starts. Here the result will be FALSE as the text will be rated higher than any number. To eliminate such errors, here’s a trick.
Steps:
Select any blank cell & type 1
Select that cell & hit “Ctrl+C
Now select your data set & go to Paste > Paste Special
In Paste Special, select “Multiply” option in the “Operation” category
Click “OK

Here it multiples every single value to “1”. And anything multiplied by 1 is the same number. But this trick also takes care of the apostrophe numerical.

4) Remove Duplicates:

Elimination of duplicate data is necessary for the creation of unique data & less usage of storage. In duplication, you can either highlight it or delete it.
A) Highlight Duplicates:
Select the data & go to Home > Conditional Formatting > Highlight Cell Rules > Duplicate Values
A dialogue box will appear (Duplicate Values), Select Duplicate & formatting color
Press OK
All duplicate values will be highlighted!

B) Delete Duplicates:
Select the data & go to DATA > Remove Duplicates
A dialogue box will appear (Remove Duplicates), tick columns whose duplicates need to be found.
Remember to have a click on “My data has headers” (if your Data has headers) or else column heads will be considered as data & duplication search will be applied on it too.
Click OK!

Duplicate values will be removed! Suppose you select 4 of 4 columns. Then that four columns rows should also match or else; they won’t be considered duplicate.

5) Highlight Errors:

While creating reports or dashboards, you might face a few arithmetical errors (like divisional errors). Such errors are easy to spot if the Data is small. But for big data, it’s complicated. So to get rid of such mistakes, you can go for two ways: Conditional Formatting or Go to Special.
A) Using Conditional Formatting:
Want to print your doc?
This is not the way.
Try clicking the ··· in the right corner or using a keyboard shortcut (
CtrlP
) instead.