How to Clean Data in Excel: The Five Problems and Every Fix

Cleaning data in Excel means fixing five things: extra whitespace, inconsistent capitalization, mixed phone and date formats, duplicates, and values stored as the wrong type. You can fix all five with built-in Excel functions, and this guide shows each one. Alternatively, upload your CSV to our free cleaning tool and review the changes in one pass.

The five problems hiding in most spreadsheets

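Almost every messy spreadsheet, whether it came out of QuickBooks, a CRM export, or years of hand entry, breaks in the same five ways: extra whitespace, inconsistent capitalization, mixed phone and date formats, duplicates, and values stored as the wrong type.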

Fixing each problem with Excel functions

1. Whitespace: TRIM and CLEAN

=TRIM(A2) removes leading and trailing spaces and collapses repeated spaces between words. =CLEAN(A2) strips non-printing characters that often ride along in exports. Combine them: =CLEAN(TRIM(A2)). Put the formula in a helper column, fill down, then paste back as values.
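If you ever script the same cleanup outside Excel, the whitespace step can be sketched in Python. This is only an approximation of what TRIM and CLEAN do, not a reimplementation of Excel, and the function name `clean_trim` is made up for illustration. It also converts non-breaking spaces, which TRIM leaves untouched.

```python
import re


def clean_trim(value: str) -> str:
    """Approximate =CLEAN(TRIM(A2)), plus non-breaking-space handling."""
    # Turn non-breaking spaces (U+00A0) into ordinary spaces first.
    text = value.replace("\u00a0", " ")
    # Like CLEAN: drop ASCII control characters (codes 0-31).
    text = re.sub(r"[\x00-\x1f]", "", text)
    # Like TRIM: strip spaces at both ends, collapse inner runs to one.
    return re.sub(r" {2,}", " ", text.strip(" "))


if __name__ == "__main__":
    print(repr(clean_trim("  Acme\u00a0 Corp \t\n")))
```

The control characters are removed before trimming here so that a stray tab at the end of a cell cannot shield trailing spaces from the trim step.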

2. Capitalization: PROPER, UPPER, LOWER

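=PROPER(A2) title-cases names. Watch its known weakness: it capitalizes the first letter after every non-letter character and lowercases everything else. That gets “O’Neil” right, but it produces “Mcdonald” instead of “McDonald” and turns a possessive like “john’s” into “John’S”. =LOWER(A2) is the right call for email addresses, which are case-insensitive in practice.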
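Python's built-in `str.title()` follows a very similar rule (capitalize the first letter of each run of letters, lowercase the rest), so it is a quick way to see which names survive rule-based title-casing and which need a manual pass. The helper names below are hypothetical, and the behavior is Python's, which may differ from Excel in edge cases.

```python
def title_case_name(value: str) -> str:
    """Title-case a name with the same blind spots as rule-based casing."""
    return value.strip().title()


def normalize_email(value: str) -> str:
    """Lowercase an email address and drop surrounding whitespace."""
    return value.strip().lower()


if __name__ == "__main__":
    for name in ["anna-marie SMITH", "mcdonald", "john's"]:
        print(name, "->", title_case_name(name))
    print(normalize_email("  Jane.Doe@Example.COM "))
```

Names with internal capitals (Mc, Mac, de la) and possessives are the ones to review by hand after a pass like this.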

3. Phone and date formats: harder than they look

Excel has no built-in phone standardizer. The usual approach is stripping non-digits with nested SUBSTITUTE calls, then reformatting with TEXT:

=TEXT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2," ",""),"-",""),"(",""),")",""),".",""),"(000) 000-0000")

That handles the common cases and silently mangles anything with a country code or extension. Dates are worse: a column mixing real date serials with text dates (“Jan 3 2024”) cannot be fixed with one formula. This category is where a purpose-built tool earns its keep; our cleaning rules handle the format variants and show you a before/after preview instead of a formula error.
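To make the edge cases concrete, here is a minimal Python sketch of both jobs. It is an illustration under stated assumptions, not a description of how any particular tool works: the function names are hypothetical, the phone logic assumes US numbers only, and `DATE_FORMATS` is an assumed list of layouts you would extend for your own data. Unlike the formula, it returns `None` for anything it cannot handle instead of producing a plausible-looking wrong value.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed text-date layouts; month names rely on an English locale.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%b %d %Y", "%B %d, %Y")


def format_us_phone(raw: str) -> Optional[str]:
    """Return '(555) 123-4567' for a 10-digit US number, else None."""
    # Leave numbers with an extension marker for manual review.
    if re.search(r"ext|x", raw, flags=re.IGNORECASE):
        return None
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading US country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def parse_text_date(raw: str) -> Optional[str]:
    """Try each known layout; return YYYY-MM-DD or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


if __name__ == "__main__":
    print(format_us_phone("555.123.4567"))
    print(format_us_phone("+44 20 7946 0958"))
    print(parse_text_date("Jan 3 2024"))
```

Note that `%m/%d/%Y` bakes in a month-first reading of dates like 01/03/2024; a file that uses day-first dates needs a different format list.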

4. Duplicates: Remove Duplicates, carefully

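Data > Remove Duplicates gives reliable results only AFTER whitespace and formats are fixed. It ignores letter case, but it treats “Acme” and “Acme ” (with a trailing space) as different values, and the same goes for two spellings of one phone number or date. Run it last, never first, and always on a copy.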
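The "normalize first, then compare" idea can be sketched in Python as well. This is a minimal illustration, assuming rows are dictionaries (for example from `csv.DictReader`); the function name `dedupe_rows` and the column names in the example are made up. It compares a normalized key but keeps the first original row untouched.

```python
from typing import Dict, Iterable, List, Sequence


def dedupe_rows(rows: Iterable[Dict[str, str]],
                key_columns: Sequence[str]) -> List[Dict[str, str]]:
    """Keep the first row for each normalized key, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize only for comparison: collapse whitespace, ignore case.
        key = tuple(" ".join(str(row[col]).split()).casefold()
                    for col in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


if __name__ == "__main__":
    rows = [
        {"name": "Acme Corp", "email": "info@acme.example"},
        {"name": " acme  corp ", "email": "INFO@acme.example "},
        {"name": "Beta LLC", "email": "hi@beta.example"},
    ]
    for row in dedupe_rows(rows, ["name", "email"]):
        print(row)
```

Choosing the key columns is the real decision: deduplicating on email alone is stricter than on name plus email, and which is right depends on the data.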

5. Numbers stored as text

Select the column, look for the green triangle warnings, and use Data > Text to Columns with no delimiters to force re-evaluation. For ZIP codes, format the column as Text BEFORE pasting data, or restore lost leading zeros with =TEXT(A2,"00000").
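The same two repairs, restoring leading zeros and turning numeric text into real numbers, look like this in a short Python sketch. The function names are hypothetical, and the ZIP helper assumes plain 5-digit US ZIP codes (no ZIP+4).

```python
from typing import Union


def restore_zip(value: Union[int, str]) -> str:
    """Left-pad a 5-digit US ZIP code that lost its leading zeros."""
    digits = str(value).strip()
    if not digits.isdigit() or len(digits) > 5:
        raise ValueError(f"not a 5-digit ZIP: {value!r}")
    return digits.zfill(5)


def to_number(value: str) -> float:
    """Convert numeric text such as ' 1,234.50 ' to a float."""
    # Assumes a comma thousands separator and a period decimal point.
    return float(value.strip().replace(",", ""))


if __name__ == "__main__":
    print(restore_zip(2134))
    print(to_number(" 1,234.50 "))
```

Raising an error on anything that is not a short run of digits keeps a ZIP+4 or a stray phone number from being silently padded.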

Excel functions vs. a cleaning tool

| Task | In Excel | With the free tool |
| --- | --- | --- |
| Trim whitespace | =CLEAN(TRIM(A2)) + paste values | Automatic |
| Title-case names | =PROPER(A2), breaks on McDonald and possessives | Automatic, apostrophes handled |
| Lowercase emails | =LOWER(A2) | Automatic |
| Standardize phones | Nested SUBSTITUTE formula | Automatic for US formats |
| Mixed date formats | No single-formula fix | Normalized to YYYY-MM-DD |
| Preview before applying | Manual side-by-side columns | Built-in diff with confidence score |

The honest version of this comparison: if your file is one column of names, Excel is faster. If it is a real export with phones, emails, and dates all messy at once, the formulas above become a 30-minute project with edge cases. The free tool handles CSV files up to 2MB or 2,500 rows, shows every proposed change before applying it, and exports a clean CSV. No account needed.

A safe cleaning workflow, in order

  1. Save a copy of the original file. Always.
  2. Fix whitespace first (TRIM/CLEAN). Everything downstream depends on it.
  3. Normalize capitalization (PROPER for names, LOWER for emails).
  4. Standardize formats (phones, dates, ZIPs).
  5. Remove duplicates last, once values are comparable.
  6. Spot-check 10 random rows against the original before you trust the result.

FAQ

Does TRIM remove all whitespace?

It removes leading and trailing spaces and collapses internal runs of spaces to one. It does not remove non-breaking spaces (CHAR(160)) that web exports often contain; replace those with ordinary spaces first, then trim: =TRIM(SUBSTITUTE(A2,CHAR(160)," ")).

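Why does PROPER break names like McDonald?

PROPER capitalizes the first letter after any non-letter character and lowercases the rest. That rule handles O’NEIL correctly (it becomes O’Neil), but MCDONALD becomes Mcdonald because there is nothing to mark the second capital, and a possessive such as JOHN’S becomes John’S. You have to fix those by hand, or use a tool with dedicated name rules.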

Is there a free tool that does all of this at once?

Yes. Our data cleaning tool is free for CSV files up to 2MB or 2,500 rows, requires no signup, and shows a before/after preview of every change. The rules it applies are documented on the How Cleaning Works page.
