Cleaning data in Excel means fixing five things: extra whitespace, inconsistent capitalization, mixed phone and date formats, duplicates, and values stored as the wrong type. You can fix all five with built-in Excel functions, and this guide shows each one, or you can upload your CSV to our free cleaning tool and review the changes in one pass.
The five problems hiding in most spreadsheets
Almost every messy spreadsheet, whether it came out of QuickBooks, a CRM export, or years of hand entry, breaks in the same five ways:
- Stray whitespace. Leading spaces, trailing spaces, and doubled spaces inside values. Invisible, but “ John Smith” and “John Smith” are different values to Excel, so lookups and de-duplication fail.
- Inconsistent capitalization. JOHN SMITH, john smith, and John Smith in the same column.
- Mixed formats. Phone numbers as 555.123.4567, (555) 123-4567, and 5551234567; dates as 12/25/2023 and Dec 25, 2023.
- Duplicates. The same record entered twice, often hidden by the whitespace and capitalization problems above.
- Numbers stored as text. ZIP codes and IDs that lost leading zeros, or amounts that will not sum because Excel sees text.
Fixing each problem with Excel functions
1. Whitespace: TRIM and CLEAN
=TRIM(A2) removes leading and trailing spaces and collapses repeated spaces between words. =CLEAN(A2) strips non-printing characters that often ride along in exports. Combine them: =CLEAN(TRIM(A2)). Put the formula in a helper column, fill down, then paste back as values.
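A minimal sketch of a single helper formula that also covers non-breaking spaces, assuming the raw value sits in A2. CHAR(160) is the non-breaking space that web exports often contain; it is swapped for a regular space first so that TRIM can remove it:

```
=TRIM(CLEAN(SUBSTITUTE(A2,CHAR(160)," ")))
```

CLEAN runs before TRIM here so that any double space left behind by a removed control character still gets collapsed.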
2. Capitalization: PROPER, UPPER, LOWER
=PROPER(A2) title-cases names. Watch its known weakness: it capitalizes the first letter after every non-letter character and lowercases everything else, so “McDonald” comes out as “Mcdonald” and a possessive like “Smith’s” comes out as “Smith’S”. =LOWER(A2) is the right call for email addresses, which are case-insensitive in practice.
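One way to patch the “Mc” case is sketched below. It assumes A2 holds a single, already-trimmed surname; it is an illustration, not a general name parser, and it will not catch “Mac” prefixes or names in the middle of a longer string:

```
=IF(LEFT(PROPER(A2),2)="Mc","Mc"&UPPER(MID(PROPER(A2),3,1))&MID(PROPER(A2),4,LEN(A2)),PROPER(A2))
```

For MCDONALD this returns McDonald. Any value that does not start with “Mc” falls through to plain PROPER.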
3. Phone and date formats: harder than they look
Excel has no built-in phone standardizer. The usual approach is stripping non-digits with nested SUBSTITUTE calls, then reformatting with TEXT:
=TEXT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"-",""),"(",""),")",""),".","")," ",""),"(000) 000-0000")
That handles the common cases and silently mangles anything with a country code or extension. Dates are worse: a column mixing real date serials with text dates (“Jan 3 2024”) cannot be fixed with one formula. This category is where a purpose-built tool earns its keep; our cleaning rules handle the format variants and show you a before/after preview instead of a formula error.
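A guarded variant can at least flag the rows it cannot handle instead of mangling them. The two formulas below are a sketch under stated assumptions: LET needs Excel 2021 or Microsoft 365, the "yyyy-mm-dd" format code and DATEVALUE parsing depend on an English-locale setup, and the "CHECK: " prefix is an arbitrary marker chosen for this example. The first strips punctuation and spaces and reformats only values that end up as exactly 10 digits. The second converts real date serials and any text date Excel can parse, and flags the rest for manual review:

```
=LET(d,SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2,"-",""),"(",""),")",""),".","")," ",""),IF(AND(LEN(d)=10,ISNUMBER(--d)),TEXT(--d,"(000) 000-0000"),"CHECK: "&A2))

=IF(ISNUMBER(A2),TEXT(A2,"yyyy-mm-dd"),IFERROR(TEXT(DATEVALUE(A2),"yyyy-mm-dd"),"CHECK: "&A2))
```

Filter the helper column on “CHECK” to see the country codes, extensions, and unparseable dates that still need attention.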
4. Duplicates: Remove Duplicates, carefully
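Data > Remove Duplicates gives reliable results only after the values are normalized. It ignores letter case when comparing, but a stray space or a differently formatted phone number is enough to make two copies of a record look distinct. Run it last, never first, and always on a copy.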
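To see what would be removed before deleting anything, a helper column can flag repeats. This sketch assumes the cleaned key (an email address, say) is in column A starting at A2; the expanding range marks the second and later occurrences and leaves the first one unflagged:

```
=IF(COUNTIF($A$2:A2,A2)>1,"duplicate","")
```

COUNTIF ignores letter case and treats characters such as * and ? as wildcards, so treat the flags as candidates to review rather than a final verdict.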
5. Numbers stored as text
Select the column, look for the green triangle warnings, and use Data > Text to Columns with no delimiters to force re-evaluation. For ZIP codes, format the column as Text BEFORE pasting data, or restore lost leading zeros with =TEXT(A2,"00000").
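For amounts, a helper formula is an alternative to Text to Columns. This is a sketch assuming the text-formatted amount is in A2 and the regional settings match the data (for example, a period as the decimal separator); values that cannot be converted are passed through unchanged:

```
=IFERROR(VALUE(TRIM(A2)),A2)
```

Do not run this on ZIP codes or IDs, since converting them to numbers is exactly what strips the leading zeros.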
Excel functions vs. a cleaning tool
| Task | In Excel | With the free tool |
|---|---|---|
| Trim whitespace | =CLEAN(TRIM(A2)) + paste values | Automatic |
| Title-case names | =PROPER(A2), breaks on McDonald and possessives | Automatic, apostrophes handled |
| Lowercase emails | =LOWER(A2) | Automatic |
| Standardize phones | Nested SUBSTITUTE formula | Automatic for US formats |
| Mixed date formats | No single-formula fix | Normalized to YYYY-MM-DD |
| Preview before applying | Manual side-by-side columns | Built-in diff with confidence score |
The honest version of this comparison: if your file is one column of names, Excel is faster. If it is a real export with phones, emails, and dates all messy at once, the formulas above become a 30-minute project with edge cases. The free tool handles CSV files up to 2MB or 2,500 rows, shows every proposed change before applying it, and exports a clean CSV. No account needed.
A safe cleaning workflow, in order
- Save a copy of the original file. Always.
- Fix whitespace first (TRIM/CLEAN). Everything downstream depends on it.
- Normalize capitalization (PROPER for names, LOWER for emails).
- Standardize formats (phones, dates, ZIPs).
- Remove duplicates last, once values are comparable.
- Spot-check 10 random rows against the original before you trust the result.
FAQ
Does TRIM remove all whitespace?
It removes leading and trailing spaces and collapses internal runs of spaces to one. It does not remove non-breaking spaces (CHAR(160)) that web exports often contain; use SUBSTITUTE(A2,CHAR(160)," ") first, then TRIM.
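Why does PROPER break names like McDonald?
PROPER capitalizes the first letter after every non-letter character and lowercases the rest, so MCDONALD becomes Mcdonald and SMITH’S becomes Smith’S. A name like O’NEIL typically comes out right as O’Neil, because the apostrophe counts as a non-letter. You have to fix the others by hand, or use a tool whose name rules handle prefixes, apostrophes, and hyphens.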
Is there a free tool that does all of this at once?
Yes. Our data cleaning tool is free for CSV files up to 2MB or 2,500 rows, requires no signup, and shows a before/after preview of every change. The rules it applies are documented on the How Cleaning Works page.