Check for valid or invalid email addresses with egrep

If you want to check whether a file, such as a CSV file with email addresses for a newsletter, contains invalidly formatted email addresses, you can use this egrep command.

egrep "\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}" email.csv -vn

The input file email.csv contains all the email addresses.
The output is a list of all invalidly formatted email addresses, including their line numbers in the CSV file (the -n option). The -v option inverts the match, so only lines that do not match the pattern are printed.
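
To see what -v and -n do in practice, here is a minimal sketch on a throwaway sample file. The addresses and the temporary file are made up for illustration. It uses grep -E, which is equivalent to egrep but avoids the obsolescence warning that recent GNU grep versions print for egrep, and it assumes a grep that supports \w (GNU grep does).

```shell
# Build a small hypothetical sample file (these addresses are invented).
tmpfile=$(mktemp)
printf '%s\n' \
  'alice@example.com' \
  'bob.example.com' \
  'carol@example' \
  'dave_smith@mail.example.org' > "$tmpfile"

# -v prints only the lines that do NOT match; -n prefixes each with its line number.
invalid=$(grep -E -vn '\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}' "$tmpfile")
echo "$invalid"

rm -f "$tmpfile"
```

With this sample, lines 2 and 3 are reported: one has no @ sign and the other has no dot followed by a top-level part.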

This command generates a new file containing only the correctly formatted email addresses:

egrep "\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}" email.csv >correct_formatted_emails.csv
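
A quick sanity check after splitting a list this way is to confirm that the matching and non-matching lines add up to the original line count. The sketch below is only an illustration: the working directory, the file names valid.csv and invalid.csv, and the sample addresses are all made up, and grep -E stands in for egrep.

```shell
# Hypothetical working directory and sample data.
workdir=$(mktemp -d)
printf '%s\n' 'alice@example.com' 'bob.example.com' 'carol@example.net' > "$workdir/email.csv"

pattern='\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}'

# Matching lines go to one file, non-matching lines (-v) to another.
grep -E "$pattern" "$workdir/email.csv" > "$workdir/valid.csv"
grep -E -v "$pattern" "$workdir/email.csv" > "$workdir/invalid.csv"

# $(( ... )) strips the padding that some wc implementations add.
total=$(( $(wc -l < "$workdir/email.csv") ))
valid_count=$(( $(wc -l < "$workdir/valid.csv") ))
invalid_count=$(( $(wc -l < "$workdir/invalid.csv") ))
echo "total=$total valid=$valid_count invalid=$invalid_count"

rm -rf "$workdir"
```

If the two counts do not sum to the total, something other than the pattern is interfering, for example a file without a trailing newline.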

This one is even better!!

Later I found this regex for checking email addresses, thanks to Bilou MCgyver.


Here is a command-line usage for scanning a CSV file with one email address per line.

egrep "^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$" emailaddresses_to_check.csv -vn
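
One reason this pattern is stricter is that it is anchored with ^ and $, so the whole line has to be an address, whereas the first pattern matches as soon as any part of the line looks like one. The sketch below compares the two on a few invented lines. The variable names loose and strict and the sample data are assumptions for illustration, and grep -E stands in for egrep.

```shell
# Hypothetical sample lines, made up for this comparison.
sample=$(mktemp)
printf '%s\n' \
  'alice@example.com' \
  'not an email alice@example.com !!' \
  'bob@example.c' > "$sample"

# First pattern from this post: unanchored, so it matches anywhere in the line.
loose='\w+([._-]\w)*@\w+([._-]\w)*\.\w{2,4}'
# Second pattern from this post: anchored to the start (^) and end ($) of the line.
strict='^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$'

loose_invalid=$(grep -E -vn "$loose" "$sample")
strict_invalid=$(grep -E -vn "$strict" "$sample")

echo "Flagged by the unanchored pattern:"
echo "$loose_invalid"
echo "Flagged by the anchored pattern:"
echo "$strict_invalid"

rm -f "$sample"
```

On this sample the unanchored pattern only flags line 3, while the anchored one also flags line 2, where a valid address is surrounded by other text.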

Edit 20141210
I was just reading Wikipedia and noticed that the local part may contain these special characters: ! # $ % & ' * + - / = ? ^ _ ` { | } ~ It is unusual to use these special characters in an email address, but it happens, and I found one in my list. So here is the new regex for validating email addresses.

egrep "^[A-Za-z0-9](([&_\~\|\{\}\`\^\?\=\/\+\*\'\%\$\#\!\^\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$" emailaddresses_to_check.csv -vn
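
To illustrate the difference this makes, the sketch below runs the previous pattern and the extended one against a few invented addresses. The variable names old and new and the sample data are assumptions for illustration, and grep -E stands in for egrep. The extended pattern is kept in double quotes, as in the command above, because it contains a single quote; inside double quotes the shell turns \` and \$ into a plain backtick and dollar sign before grep sees them.

```shell
# Hypothetical sample addresses, made up for this comparison.
sample=$(mktemp)
printf '%s\n' \
  "o'brien+news@example.com" \
  'jane.doe@example.com' \
  'bad address@example.com' > "$sample"

# Previous pattern: only _ . - are allowed between alphanumeric runs in the local part.
old='^[A-Za-z0-9](([_\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$'
# Extended pattern from the edit above, copied as-is.
new="^[A-Za-z0-9](([&_\~\|\{\}\`\^\?\=\/\+\*\'\%\$\#\!\^\.\-]?[a-zA-Z0-9]+)*)@([A-Za-z0-9]+)(([\.\-]?[a-zA-Z0-9]+)*)\.([A-Za-z]{2,})$"

old_invalid=$(grep -E -vn "$old" "$sample")
new_invalid=$(grep -E -vn "$new" "$sample")

echo "Flagged by the previous pattern:"
echo "$old_invalid"
echo "Flagged by the extended pattern:"
echo "$new_invalid"

rm -f "$sample"
```

The previous pattern flags line 1 because of the apostrophe and plus sign, while the extended one accepts it. Both patterns still require the local part to start with a letter or digit and allow only one special character between alphanumeric runs, so an address such as a++b@example.com would be flagged by either.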
