Solved Email validation problem

Hello,

I have been using the script below for some time now, and until now I thought it was fine.
Yesterday, however, I had to type an email address that ended with co.uk, and the script could not get out of the email validation loop.

Could someone please help me find a way around it?
I have seen examples using regular expressions, but I am struggling to understand how they work and to learn them.

Code:
#!/bin/sh
#
# Set email variables
email=""
confirm_email=""

while true; do
  read -p "Enter wp admin email : " email
  read -p "Confirm email  : " confirm_email
  ## Do they match?
  if [ "$email" != "$confirm_email" ]; then
    echo "These email addresses don't match. Try again?"
    continue
  fi
  ## Is it a valid email address?
  echo "$email" | grep '^[a-zA-Z0-9]*@[a-zA-Z0-9]*\.[a-zA-Z0-9]*$' >/dev/null 2>&1
  if [ $? -ne 0 ]; then
    echo "Email address isn't valid. Try again?"
    continue
  fi
  break;
done
 
Start learning Regular Expressions or avoid them for ever.
Yes, it's on my list of things to learn.
I saw a friend shave up to 10 lines of code by using them... It really got me thinking.
But I do find them difficult to get to grips with...
 
Use host(1).
Code:
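  OIF=$IFS
  IFS="@"                # split on @ only
  set -- $email          # $1 = name, $2 = domain (if present)
  IFS=$OIF
  case $# in
  1) ;;                  # bare name, nothing to look up
  2) host -- "$2" || continue ;;   # domain must resolve
  *) echo "must be name or name@domain"; continue ;;
  esac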

Juha
 
That's a clever use of IFS and host(1), but it will reject host names that do not resolve at the time the script runs because of a transient failure, name server maintenance for example. You want to be able to accept mail for domains that are broken at the time the mail is taken for delivery but will be fixed very soon.
 
Valid point. I got the impression this is to edit WordPress configuration, not to deliver mail, but the point still stands.

It does not want to play ball:
Code:
hopo $ host a.com
Host a.com not found: 3(NXDOMAIN)
hopo $ echo $?
1
hopo $ host .com
host: '.com' is not a legal name (Empty label)
hopo $ echo $?  
1

Juha
 
Code:
  ## Is it a valid email address?
  echo "$email" | grep '^[a-zA-Z0-9]*@[a-zA-Z0-9]*\.[a-zA-Z0-9]*$' >/dev/null 2>&1
The regexp assumes valid domains have only a single dot in them, e.g. example.com. It also accepts email addresses like test@.com (empty domain label) and @example.com (no user part), which is obviously wrong. It even accepts an email address like @. which is very, very wrong.

Code:
echo "$email" | grep -E '^[a-zA-Z0-9]+@([a-zA-Z0-9]+\.)+[a-zA-Z0-9]+$' >/dev/null 2>&1
This should work better (the -E flag is needed because + and grouping parentheses are extended regexp syntax). It will still allow some blatantly wrong email addresses though.
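
To see the difference between the two patterns, here is a minimal sketch that runs both against a handful of addresses. The function names old_check and new_check are made up for this example and are not part of the original script; the patterns are the ones quoted in this thread.

```shell
#!/bin/sh
# Sketch: compare the original pattern with the stricter one.
# old_check and new_check are hypothetical names used only for this example.

old_check() {
    # Original basic-regexp pattern: every part may be empty, one dot only.
    printf '%s\n' "$1" | grep -q '^[a-zA-Z0-9]*@[a-zA-Z0-9]*\.[a-zA-Z0-9]*$'
}

new_check() {
    # Stricter pattern; -E enables + and ( ) grouping.
    printf '%s\n' "$1" | grep -Eq '^[a-zA-Z0-9]+@([a-zA-Z0-9]+\.)+[a-zA-Z0-9]+$'
}

for addr in user@example.com user@example.co.uk test@.com @example.com @.; do
    old=reject; new=reject
    old_check "$addr" && old=accept
    new_check "$addr" && new=accept
    printf '%-20s old=%s new=%s\n' "$addr" "$old" "$new"
done
```

With these inputs the original pattern rejects user@example.co.uk and accepts the three malformed addresses, while the stricter pattern does the opposite. Neither accepts a local part such as user.name, so this is still only a rough filter.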
 
You can't validate an email address using a regular expression because it doesn't follow a regular grammar. This already holds for the domain part, see the spec in BNF (which can describe any context-free grammar, not just a regular one):
Code:
<domain> ::= <subdomain> | " "
<subdomain> ::= <label> | <subdomain> "." <label>
<label> ::= <letter> [ [ <ldh-str> ] <let-dig> ]
<ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str>
<let-dig-hyp> ::= <let-dig> | "-"
<let-dig> ::= <letter> | <digit>
<letter> ::= any one of the 52 alphabetic characters A through Z in
upper case and a through z in lower case
<digit> ::= any one of the ten digits 0 through 9

This is a common problem: people often attempt to verify something using an RE that's just not regular, and the solutions keep growing in complexity while they still have subtle bugs.

IMHO, the best approach for verifying email addresses: just don't. Attempt to send a mail and see if it works.

The second best approach would be, as already mentioned, not to reinvent the wheel but to use something that's already there. In code, it's at least possible to write a "perfect" syntax checker.
 
Is there any special reason not to allow users like user.name@, user-name@, user_name@ or user+name@? I can understand that you may not have the time, mood, or whatever to dig into regular expressions; however, it is hard to give more help than is already available one search away, for a topic which has been beaten to death so many times before.
 
I agree, don't validate email addresses on input. Let the mail handling utilities and MTAs figure out whether the address is valid. There is so much complexity behind an apparently simple system that no pre-validation can cover all the possible cases of valid email addresses. To elaborate on what ondra_knezour said, email addresses can contain special marker characters (that are configurable!) that form addresses with special suffixes (tags), such as me+foo@mydomain.tld. That is equivalent to me@mydomain.tld in terms of email transport but can be handled specially by the delivery agent.
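
If some check is still wanted in the script to catch obvious typos, a deliberately permissive one is probably the safer direction. The following is only a sketch under stated assumptions: it requires exactly one @, a non-empty part on each side, a dot somewhere in the domain, and no whitespace. It makes no attempt to follow the mail RFCs (for example, it rejects quoted local parts containing @), and looks_like_email is a hypothetical name.

```shell
#!/bin/sh
# Sketch of a permissive sanity check; not a real validator.
# looks_like_email is a hypothetical name used only for this example.
looks_like_email() {
    case $1 in
        *[[:space:]]*) return 1 ;;  # whitespace anywhere
        *@*@*)         return 1 ;;  # more than one @ (assumption: no quoted local parts)
        @*|*@)         return 1 ;;  # empty local part or empty domain
        *@.*)          return 1 ;;  # domain starts with a dot
        *@*.*)         return 0 ;;  # one @ with a dot somewhere after it
        *)             return 1 ;;  # no @ at all, or no dot in the domain
    esac
}

for addr in user.name@example.co.uk me+foo@mydomain.tld test@.com @example.com; do
    if looks_like_email "$addr"; then
        echo "plausible:   $addr"
    else
        echo "implausible: $addr"
    fi
done
```

Tagged addresses such as me+foo@mydomain.tld pass, and anything that passes still has to be proven by actually sending mail to it.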
 
Go and buy Jeffrey Friedl's Mastering Regular Expressions, the owl book. Yes, a physical paper copy of it. Used copies can be had for under $10.

This is probably the best O'Reilly book I have ever had. It takes a complicated subject and makes it amazingly clear and surprisingly fun to read.

In the back of this book is a regular expression for validating email addresses. It is 6,598 bytes long. It is not meant to really be used, just an example of how hard it is to really validate an email address.
 
In the back of this book is a regular expression for validating email addresses. It is 6,598 bytes long. It is not meant to really be used, just an example of how hard it is to really validate an email address.
This is either only partially correct (in that it doesn't reject correct email addresses, but accepts a small number of wrong ones) or it uses RE extensions like look-ahead that allow matching a BIT more than strictly regular grammars... or both ;)
 
You might run host ${email#*@} in the script before using the input data, just as a courtesy, without testing or acting on its success or failure in any way.
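
A minimal sketch of that idea, assuming host(1) is installed: the lookup result is shown to the user as a hint and never used to reject the input. The names email_domain and warn_if_unresolvable are hypothetical.

```shell
#!/bin/sh
# Sketch only: email_domain and warn_if_unresolvable are hypothetical names.

email_domain() {
    # Strip everything up to and including the first @.
    # An argument without @ is returned unchanged.
    printf '%s\n' "${1#*@}"
}

warn_if_unresolvable() {
    domain=$(email_domain "$1")
    # Advisory lookup: a failure may be transient, so only print a note.
    if ! host -- "$domain" >/dev/null 2>&1; then
        echo "Note: could not resolve '$domain' right now; continuing anyway." >&2
    fi
    return 0
}

# Possible use inside the input loop, after the two entries match:
#   warn_if_unresolvable "$email"
```

Because the function always returns 0, a name server outage at the time the script runs cannot trap the user in the loop.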

Juha
 