Regular expression notes

Email Address 1

  • It is liberal within legal syntax. If you look at RFC 2822, there are a lot more legal characters than you might expect. This regex allows them in the local part, regardless of whether they are common. Likewise, the regex passes just about any legal domain name, regardless of whether it's likely. The general philosophy is to let most legal stuff slide through the client-side validation and let the server-side validations check for realism.
  • It is strict when syntax is illegal. Dash characters ("-"), for example, are not allowed at either end of a domain label. Also, domains cannot contain underscores ("_"), at least not domains that conform to the traditional host name rules, which domains in email addresses are expected to follow these days.
  • It requires at least a second-level domain and conformance to host name standards (RFC 952 and RFC 1123).
  • It allows only IPv4 domain literals.
  • It does not allow display names, quoted literals, comments, or whitespace. It basically allows a limited version of the addr-spec form described in RFC 2822.
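
The regex itself is not reproduced in these notes. As a rough illustration of the rules listed above, here is a minimal Python sketch, not the original expression. The names ATEXT, LOCAL, LABEL, DOMAIN, OCTET, IPV4_LITERAL, EMAIL_RE and is_email are hypothetical, and the local part is assumed to follow the dot-atom form from RFC 2822.

```python
import re

# Assumption: the local part is an RFC 2822 dot-atom (atext runs joined by dots).
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
LOCAL = rf"{ATEXT}+(?:\.{ATEXT}+)*"

# A domain label: letters and digits, with dashes allowed only in the interior.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# At least two labels, so a second-level domain is required.
DOMAIN = rf"{LABEL}(?:\.{LABEL})+"

# IPv4 domain literal such as [192.168.0.1]; each octet is 0-255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
IPV4_LITERAL = rf"\[{OCTET}(?:\.{OCTET}){{3}}\]"

EMAIL_RE = re.compile(rf"{LOCAL}@(?:{DOMAIN}|{IPV4_LITERAL})")


def is_email(address: str) -> bool:
    """Return True if the whole string matches the sketch pattern."""
    return EMAIL_RE.fullmatch(address) is not None
```

Under these assumptions, is_email("user@example.com") and is_email("user@[192.168.0.1]") pass, while "user@localhost" (no second-level domain), "user@under_score.com" (underscore in the domain) and "John <user@example.com>" (display name) are rejected. Whether an accepted address is realistic is still left to the server-side checks.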

Host name requirements

The DNS specification allows domain names to contain essentially any character--even characters that cannot be displayed. In practice, however, most software on the Internet expects domain names to conform to the host name requirements described in RFC 952 and updated in RFC 1123.

In short, each label of a domain name that conforms to the host name requirements (a label being one of the parts separated by dots) may contain only letters (upper or lower case), digits, and dash characters, and a dash may not be the first or last character of a label.

OK: abc.XYZ 123.456 a-z.0-9
Not OK: a$%.^&z -12.45-  
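
The label rule above can be sketched as a small Python check. This is an illustration under stated assumptions rather than a complete validator: the names LABEL, HOSTNAME_RE and is_hostname are hypothetical, and length limits are not enforced.

```python
import re

# One label: letters and digits, with dashes allowed only in the interior.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# A host name: one or more labels separated by dots.
HOSTNAME_RE = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def is_hostname(name: str) -> bool:
    """Return True if every label in name follows the host name rule."""
    return HOSTNAME_RE.fullmatch(name) is not None


# The examples listed above.
for name in ["abc.XYZ", "123.456", "a-z.0-9", "a$%.^&z", "-12.45-"]:
    print(name, is_hostname(name))
```

The first three names are accepted and the last two are rejected, matching the OK and Not OK lists.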