Regular expression notes
- It is liberal within legal syntax. RFC 2822 permits many more
characters than you might expect, and this regex allows them in the
local part regardless of whether they are common. Likewise, the regex
passes just about any legal domain name, regardless of whether it is
likely. The general philosophy is to let most legal input through the
client-side validation and let the server-side validation check
for realism.
- It is strict when syntax is illegal. Dash characters
("-"), for example, are not allowed at either end of a
domain label. Domains also cannot contain underscores
("_"), at least not domains that conform to the old
host name rules, which are the rules that domains in email addresses
must follow these days.
- It requires at least a second-level domain and
conformance to the host name standards (RFC 952 and RFC 1123).
- It allows only IPv4 domain literals.
- It does not allow display names, quoted literals, comments, or
whitespace. It basically allows a limited version of the addr-spec
form described in RFC 2822.
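
The rules in the list above can be sketched as a pattern built from small pieces. This is only an illustrative Python sketch, not the regex these notes describe: the names `ATEXT`, `LOCAL`, `LABEL`, `HOSTNAME`, `OCTET`, `IPV4_LITERAL`, `ADDR_SPEC`, and `is_plausible_address` are hypothetical, and the 0-255 range check on IPv4 octets is an assumption added here.

```python
import re

# Hypothetical pattern for illustration; not the regex these notes describe.
# Local part: dot-separated runs of RFC 2822 "atext" characters.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
LOCAL = rf"{ATEXT}(?:\.{ATEXT})*"

# Host name label: letters, digits, and dashes, with no dash at either end.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
# At least two labels, so a second-level domain is required.
HOSTNAME = rf"{LABEL}(?:\.{LABEL})+"

# IPv4 domain literal such as [192.168.0.1]; the 0-255 octet range check
# is an assumption of this sketch.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
IPV4_LITERAL = rf"\[{OCTET}(?:\.{OCTET}){{3}}\]"

ADDR_SPEC = re.compile(rf"{LOCAL}@(?:{HOSTNAME}|{IPV4_LITERAL})")


def is_plausible_address(text: str) -> bool:
    """Return True if text matches the limited addr-spec sketch above."""
    # fullmatch rejects surrounding whitespace, display names, and comments.
    return ADDR_SPEC.fullmatch(text) is not None
```

Under these assumptions, `user@example.com` and `user@[192.168.0.1]` are accepted, while `user@localhost`, `user@exam_ple.com`, and `John Doe <john@example.com>` are rejected. Anything that passes would still need the server-side realism checks mentioned above.
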
The DNS specification allows domain names to contain essentially
any character--even characters that cannot be displayed. In practice,
however, most software on the Internet expects domain names to conform
to the host name requirements described in RFC
952 and updated in RFC
1123.
In short, each dot-separated label of a domain name that conforms to
the host name requirements may contain only letters (upper or lower
case), digits, and dash characters, and a dash may not appear at
either end of a label.
| Status | Examples |
|---|---|
| Legal | abc.XYZ, 123.456, a-z.0-9 |
| Illegal | a$%.^&z, -12.45- |
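
The label rule can also be checked on its own, one label at a time. The following is a minimal Python sketch; `LABEL` and `conforms_to_host_name_rules` are hypothetical names, and the sketch covers only the character and dash rules described above, not the length limits in the RFCs.

```python
import re

# Hypothetical helper for illustration. One label: letters and digits,
# with dashes allowed only in the interior.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def conforms_to_host_name_rules(domain: str) -> bool:
    """Check every dot-separated label against the character and dash rules."""
    labels = domain.split(".")
    # An empty label (as in "a..b") fails fullmatch, so it is rejected too.
    return all(LABEL.fullmatch(label) is not None for label in labels)


# Try the sample names listed above.
for name in ["abc.XYZ", "123.456", "a-z.0-9", "a$%.^&z", "-12.45-"]:
    print(name, conforms_to_host_name_rules(name))
```

The first three sample names pass and the last two fail: `a$%.^&z` contains characters other than letters, digits, and dashes, and `-12.45-` has dashes at the ends of its labels.
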