Skip to content

The Email Address Validation Library

License

BSD-2-Clause, Unknown licenses found

Licenses found

BSD-2-Clause
LICENSE
Unknown
LICENSE.postfix
Notifications You must be signed in to change notification settings

gh0stwizard/libeav

Repository files navigation

libeav - Email Address Validation Library

build codecov

Description

This is a library which validates an Email Address. I do my best to support all existing RFCs:

A quick note about RFC 6531: it is possible that the correct implementation SHOULD take into account RFC 20 to exclude some characters: # ` { } | ~ ^. See the RFC 6531 switches section below.

What it does not

You have to be NOT mistaken about the definition of Email Address. This software does NOT implement (at the moment, at least) a full support of Address Specification.

All that this software does it is validating of the Addr-Spec Specification. See details below:

address = mailbox / group
mailbox = name-addr / addr-spec
addr-spec = local-part "@" domain

Features

  • UTF-8 / IDNA2008 support
  • FQDN & TLD check (configurable)
  • Local-Part & Domain length check

Dependencies

Build dependecies

General requirements:

To rebuild the documentation and auto-generate TLD database:

Thread-safety

The library SHOULD be thread-safe. Not tested yet.

Documentation

The documentation may be found either in docs directory or after installing this library via man command:

% man libeav

Build

Automatic

At the moment the build system looking for IDN libraries using next order:

  1. libidn2
  2. libidn
  3. idnkit - see how to configure below

Build manually with specific IDN library

In the case when you have installed libidn2 and libidn or idnkit (i.e. any 2 or more libraries) on the system, but you want to build libeav with specific library you have to tell about that like it shown below:

% make FORCE_IDN=idn
% make check FORCE_IDN=idn
% make install DESTDIR=/path/to/install/dir

The FORCE_IDN option can have the next values (self-described, I hope):

  • idn
  • idn2
  • idnkit

When there is no libidn2.pc file on the system (Debian 8, for instance), then you may do this:

% make FORCE_IDN=idn2 DEFS="-DHAVE_LIBIDN2" LIBS="-lidn2"

Build with idnkit

First of all, you have to specify the directory where idnkit was installed. To do this you have two options:

  • Export the IDNKIT_DIR environment variable as showed below.
  • Manually change paths in Makefiles.

The library assumes that idnkit was installed in /usr/local by default.

Secondly, if you planning to install the library to non-standard directory you may change the destination directory by setting DESTDIR variable manually. The default value of DESTDIR is /usr/local.

% export INDKIT_DIR=/home/user/local
% make
% make check
% make install DESTDIR=/path/to/install

If you have libidn2 or libidn installed, then you must use the FORCE_IDN option as show above.

RFC 6531 switches

It is possible to build the library to follow rules of RFC 5322 and RFC 20. To so, export or define inline the next options:

  • RFC6531_FOLLOW_RFC5322 = ON | OFF (default OFF)
  • RFC6531_FOLLOW_RFC20 = ON | OFF (default OFF)

For instance, define the options above inline:

% make clean
% make check RFC6531_FOLLOW_RFC5322=ON RFC6531_FOLLOW_RFC20=ON

Briefly about the switches and these RFCs:

  • RFC 5322 allows some CONTROL characters in quotes and quoted pairs. When RFC6531_FOLLOW_RFC5322 = OFF, libeav disallows any CONTROL characters within local-part (copy the RFC 5321 behavior).
  • RFC 20 simply disallows the next GRAPHIC characters within local-part: # ` { } | ~ ^.

See the code of is_6531_local.c file for details.

Underscores inside of the domain part

According to "2.3.1. Preferred name syntax" of RFC 1035, a mail domain should satisfy both the rules of the RFC 1035 and those in RFC 822.

Historically, many MX servers do allow underscores in the domain part of an email addresses. The libeav is based on the code base of the postfix, specifically, you may check this by looking at the file src/util/valid_hostname.c of the current postfix distribution.

Since libeav 1.2.0 the new option was added:

  • LABELS_ALLOW_UNDERSCORE = ON | OFF (default OFF)

Pass it to the make command, for instance:

% make FORCE_IDN=idn2 LABELS_ALLOW_UNDERSCORE=ON

Before this, libeav had allowed underscores inside of the domain part of the email addresses by default.

Also should be noted that idnkit prohibits underscores within the domain part of the email addresses. I'm highly recommending using LABELS_ALLOW_UNDERSCORE=OFF along side with idnkit. At the moment, libeav calls idnkit functions to check the domain part only when RFC6531 mode is on.

Updating TLD database

First, there are three files, which corresponds to IANA's Root Zone Database:

The data/tld-domains.txt depends on data/raw.csv. The auto_tld.c and auto_tld.h depends on data/punycode.csv.

Second, to update these files you need Perl and Text::CSV installed on your system.

Third, you have to obtain CSV files by yourself somehow. I am using the application written to this task, you may find it here: github: gh0stwizard/iana-tld-extractor.

To update data/tld-domains.txt use the command below:

% make tld-domains

To update src/auto_tld.c and include/eav/auto_tld.h use the next command:

% make auto

API

When compile your application with idnkit, please, pass the -DHAVE_IDNKIT flag to the compiler.

Please, read the developer documentation for details.

Tools

There is an utility called eav. It will be installed by default to the DESTDIR/bin directory. It has a simple usage:

% eav emails1.txt emails2.txt

See its sources in the bin directory.

The sample output is:

FAIL: localpart-only
      email has no domain part
PASS: i.have.only.ipv4@[8.8.8.8]
PASS: this-is.ipv6@[IPv6:2001:0db8:0000:0000:0000:ff00:0042:8329]
PASS: and.this.is.ipv6-ipv4@[IPv6:::ffff:192.0.2.128]
FAIL: invalid.ipv4@[0.1.2.3]
      invalid ip address
FAIL: invalid.ipv6@[IPv6:ge80:1::]
      invalid ip address
PASS: valid.ipv6@[IPv6:2001:db8:0:1:1:1:1:1]
PASS: legacy.valid.ipv6@[IPv6:2001:db8::1:1:1:1:1]
PASS: xxx.ipv6@[2001:db8:1:1:1:1:1:1]
PASS: [email protected]

Troubleshooting

libidn issues

The idn command depends on the value of CHARSET environment variable. Also, this environment variable affects the behaviour of the libidn library.

If you are getting Character encoding conversion error, then try to check and fix the CHARSET variable on your system. Depending on the type of the operating system, the correct value may be utf-8 or UTF-8, please, refer to the documentation of your system.

See also

Credits

License

This software is released under BSD 2-clause "Simplified" License.