The AjglCsvRfc component offers a drop-in replacement for the native PHP CSV functions to read and write RFC 4180-compliant CSV files.
The native PHP implementation contains bug #50686, marked as "Won't fix", which appears when you write a CSV field that contains the escape character (`\` by default) followed by the enclosure character (`"` by default).
RFC 4180 states that:
> If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote.
The CSV version of the string `"Hello\", World!` should be `"""Hello\"", World!"`, but the native functions do not produce it. You can see a detailed explanation at https://3v4l.org/NnHp4.
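The difference can be seen with native PHP alone. The following is a minimal sketch, not this component's implementation: `rfc4180_encode_field` is a hypothetical helper written only for illustration, and it always encloses the field, which RFC 4180 allows. It builds the expected encoding by doubling each enclosure character and compares it with what the native `fputcsv` writes.

```php
<?php
// Hypothetical helper (not part of ajgl/csv-rfc): encode a single field
// as RFC 4180 describes, doubling each enclosure and wrapping the field.
function rfc4180_encode_field(string $field, string $enclosure = '"'): string
{
    return $enclosure
        . str_replace($enclosure, $enclosure . $enclosure, $field)
        . $enclosure;
}

// Single-quoted PHP string, so the backslash is a literal character.
$field = '"Hello\", World!';
$expected = rfc4180_encode_field($field);

// Write the same field with the native function and its default controls.
$handle = fopen('php://temp', 'w+');
fputcsv($handle, array($field), ',', '"', '\\');
rewind($handle);
$native = rtrim(stream_get_contents($handle), "\n");
fclose($handle);

echo $expected, PHP_EOL; // RFC 4180 form
echo $native, PHP_EOL;   // native form, which leaves the quote after the backslash undoubled
```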
This package provides an alternative implementation to read and write well-escaped CSV files for the following functions and methods:
Native | Alternative |
---|---|
fgetcsv | Ajgl\Csv\Rfc\fgetcsv |
fputcsv | Ajgl\Csv\Rfc\fputcsv |
str_getcsv | Ajgl\Csv\Rfc\str_getcsv |
SplFileObject::fgetcsv | Ajgl\Csv\Rfc\Spl\SplFileObject::fgetcsv |
SplFileObject::fputcsv | Ajgl\Csv\Rfc\Spl\SplFileObject::fputcsv |
SplFileObject::setCsvControl | Ajgl\Csv\Rfc\Spl\SplFileObject::setCsvControl |
To install the latest stable version of this component, open a console and execute the following command:
$ composer require ajgl/csv-rfc
The simplest way to use this library is to call the alternative CSV functions:
use Ajgl\Csv\Rfc;
$handler = fopen('php://temp', 'w+');
Rfc\fputcsv($handler, array('Hello \"World"!'));
rewind($handler);
$row = Rfc\fgetcsv($handler);
rewind($handler);
$row = Rfc\str_getcsv(fgets($handler));
If you prefer, you can use the alternative implementation for SplFileObject or SplTempFileObject:
use Ajgl\Csv\Rfc;
$file = new Rfc\Spl\SplFileObject('php://temp', 'w+');
$file->fputcsv(array('Hello \"World"!'));
$file->rewind();
$row = $file->fgetcsv();
$file->rewind();
$file->setFlags(\SplFileObject::READ_CSV | \SplFileObject::READ_AHEAD | \SplFileObject::SKIP_EMPTY);
foreach ($file as $line) {
    $row = $line;
}
Instead of using the alternative functions or classes, you can use the provided stream filter to fix the enclosure escape. You must register the stream filter (if not registered yet) and append it to your stream:
use Ajgl\Csv\Rfc;
Rfc\CsvRfcWriteStreamFilter::register();
$handler = fopen('php://temp', 'w+');
stream_filter_append(
    $handler,
    Rfc\CsvRfcWriteStreamFilter::FILTERNAME_DEFAULT,
    STREAM_FILTER_WRITE
);
fputcsv($handler, array('Hello \"World"!'));
rewind($handler);
$row = fgetcsv($handler, 0, ',', '"', '"');
❮ NOTE ❯: The $escape_char in fputcsv MUST be (if allowed) the default one (the backslash `\`). The $enclosure and $escape parameters in fgetcsv MUST be equal.
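The reading half of this note can be checked with native PHP only. This is a small sketch that assumes the line was already written in RFC 4180 form. With the escape character set to the same value as the enclosure, only doubled enclosures are treated as escapes, so the backslash is read back as ordinary data.

```php
<?php
// RFC 4180 encoding of the field: "Hello\", World!
// Single-quoted PHP string, so the backslash is a literal character.
$line = '"""Hello\"", World!"';

// Pass the enclosure character as the escape character too.
$row = str_getcsv($line, ',', '"', '"');

// Compare the parsed field with the original value.
var_dump($row[0] === '"Hello\", World!');
```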
By default, the enclosure character of the stream filter is a double quote (`"`). If you want to change it, you can provide a custom enclosure character in two different ways.
An array with an enclosure key can be provided when appending the filter to the stream:
use Ajgl\Csv\Rfc;
$enclosure = '@';
Rfc\CsvRfcWriteStreamFilter::register();
$handler = fopen('php://temp', 'w+');
stream_filter_append(
    $handler,
    Rfc\CsvRfcWriteStreamFilter::FILTERNAME_DEFAULT,
    STREAM_FILTER_WRITE,
    array(
        'enclosure' => $enclosure
    )
);
fputcsv($handler, array('Hello \"World"!'), ',', '@');
rewind($handler);
$row = fgetcsv($handler, 0, ',', '@', '@');
If the filter name starts with the special key csv.rfc.write. you can define your custom enclosure character by appending it to the filter name:
use Ajgl\Csv\Rfc;
$enclosure = '@';
$filtername = 'csv.rfc.write.' . $enclosure;
Rfc\CsvRfcWriteStreamFilter::register($filtername);
$handler = fopen('php://temp', 'w+');
stream_filter_append(
    $handler,
    $filtername,
    STREAM_FILTER_WRITE
);
fputcsv($handler, array('Hello \"World"!'), ',', '@');
rewind($handler);
$row = fgetcsv($handler, 0, ',', '@', '@');
❮ NOTE ❯: The enclosure character passed via parameters will override the one defined via filter name.
By default, the PHP CSV implementation uses LF ("\n") as EOL while writing a CSV row. These alternative functions use LF ("\n") by default too.
However, RFC 4180 states that:
> Each record is located on a separate line, delimited by a line break (CRLF).
So, if you want to write RFC 4180-compliant CSV, you can override the default EOL using:
use Ajgl\Csv\Rfc\CsvRfcUtils;
CsvRfcUtils::setDefaultWriteEol(CsvRfcUtils::EOL_WRITE_RFC);
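To make the effect of the EOL change concrete, here is a native-PHP sketch of what a CRLF-terminated record looks like. It is not how this component works internally: `csv_line_with_crlf` is a hypothetical helper that writes one row with the native `fputcsv` and swaps only the trailing LF for CRLF, leaving any line breaks inside enclosed fields untouched.

```php
<?php
// Hypothetical helper (not part of ajgl/csv-rfc): return one CSV record
// terminated by CRLF instead of the native LF.
function csv_line_with_crlf(array $fields): string
{
    $handle = fopen('php://temp', 'w+');
    fputcsv($handle, $fields, ',', '"', '\\');
    rewind($handle);
    $line = stream_get_contents($handle);
    fclose($handle);

    // Drop only the final LF written by fputcsv, then append CRLF.
    if (substr($line, -1) === "\n") {
        $line = substr($line, 0, -1);
    }

    return $line . "\r\n";
}

// Show the record with its control characters made visible.
echo addcslashes(csv_line_with_crlf(array('a', 'b')), "\r\n"), PHP_EOL;
```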
To read the CSV data, this implementation leverages the native PHP capabilities to read files. If you have any problem with EOL detection, you should enable the auto_detect_line_endings configuration option, following the PHP documentation's recommendation.
ini_set('auto_detect_line_endings', true);
The well-known league/csv package provides a great object-oriented API to work with CSV data, but because it relies on the default PHP implementation for CSV, versions prior to 9.0 are affected by bug #50686. You can use this component with league/csv <9.0 to produce RFC 4180-compatible files and avoid this bug.
To integrate this component with the league/csv writer implementation, you can use the Stream Filter API.
use Ajgl\Csv\Rfc;
use League\Csv\Writer;
// The filter class lives in the Ajgl\Csv\Rfc namespace, imported above as Rfc.
Rfc\CsvRfcWriteStreamFilter::register();
$writer = Writer::createFromPath('/tmp/foobar.csv');
if (!$writer->isActiveStreamFilter()) {
    throw new \Exception("The Stream Filter API is not active.");
}
$writer->appendStreamFilter(Rfc\CsvRfcWriteStreamFilter::FILTERNAME_DEFAULT);
$writer->insertOne(array('"Hello\", World!'));
❮ NOTE ❯: Do not override the default writer escape character (`\`).
- The league/csv package does not support the Stream Filter API when the writer instance is created from a SplFileObject.
- The league/csv implementation will always leverage the standard \SplFileObject::fputcsv to write CSV data, so the fix to write RFC 4180 compatible files from Ajgl\Csv\Rfc\Spl\SplFileObject::fputcsv will be ignored.
To read back the CSV data, you can leverage the standard implementation, but you MUST set the reader escape and enclosure characters to the same value.
use League\Csv\Reader;
$reader = Reader::createFromPath('/tmp/foobar.csv');
$reader->setEscape($reader->getEnclosure());
foreach ($reader as $row) {
//...
}
This component is under the MIT license. See the complete license in the LICENSE file.
Issues and feature requests are tracked in the GitHub issue tracker.
Developed with ♥ by Antonio J. García Lagar.
If you find this component useful, please add a ★ in the GitHub repository page and/or the Packagist package page.