Ruby bindings for Google's Compact Encoding Detection (CED for short) C++ library
You will need CMake to build the C++ native extension.
macOS
You can use Homebrew to install it:
brew install cmake
Then you can install the gem from RubyGems.org.
Either add this to your Gemfile:
gem 'compact_enc_det', '~> 1.0'
or run the following command to install it:
gem install compact_enc_det
Now you can detect the encoding via the CompactEncDet.detect_encoding method, which is a thin wrapper around the CompactEncDet::DetectEncoding and MimeEncodingName functions from the C++ library.
file = File.read("unknown-encoding.txt", mode: "rb")
result = CompactEncDet.detect_encoding(file)
result.encoding # => #<Encoding:Windows-1250>
result.bytes_consumed # => 239
result.is_reliable? # => true
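A common next step after detection is converting the text to UTF-8. The following is a minimal sketch, not part of the gem: `to_utf8` is a hypothetical helper, and the `detector:` and `fallback:` parameters are assumptions made for illustration. It relies only on the `detect_encoding`, `encoding` and `is_reliable?` calls shown above, plus Ruby's standard `String#force_encoding` and `String#encode`.

```ruby
# Hypothetical helper (not provided by the gem): transcodes raw bytes to UTF-8
# using the encoding reported by a detector object.
#
# `detector` is anything that responds to detect_encoding(bytes) and returns an
# object with #encoding and #is_reliable?, for example the CompactEncDet module.
# `fallback` is an assumed default used when detection is not reliable.
def to_utf8(bytes, detector:, fallback: Encoding::UTF_8)
  result = detector.detect_encoding(bytes)

  # Trust the detected encoding only when the detector marks it as reliable.
  source = result.is_reliable? ? result.encoding : fallback

  # dup so that force_encoding does not mutate the caller's string.
  bytes.dup
       .force_encoding(source)
       .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
end

# Usage with the gem (assumes it is installed):
#   require "compact_enc_det"
#   raw  = File.read("unknown-encoding.txt", mode: "rb")
#   text = to_utf8(raw, detector: CompactEncDet)
```

Passing the detector as an argument keeps the helper easy to test with a stub. Whether to fall back to UTF-8 or raise an error on unreliable results depends on your application.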
Any contributions are welcome! Feel free to open an issue or a pull request.
The google/compact_enc_det repository is linked as a Git submodule at ext/compact_enc_det/compact_enc_det.
You need to clone the repository with the --recurse-submodules flag:
git clone --recurse-submodules [email protected]:cloudaper/compact_enc_det.git
Or initialize and update the submodule after cloning with the following commands:
git submodule init && git submodule update
Tests are located in the tests directory and use the minitest framework.
Run the tests via the test Rake task:
rake test
The gem will be compiled to lib/compact_enc_det/compact_enc_det.bundle first.
This gem is released under the MIT license, while the source code of Google's original Compact Encoding Detection library, located at ext/compact_enc_det/compact_enc_det, is under the Apache-2.0 license.