This gem finds exact duplicate files inside the given directories and all of their subdirectories. The result of the run is stored in a file called crf_log.txt. Execution time depends on the number of files and their sizes, so be careful (or patient). An approximate version of the algorithm is also available; it is faster but less accurate.
Add this line to your application's Gemfile:
gem 'crf'
And then execute:
$ bundle
Or install it yourself as:
$ gem install crf
After installing the gem, you can use it in your command line:
crf PATH_1 PATH_2 ... [-f] [-n] [-o]
Or you can use it from any Ruby code:
require 'crf'
path = './test'
options = { interactive: true, progress: true, fast: false }
crf_checker = Crf::Checker.new([path], options)
crf_checker.check_repeated_files
The -f, --fast option only checks whether files have the same size. It is faster, but files of equal size are not necessarily duplicates.
The -n, --no-interactive option keeps the first file of each set of duplicates and removes the rest without asking.
The -o, --no-progress option makes CRF run without showing the progress bar.
By default, CRF compares both the size and the SHA256 checksum of each file, which is more than enough in most cases. When you run the crf command from the command line, interactive mode and the progress bar are enabled by default. When you use the class directly from Ruby code, both are disabled by default.
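To make the difference between the size-only check and the size-plus-SHA256 check concrete, here is a minimal sketch of the general technique. It is not the gem's actual implementation: the `find_duplicates` helper and its `fast:` keyword are hypothetical names chosen for illustration, and only Ruby's standard library is used.

```ruby
require 'digest'
require 'find'

# Hypothetical helper, NOT part of the crf gem's API.
# Returns an array of groups; each group is an array of paths that
# are considered duplicates of one another.
def find_duplicates(paths, fast: false)
  # Collect every regular file under the given directories, recursively.
  files = paths.flat_map do |root|
    Find.find(root).select { |f| File.file?(f) }
  end

  # Step 1: files with different sizes can never be identical,
  # so only groups of two or more same-sized files are kept.
  by_size = files.group_by { |f| File.size(f) }
                 .values
                 .select { |group| group.size > 1 }

  # Fast mode stops here: same size, but not necessarily same content.
  return by_size if fast

  # Step 2: within each size group, compare SHA256 checksums.
  by_size.flat_map do |group|
    group.group_by { |f| Digest::SHA256.file(f).hexdigest }
         .values
         .select { |same| same.size > 1 }
  end
end
```

Grouping by size first means a checksum is only computed for files that already share a size with at least one other file, which is usually a small fraction of the total. The same sketch also shows why the fast mode can report false positives: two different files of equal length end up in the same group.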
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Run rubocop lint (rubocop -R --format simple)
- Run rspec tests (bundle exec rspec)
- Push your branch (git push origin my-new-feature)
- Create a new Pull Request