Ceph File System Remote Sync Daemon
For use with a distributed Ceph File System cluster to georeplicate files to a remote backup server.
This daemon takes advantage of Ceph's rctime directory attribute, which is the value of the highest mtime of all the files below a given directory tree node. Using this attribute, it selectively recurses only into directory tree branches with modified files instead of wasting time accessing every branch.
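To illustrate the pruning idea, here is a minimal shell sketch. It is not the daemon's implementation. It assumes a mounted CephFS, the getfattr tool from the attr package, and that the attribute is exposed as ceph.dir.rctime with epoch seconds before the decimal point. The helper names dir_rctime and changed_dirs are hypothetical.

```shell
# Hypothetical helper: print a directory's recursive change time in whole seconds.
# Assumes CephFS exposes it as the extended attribute ceph.dir.rctime.
dir_rctime() {
    getfattr --only-values -n ceph.dir.rctime "$1" 2>/dev/null | cut -d. -f1
}

# Print every directory under $1 whose rctime is newer than $2 (epoch seconds).
# The function body runs in a subshell, so variables stay local to each call.
changed_dirs() (
    dir=$1
    since=$2
    stamp=$(dir_rctime "$dir")
    # Prune: nothing below this directory changed after $since.
    [ "${stamp:-0}" -gt "$since" ] || exit 0
    printf '%s\n' "$dir"
    # Recurse into subdirectories only (this glob skips hidden directories).
    for child in "$dir"/*/; do
        [ -d "$child" ] || continue
        changed_dirs "${child%/}" "$since"
    done
)
```

For example, changed_dirs /mnt/cephfs 1600000000 would list only the branches touched after that timestamp; an unchanged branch costs a single attribute lookup no matter how many files it holds.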
You must have a Ceph file system. rsync, scp, or similar must be installed on both the local system and the remote backup. You must also set up passwordless SSH from your sender (local) to your receiver (remote backup) with a public/private key pair to allow rsync to send your files without prompting for a password. For compilation, Boost development libraries are needed. The binary provided is statically linked, so the server does not need Boost to run the daemon.
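The passwordless SSH requirement can be met with standard OpenSSH tools. The sketch below assumes it is run as the same user the daemon runs as, and that an ed25519 key at the default path is acceptable. The function name setup_passwordless_ssh is hypothetical.

```shell
# Hypothetical helper: set up key-based login to a receiver such as user@host.
setup_passwordless_ssh() {
    remote=$1
    key="$HOME/.ssh/id_ed25519"
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    # Create a key pair with an empty passphrase unless one already exists.
    [ -f "$key" ] || ssh-keygen -t ed25519 -N "" -f "$key"
    # Install the public key on the receiver (asks for the password once).
    ssh-copy-id -i "$key.pub" "$remote"
    # BatchMode disables password prompts, so this succeeds only if key login works.
    ssh -o BatchMode=yes "$remote" true
}
```

Calling setup_passwordless_ssh root@backup-gw1 (the example host from the sample configuration) returns 0 once the receiver accepts the key without a prompt.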
- Install
- Initialize configuration file:
cephfssyncd -d
(This can be skipped if you installed from .rpm or .deb)
- Edit according to Configuration:
vim /etc/ceph/cephfssyncd.conf
- Verify settings with dry run before seeding:
cephfssyncd -s -d
- Set up passwordless SSH from the sender to the receiver
- Enable daemon:
systemctl enable --now cephfssyncd
yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.13/cephgeorep-1.2.13-3.el7.x86_64.rpm
yum install https://github.com/45Drives/cephgeorep/releases/download/v1.2.13/cephgeorep-1.2.13-3.el8.x86_64.rpm
wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.13/cephgeorep_1.2.13-3focal_amd64.deb
apt install ./cephgeorep_1.2.13-3focal_amd64.deb
wget https://github.com/45Drives/cephgeorep/releases/download/v1.2.13/cephgeorep_1.2.13-3bionic_amd64.deb
apt install ./cephgeorep_1.2.13-3bionic_amd64.deb
- Install Boost (libboost-dev) and Threading Building Blocks (libtbb-dev) development libraries
git clone https://github.com/45drives/cephgeorep
cd cephgeorep
git checkout tags/v1.2.13
make -j8
or
make -j8 static
to statically link libraries
sudo make install
- In the same directory as makefile:
sudo make uninstall
Default config file generated by daemon (/etc/ceph/cephfssyncd.conf):
# local backup settings
Source Directory = # path to the ceph directory you want backed up
Ignore Hidden = false # ignore files beginning with "."
Ignore Windows Lock = true # ignore files beginning with "~$"
Ignore Vim Swap = true # ignore vim .swp files (.<filename>.swp)
# remote settings
Destination = # one or more backup targets (failover only)
# list of destinations can be space or comma separated and Destination can be
# defined multiple times to append more failover targets.
# Destination format: [[user@]host:][path]
# Destination = root@backup-gw1:/tank/backup,root@backup-gw2:/tank/backup
# daemon settings
Exec = rsync # program to use for syncing - rsync or scp
Flags = -a --relative # execution flags for above program (space delim)
Metadata Directory = /var/lib/cephgeorep/ # put metadata on the ceph cluster if
# you want to use pacemaker with
# redundant gateways
Sync Period = 10 # time in seconds between checks for changes
Propagation Delay = 100 # time in milliseconds between snapshot and sync
Processes = 4 # number of parallel sync processes to launch
Threads = 8 # number of worker threads to search for files
Log Level = 1
# 0 = minimum logging
# 1 = basic logging
# 2 = debug logging
# Propagation Delay is to account for the limit that Ceph can
# propagate the modification time of a file all the way back to
# the root of the sync directory.
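The three Ignore settings match on the file name alone. As a rough illustration based only on the comments in the config file (not on the daemon's source), the rules correspond to these shell patterns. The variable names and the is_ignored function are hypothetical.

```shell
# Hypothetical shell variables mirroring the config defaults shown above.
IGNORE_HIDDEN=${IGNORE_HIDDEN:-false}
IGNORE_WINDOWS_LOCK=${IGNORE_WINDOWS_LOCK:-true}
IGNORE_VIM_SWAP=${IGNORE_VIM_SWAP:-true}

# Return 0 if the given path would be skipped, 1 otherwise.
is_ignored() {
    name=${1##*/}   # keep only the file name
    case $name in
        '~$'*)  if [ "$IGNORE_WINDOWS_LOCK" = true ]; then return 0; fi ;;
    esac
    case $name in
        .*.swp) if [ "$IGNORE_VIM_SWAP" = true ]; then return 0; fi ;;
    esac
    case $name in
        .*)     if [ "$IGNORE_HIDDEN" = true ]; then return 0; fi ;;
    esac
    return 1
}
```

With the defaults, a path ending in ~$report.docx or .main.c.swp is skipped, while .bashrc is still synced because Ignore Hidden is false.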
You can also specify a different config file with the command line argument -c or --config, e.g. cephfssyncd -c /alternate/path/to/config.conf. If you are planning on running multiple instances of cephfssyncd with different config files, be sure to have unique paths for Metadata Directory for each config.
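Before launching several instances, a quick check that no two config files share a Metadata Directory can catch mistakes early. This is a sketch that assumes the "Key = value # comment" layout shown in the default config; metadata_dir and check_unique_metadata_dirs are hypothetical helper names, and the daemon's own parser may treat edge cases differently.

```shell
# Print the Metadata Directory value from one config file,
# without the trailing comment, whitespace, or slashes.
metadata_dir() {
    sed -n 's/^[[:space:]]*Metadata Directory[[:space:]]*=[[:space:]]*\([^#]*\).*$/\1/p' "$1" |
        sed -e 's/[[:space:]]*$//' -e 's:/*$::' |
        tail -n 1
}

# Return 1 and report on stderr if any two config files share a metadata path.
check_unique_metadata_dirs() {
    dups=$(for conf in "$@"; do metadata_dir "$conf"; done | sort | uniq -d)
    if [ -n "$dups" ]; then
        printf 'shared Metadata Directory: %s\n' "$dups" >&2
        return 1
    fi
}
```

Run it as check_unique_metadata_dirs followed by the paths of every config file you intend to pass with -c; it compares paths as text, so two different paths that resolve to the same directory through symlinks are not detected.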
* The Ceph file system has a propagation delay for recursive ctime to make its way from the changed file to the top level directory it's contained in. To account for this delay in deep directory trees, there is a user-defined delay to ensure no files are missed. This delay was greatly reduced in the Ceph Nautilus release, so a delay of 100ms is the new default. In testing, this default was enough to sync 1000 files of 1MB each, randomly placed within 3905 directories, without missing one. If you find that some files are being missed, try increasing this delay.
Launch the daemon by running systemctl start cephfssyncd, and run systemctl enable cephfssyncd to enable launch at startup. To monitor output of the daemon, run journalctl -u cephfssyncd -f.
cephfssyncd usage:
cephfssyncd Copyright (C) 2019-2021 Josh Boudreau
This program is released under the GNU General Public License v2.1.
See <https://www.gnu.org/licenses/> for more details.
Usage:
cephfssyncd [ flags ]
Flags:
-c --config </path/to/config> - pass alternate config path
default config: /etc/ceph/cephfssyncd.conf
-d --dry-run - print total files that would be synced
when combined with -v, files will be listed
exits after showing number of files
-h --help - print this message
-n --nproc <# of processes> - number of sync processes to run in parallel
-o --oneshot - manually sync changes once and exit
-q --quiet - set log level to 0
-s --seed - send all files to seed destination
-S --set-last-change - prime last change time to only sync changes
that occur after running with this flag.
-t --threads <# of threads> - number of worker threads to search for files
-v --verbose - set log level to 2
-V --version - print version and exit
Alternate configuration files can be specified using the -c --config flag, which is useful for running multiple instances of cephfssyncd on the same system. -n --nproc, -q --quiet, -t --threads and -v --verbose are used to override options from the configuration file. -s --seed is used for sending every file to the destination regardless of how old the file is. -d --dry-run will run the daemon without actually syncing any files to give the user an idea of how many files would be synced in a real run. -d --dry-run combined with -v --verbose will also list all files that would be synced.
To have cron take care of when syncing happens, make sure that the systemd service is disabled (systemctl disable --now cephfssyncd) and create a cron job entry to execute cephfssyncd --oneshot. This can also be done with systemd timers if the systemd unit file is modified to pass the --oneshot flag to cephfssyncd.
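Cron example: sync every Sunday at 8 AM.
0 8 * * 0 stdbuf -i0 -o0 -e0 cephfssyncd --oneshot 2>&1 | ts '[\%F \%H:\%M:\%S]' >> /var/log/cephgeorep.log 2>&1
# stdbuf unbuffers output; --oneshot syncs once and exits; 2>&1 | pipes stdout and stderr into ts (from moreutils) for timestamps; >> appends to the log file
# in a crontab, % must be escaped as \%, and commands run under /bin/sh unless SHELL is set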
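The systemd timer alternative can be sketched as follows. The function writes a drop-in that makes the service run once with --oneshot, plus a timer that fires on the same schedule as the cron example. The drop-in contents, the file names oneshot.conf and cephfssyncd.timer, and the binary path argument are assumptions; compare them with the unit file shipped in your package before use.

```shell
# Hypothetical helper: write a service drop-in and a timer unit.
#   $1 = unit directory (normally /etc/systemd/system)
#   $2 = full path to the cephfssyncd binary, e.g. "$(command -v cephfssyncd)"
write_timer_units() {
    unit_dir=$1
    bin=$2
    mkdir -p "$unit_dir/cephfssyncd.service.d"
    # The empty ExecStart= clears the packaged command before replacing it.
    cat > "$unit_dir/cephfssyncd.service.d/oneshot.conf" <<EOF
[Service]
Type=oneshot
Restart=no
ExecStart=
ExecStart=$bin --oneshot
EOF
    # A timer activates the service of the same name by default.
    cat > "$unit_dir/cephfssyncd.timer" <<EOF
[Unit]
Description=Weekly cephfssyncd one-shot sync

[Timer]
OnCalendar=Sun *-*-* 08:00:00

[Install]
WantedBy=timers.target
EOF
}
```

After writing the files, run systemctl daemon-reload, then systemctl disable --now cephfssyncd.service and systemctl enable --now cephfssyncd.timer so that only the timer starts the sync.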
Backing up to AWS S3 buckets requires some special configuration. The wrapper script s3wrap.sh included with the binary release allows the daemon to work with s3cmd seamlessly. Ensure s3cmd is installed and configured on your system, and use the following example configuration file as a starting point:
# local backup settings
Source Directory = /mnt/cephfs # full path to directory to backup
Ignore Hidden = false # ignore files beginning with "."
Ignore Windows Lock = true # ignore files beginning with "~$"
Ignore Vim Swap = true # ignore vim .swp files (.<filename>.swp)
# remote settings
# the following settings *must* be left blank for use with s3wrap.sh
Destination =
# daemon settings
Exec = /opt/45drives/cephgeorep/s3wrap.sh # full path to s3wrap.sh
Flags = sync_1 # place only the name of the s3 bucket here
# the rest of settings can remain as default ##########
Metadata Directory = /var/lib/cephfssync/
Sync Period = 10 # time in seconds between checks for changes
Propagation Delay = 100 # time in milliseconds between snapshot and sync
Processes = 1 # number of parallel sync processes to launch
Threads = 8 # number of worker threads to search for files
Log Level = 1
With this setup, cephfssyncd will call the s3cmd wrapper script, which in turn calls s3cmd put ... for each new file passed to it by cephfssyncd, maintaining the directory tree hierarchy.
- Windows does not update the mtime attribute when drag/dropping or copying a file, so files that are moved into a shared folder will not sync if their Last Modified time is earlier than the most recent sync.
- When the daemon is killed with SIGINT, SIGTERM, or SIGQUIT, it saves the last sync timestamp to disk in the directory specified in the configuration file to pick up where it left off on the next launch. If the daemon is killed with SIGKILL or if power is lost to the system causing an abrupt shutdown, the daemon will resync all files modified since the previously saved timestamp.