Help with using Bees.in setup script #254

Open · LeroyINC opened this issue Apr 3, 2023 · 8 comments

LeroyINC commented Apr 3, 2023

I have bees successfully installed/built from source.

I have been reading the documentation, and I see info on how to configure bees without the use of the bees.in script,

but I don't see any information on how to actually use the script.

--

Do I even need to run it as a service?

What I would really like to do is just run bees manually as needed, as my btrfs file system has almost no changes happening.
It's just an archive dump of old data, so even running it once a month or less to find duplicate blocks is more than enough.

Any thoughts or guidance would be great.

kakra (Contributor) commented Apr 3, 2023

Well, I'm using bees as a service without the wrapper script. You could use that as a starting point. It sets up some scheduling and resource parameters (which you probably don't need for your use case), and it statically adds the parameters which are otherwise dynamically created by the wrapper script:

# /etc/systemd/system/bees.service
[Unit]
Description=Bees
Documentation=https://github.com/Zygo/bees
After=local-fs.target
RequiresMountsFor=/mnt/btrfs-pool

[Service]
Type=simple
Environment=BEESSTATUS=%t/bees/bees.status
ExecStart=/usr/libexec/bees --no-timestamps --strip-paths --thread-count=6 --scan-mode=3 --loadavg-target=5 --verbose=5 /mnt/btrfs-pool
CPUSchedulingPolicy=idle
IOSchedulingClass=idle
IOSchedulingPriority=7
KillMode=control-group
KillSignal=SIGTERM
Nice=19
Restart=on-abnormal
ReadWritePaths=/mnt/btrfs-pool
RuntimeDirectory=bees
StartupCPUWeight=25
WorkingDirectory=/run/bees

# Runtime hardening
ProtectProc=invisible
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateNetwork=true
PrivateIPC=true
ProtectHostname=true
ProtectKernelTunables=true
ProtectControlGroups=true
AmbientCapabilities=CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_SYS_ADMIN
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/bees.service.d/override.conf
[Service]
Slice=maintenance.slice
CPUSchedulingPolicy=batch
IOWeight=10
StartupIOWeight=25
MemoryLow=2G

So you could simply run the ExecStart command in your environment and maybe adjust some settings like nice, scheduling policy or logging.
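
For example, a manual one-off run could look roughly like this (just a sketch, assuming the binary path and mount point from my unit above; nice/ionice approximate the service's idle scheduling):

# one-off run; stop it with Ctrl+C or SIGTERM when you're done
sudo nice -n 19 ionice -c 3 /usr/libexec/bees --thread-count=6 --scan-mode=3 --loadavg-target=5 /mnt/btrfs-pool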

LeroyINC (Author) commented Apr 4, 2023

Thanks for the info... I tried this out and I got the following error when trying to start the service.

[root@HOSTNAME system]# systemctl status bees
× bees.service - Bees
Loaded: loaded (/usr/lib/systemd/system/bees.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2023-04-04 17:45:47 EDT; 1min 16s ago
Duration: 58ms
Docs: https://github.com/Zygo/bees
Process: 8148 ExecStart=/usr/bin/bees /var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d (code=exited, status=1/FAILURE)
Main PID: 8148 (code=exited, status=1/FAILURE)
CPU: 59ms

Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<6> bees: set_root_path /var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<6> bees: set_root_fd /var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees:
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees:
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees: *** EXCEPTION ***
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees: exception type std::system_error: openat: /var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d / .beeshome at bees-context.cc:179: No such file or directory
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees: ***
Apr 04 17:45:47 HOSTNAME bees[8148]: 2023-04-04 17:45:47 8148.8148<5> bees: Exiting with status 1 (failure)
Apr 04 17:45:47 HOSTNAME systemd[1]: bees.service: Main process exited, code=exited, status=1/FAILURE
Apr 04 17:45:47 HOSTNAME systemd[1]: bees.service: Failed with result 'exit-code'.
[root@HOSTNAME system]#


Also, here is my service config file:

[Unit]
Description=Bees
Documentation=https://github.com/Zygo/bees
After=local-fs.target
RequiresMountsFor=/var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d

[Service]
Type=simple
Environment=BEESSTATUS=/var/lib/bees/bees.status
ExecStart=/usr/bin/bees /var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d
CPUSchedulingPolicy=idle
IOSchedulingClass=idle
IOSchedulingPriority=7
KillMode=control-group
KillSignal=SIGTERM
Nice=19
Restart=on-abnormal
ReadWritePaths=/var/lib/bees/f56b314c-db00-47ab-b3bb-222ed702682d
RuntimeDirectory=bees
StartupCPUWeight=25
WorkingDirectory=/var/lib/bees/

# Runtime hardening
ProtectProc=invisible
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateNetwork=true
PrivateIPC=true
ProtectHostname=true
ProtectKernelTunables=true
ProtectControlGroups=true
AmbientCapabilities=CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_SYS_ADMIN
NoNewPrivileges=true

[Install]
WantedBy=multi-user.target

kakra (Contributor) commented Apr 4, 2023

I'm mounting with subvolid=0 at /mnt/btrfs-pool. You need to set up a mount point in /etc/fstab matching your service configuration (or use something more accessible, like I do):

# fgrep btrfs-pool /etc/fstab
LABEL=system /mnt/btrfs-pool btrfs noauto,noatime,compress=zstd,subvolid=0,x-systemd.automount

Previously, the wrapper script would have tried to set up the mount. But if you're not using it, you'll have to mount the filesystem via fstab.
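
If you'd rather not touch fstab for occasional manual runs, a one-off mount could look like this (a sketch; the UUID is the one from your service configuration):

mkdir -p /mnt/btrfs-pool
mount -o subvolid=0,noatime /dev/disk/by-uuid/f56b314c-db00-47ab-b3bb-222ed702682d /mnt/btrfs-pool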

LeroyINC (Author) commented Apr 5, 2023

Thanks for the info and help.. appreciate it. I decided to change my plan and run it as a service using the scripts.

I created the directories /run/bees and /etc/bees,
I edited the beesd.conf file and put it in the /etc/bees directory,
and I was able to successfully run the script "sh beesd --verbose 5 f56b314c-db00-47ab-b3bb-222ed702682" without errors.

But I don't see where it created the service... or do I just manually copy the [email protected] files somewhere?

I'm not sure where they should go.

LeroyINC (Author) commented Apr 5, 2023

Update:

I copied the "[email protected]" file to the /usr/lib/systemd/system folder.
I had to edit this file because the ExecStart= line had the wrong location for the bees binary.

Then I ran systemctl daemon-reload,
and then I ran "systemctl enable --now beesd@f56b314c-db00-47ab-b3bb-222ed702682d".

But now I have this issue (getting closer, though):

[root@HOSTNAME scripts]# systemctl status beesd@f56b314c-db00-47ab-b3bb-222ed702682d
× [email protected] - Bees (f56b314c-db00-47ab-b3bb-222ed702682d)
Loaded: loaded (/usr/lib/systemd/system/[email protected]; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2023-04-04 22:19:13 EDT; 1min 32s ago
Duration: 160ms
Docs: https://github.com/Zygo/bees
Process: 5225 ExecStart=/usr/bin/bees --no-timestamps f56b314c-db00-47ab-b3bb-222ed702682d (code=exited, status=1/FAILURE)
Main PID: 5225 (code=exited, status=1/FAILURE)
CPU: 61ms

Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: setting root path to 'f56b314c-db00-47ab-b3bb-222ed702682d'
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: set_root_path f56b314c-db00-47ab-b3bb-222ed702682d
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]:
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]:
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: *** EXCEPTION ***
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: exception type std::system_error: open: name 'f56b314c-db00-47ab-b3bb-222ed702682d' mode 777 flags O_CLOEXEC|O_DIRECTORY|O_LARGEFILE|O_NOATIME|O_NOCTTY|O_NOFOLLOW|O_NONBLOCK at fd.cc:264: No>
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: ***
Apr 04 22:19:13 HOSTNAME bees[5225]: bees[5225]: Exiting with status 1 (failure)
Apr 04 22:19:13 HOSTNAME systemd[1]: [email protected]: Main process exited, code=exited, status=1/FAILURE
Apr 04 22:19:13 HOSTNAME systemd[1]: [email protected]: Failed with result 'exit-code'.

kakra (Contributor) commented Apr 5, 2023

I copied the "[email protected]" file to the /usr/lib/systemd/system folder.
I had to edit this file because the ExecStart= line had the wrong location for the bees binary.

Don't copy these files; instead, create a directory /etc/systemd/system/[email protected], and in there create a file path-fix.conf (the name is actually free to choose). You can automate this with systemctl edit [email protected]; it will automatically create the directory and an override file.

In this file, put these lines (and only these lines):

[Service]
ExecStart=
ExecStart=<CORRECTEDCALLHERE>

The empty ExecStart= declaration is important to clear out the existing ExecStart declarations - otherwise the new line would only add another execution command. If using systemctl edit, it will also show the existing config commented out, so you can simply copy lines and adjust them; it's actually very convenient.

systemctl edit --full will automatically copy the full file and edit it (not recommended unless you know exactly why you need it). After systemctl edit, there's no need for a daemon-reload - it's done automatically, and syntax errors will be reported.

This way, your own adjustments will always derive from the system-installed service file, even after updates, and updates won't overwrite your local adjustments.

The problem you're facing is probably due to mount namespacing: the service is quite heavily locked down and sees only a few specific writable directories, maybe even in different locations than on your host system. It expects the btrfs root mounted within /run/bees, which is set up automatically by systemd because of RuntimeDirectory=bees (and can be referenced via the %t placeholder).

You can use systemctl cat to show all applied systemd configuration data, and systemctl show to show the combined effective settings (I think the latter might need the service running).
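
E.g., with the instance name from this thread:

systemctl cat beesd@f56b314c-db00-47ab-b3bb-222ed702682d
systemctl show -p ExecStart beesd@f56b314c-db00-47ab-b3bb-222ed702682d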

LeroyINC (Author) commented Apr 6, 2023

OK, I think I have most of this figured out.. thanks for the assist here.

One final question.. my system has very few new writes, so I am not too worried about write performance.
But my file system is large... 64TB in size. So I am interested in having the scan run as fast as possible...

The bees command line options are pretty straightforward..
except for --loadavg-target LOADAVG (or -g):
when a number is used here, what does it mean?

Is it a percentage of CPU, so 100 would mean keeping all cores busy all the time?
Or is it some number related to disk usage?

What "system load" is it looking at? What number would I use to max out my system and scan as aggressively as possible?

Also, how do I know if scanning is complete, or how much has been scanned?

kakra (Contributor) commented Apr 6, 2023

loadavg is a standard (and very classic) system metric in Linux and Unix-likes. Traditionally, on Unix, it counted the average number of processes waiting in the scheduler queue and ready to run (state "runnable").

I think (but I may be wrong here) Linux later added to that counter the number of processes waiting on resources other than just CPU (state "disk sleep") - probably back in the 90s - OTOH, maybe it had that from the beginning of its lifetime, I don't know. Technically, today it's the average length of the scheduler run queue over a period of time, counting processes in the runnable and disk-sleep states.

So if your system has 4 cores, and your loadavg hovers around 4, it means you are using the resources optimally, because (for CPU-bound tasks at least) there's always exactly one process per core either running or ready to run immediately. Go beyond that, and your system will run slower than it could because it's starving for resources; stay below, and your system is able to immediately run the next process. Think of it as: higher loadavg = higher perceived latency of operation (although it doesn't measure latency but queue length, which can translate to any amount of random latency).

The reality is a little different because loadavg does not count only a single resource (CPU, or IO, or memory) - it's a mixed bag. But the general idea still holds: if your system has 8 cores, run bees with a limit of 4 to cap it at a fair share of roughly 50% of resources (whatever that mixture is). More practically, look at your loadavg while you're doing your usual work on the machine, and then set the bees limit one or two above that: this generally gives a more realistic value that lets bees actually make progress without disturbing your workflow or performance. If you see your system blocking on IO, lower the bees limit a little; if you feel like bees could do more without disturbing you, raise it a little. (See /proc/loadavg.)

loadavg is reported as three values, in this order: short average (last minute), medium average (last 5 min), long average (last 15 min). From these three values you can tell whether your load is trending up, trending down, or staying about the same. Modern kernels append some more values not of importance here: runnable processes (length of the run queue) slash total scheduling entities, and the most recently created PID. "Runnable processes" are processes not waiting for a resource, scheduled for execution on the CPU but currently waiting in the queue to be run.
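
For example (illustrative output, not from a real system):

$ cat /proc/loadavg
3.92 4.10 4.05 2/523 12345

Here the 1-minute average (3.92) is slightly below the 5- and 15-minute averages, so the load is trending down.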

Modern Linux has a better measurement for resource usage, called PSI: pressure stall information. It measures, as a percentage, how much time the system has been partially blocking processes ("some") or blocking all processes ("full"; not available for CPU pressure, for obvious reasons) while waiting on a particular resource, again averaged over a period of time (10s, 60s, 300s), plus the total number of microseconds spent waiting on that resource. If you read the values multiple times, you can use the delta of the total microseconds together with the percentage values to compute the average wait time. (See /proc/pressure/{cpu,io,memory}.)
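
Again illustrative:

$ cat /proc/pressure/io
some avg10=1.23 avg60=0.98 avg300=0.76 total=123456789
full avg10=0.45 avg60=0.31 avg300=0.22 total=45678901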

Also, how do I know if scanning is complete, or how much has been scanned?

You really never know - the job as implemented in bees is designed as an endless job. But you can note the generation number of your subvolumes before starting bees, and then run bees until its state file reaches this number. Technically, it has done a full pass then. But practically, bees has created a lot of new writes, and your system has created a lot of new writes, resulting in a higher current generation number of the subvolume, so bees hasn't really seen all the data it could have seen while it was running.
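
To read a subvolume's current generation before starting bees (the mount point is an example), something like:

btrfs subvolume show /mnt/btrfs-pool | grep -i generation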

BTW: While any IO-intensive task is active, IO is the dominating contributor to loadavg, so it is a good value for bees to throttle itself on. In this case, you want to choose a value that accounts more for the number of spindles/disks than for the number of CPU cores - so maybe choose something in between those two values (this pretends that all your data is optimally and equally distributed across all spindles for the access patterns of bees, which is practically never the case). Or choose something like half of your spindles... You may need to experiment a little, but I think you get the idea.
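
E.g., with 8 cores and 4 disks, a starting point in between might be (a sketch; binary path and thread count as in my unit above):

/usr/libexec/bees --thread-count=6 --loadavg-target=6 /mnt/btrfs-pool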
