
possible to filter-out some files? #165

Open
Barabba11 opened this issue Jul 30, 2021 · 8 comments

Comments

@Barabba11

Hi, I'm trying to reduce writes to the SD card while still keeping the system logging what is happening, for debugging. There are some useless files full of zeros that there is no point in storing; can I create a list of unwanted files that I don't want saved? For example *.dmp.
Also, every time I do a sudo reboot now, does log2ram sync to disk before the reboot?
Thanks

@xtvdata

xtvdata commented Jul 30, 2021

I'd suggest managing what is logged from the rsyslog/syslog-ng configuration (which is the proper place to implement filters on what goes into the logs).
Even better, you should adjust the log level directly in the application that generates the log.
The need to log something while at the same time not wanting it to survive a reboot is not clear to me. Moreover, a "dump" is not really a log, so it should not be saved in /var/log but probably somewhere else.

Anyway, the main point is that I would rather avoid overcomplicating the logic in log2ram when it's already possible to leverage other system tools that are available. This reduces potential maintenance issues and avoids duplicating the same type of logic in multiple places.

Possible solution:
If you really don't care about persisting those files, and you cannot adjust the log level or filter them via syslog, it would probably be better to create a dedicated RAM disk (mounted somewhere, e.g. /tmp/volatile-logs) and configure your application to save its logs, dumps, etc. there.
In addition you could create a systemd service/timer to periodically clear the contents of /tmp/volatile-logs.
Ref.: Create a ramdisk on linux
Note: tmpfs disks only use the space actually required, so whether you save the .dmp files in /var/log with log2ram or in /tmp/volatile-logs, they will use the same amount of RAM, and no more than required, in both cases.
P.S.: if your logs are generated via rsyslog/syslog-ng you can also decide to reroute a subset of the logs to /tmp/volatile-logs... (still, I don't understand this need... even devices without persisted logs, by design or hardware limitation, such as routers, should send their logs to a log server to persist them at least for some time ;-) )
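A minimal sketch of that setup, assuming the /tmp/volatile-logs path used above (the tmpfs size, unit names, and schedule below are illustrative assumptions, not part of log2ram):

```ini
# /etc/fstab — a small dedicated tmpfs (64 MB is an arbitrary example size)
tmpfs  /tmp/volatile-logs  tmpfs  defaults,noatime,size=64m  0  0

# /etc/systemd/system/clear-volatile-logs.service
[Unit]
Description=Clear volatile log directory

[Service]
Type=oneshot
ExecStart=/usr/bin/find /tmp/volatile-logs -mindepth 1 -delete

# /etc/systemd/system/clear-volatile-logs.timer
[Unit]
Description=Periodically clear volatile logs

[Timer]
OnCalendar=daily

[Install]
WantedBy=timers.target
```

After placing the units you would enable the timer with `systemctl enable --now clear-volatile-logs.timer`.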

@Barabba11
Author

Thank you a lot for your care and exhaustive explanation :)
Honestly I have no idea where those files come from. Debugging it may take a long time and be unsuccessful (
I was thinking of excluding some files easily; maybe you could kindly consider implementing that in the future, thanks )
The idea of deleting those files may be interesting, but it would have to be done just before log2ram starts syncing, and I'm not sure whether it might affect/crash some processes.
I can't use another folder because I want to save the syslog files and anything else that may appear in the log folder, I hope you understand.
About reboot: I mean, when typing sudo reboot now, will log2ram sync before the Pi restarts? Or is the daemon simply killed? Thanks

@xtvdata

xtvdata commented Aug 1, 2021

Hi Barabba,
just to clarify, I'm not the author (@azlux is the author).

The chain to generate a log can be either:

Application > [create log in /var/log]

or

Application > [send log message to syslog] > rsyslog/syslog-ng > [save log to /var/log]

In the first case it's the application's job to define both the log location and the granularity of the log.

In the second case, again, the application should manage the granularity of the log and rsyslog/syslog-ng should take care of the location (and optionally of the granularity, but if possible that should be managed only by the application).
The point is to avoid, as much as possible, distributing the responsibility for the same action (e.g. log granularity/filtering) between different entities; otherwise, in case of problems, you will not be able to clearly understand which configuration you should change.

log2ram comes in after those log chains and is responsible only for:

  • creating a dynamic (optionally compressed) RAM disk
  • mounting it at /var/log
  • periodically syncing the content of the RAM disk with the content of the actual disk
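For reference, those behaviours map to a handful of settings in /etc/log2ram.conf; the names below come from a typical install and may differ between versions:

```ini
# /etc/log2ram.conf (typical defaults — check your installed version)
SIZE=40M             # size of the RAM disk mounted at /var/log
MAIL=true            # send an error mail if the RAM disk fills up
PATH_DISK="/var/log"
ZL2R=false           # set to true to use a compressed zram device instead of tmpfs
```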

Regarding your unwanted log files: it should be quite easy to find what is creating them (from their names, *.dmp, they look more like data dumps, or at least they seem to have a different purpose than logs, so the proper place for them is probably a "backup directory" or /tmp, not with the system logs).
I see you are running a Raspberry Pi, which is usually Debian 10. A standard Debian install (at least a minimal server install, or any application I know of) should not create any of those files, so I believe they must come from something you installed...
I'd suggest checking what you installed and the configuration of that software.

Concerning your question about reboot: on any standard systemd event, such as start/stop/restart, log2ram syncs the content of the RAM disk to /var/log on "/" (the SD card or whatever drive the system is installed on).
On start, the RAM disk is empty, so the content of the disk is copied into RAM.
On stop, the changed files in RAM are copied to the disk.
The sync is skipped only in case of a hard crash of the kernel (where the system is locked and no operation can actually be performed).

P.S.: If your goal is only to spare the SD card by reducing the number of files written at sync time, that won't change much: log2ram is already sparing your SD card; an additional file is just a single additional write to the disk once per day (with the base configuration of log2ram). The issue with log writes to the SD card is that normally there are multiple writes per second; one file written per day won't significantly change the life span of your SD card, and an application like a database will do much more damage ;).

@azlux
Owner

azlux commented Aug 3, 2021

As @xtvdata has already explained, using the syslog engine helps to filter logs.
log2ram has no way to filter files.
This tool uses a mount point, so the full folder is managed without distinction.

Azlux

@Barabba11
Author

Thank you mates, it's much clearer now. I suggest you include this information (copying this text would be enough) in the program's documentation, so in the future there may be fewer questions like this.
About my issue: as I understood it, log2ram will not write all the files present in RAM every time, but just new files or modifications of existing files (in this case I think it will rewrite the whole file if the size or last-modified date mismatch, right? Or does it append just the last bytes added, compared with the backup?)
For example I have this:
Aug 1 23:55:02 RaspbFranco log2ram[20940]: sent 1,958,720 bytes received 18,495 bytes 3,954,430.00 bytes/sec
Aug 1 23:55:02 RaspbFranco log2ram[20940]: total size is 8,583,524 speedup is 4.34
Should I assume that the entire log folder is about 2 MB but only around 20 KB were written to the SD?

About the .dmp files, I have no idea which app creates them. I have Node-RED with some modules, maybe one of them; I have Python and other services installed; it could be anything... but I understand that log2ram can't filter them out.

You are Linux gurus: do you know if there exists any program that keeps track of how many bytes are written to the SD and which process generates them? This program should work in RAM only and must not log to the SD. In the past I used one (I don't remember which) that caused the opposite problem: it logged so much that in 3 months my SD card failed :(
Thank you a lot!

@xtvdata

xtvdata commented Aug 4, 2021

Hi,
in order to perform the synchronization, log2ram leverages rsync (cp is used only as a fallback in case rsync is not available).
rsync copies to the destination only new files and files that have changed since the last synchronization.
Note: it is NOT a differential bit-by-bit check, but file-by-file. Doing it bit-by-bit would generate much more load on the system (and the disk) than is required to just copy the whole file.

For your specific case.

Node-RED (or rather, one of the installed modules) could be responsible for those files. If your Node-RED is managing a lot of messages and you have left a debug option enabled in a module configuration (or if one of those modules is in beta and has debug turned on by default due to its development stage), it might very well generate a huge amount of logs.
I'd suggest checking the Node-RED modules to see if one of them is in debug mode or is responsible for dumping data to files (you can probably skip all the standard modules, or check them for a debug option as a last resort).

To find which process is responsible for disk usage you can use atop; however, it is not immediately usable. You need to install it, start the daemon, and wait for it to collect data to analyze later. It collects a lot of data, so its load is not negligible (not recommended on a production system). Also, if the issue is in one of Node-RED's modules, the result could just generically point to the Node-RED process... which would only confirm that it's one of the Node-RED modules.

P.S.: in addition, remember that it is very important to also set up correct rules for logrotate, which prevents the files in /var/log from exploding by removing old data. If you install software that logs directly to files with non-standard names (e.g. .dmp...) or to files placed in non-standard locations (e.g. subdirectories of /var/log), those files will NOT be rotated automatically and you will have to add a rule in /etc/logrotate.d/ to manage them properly.
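As a sketch, a drop-in rule for those hypothetical *.dmp files could look like this (the path, frequency, and retention count are all assumptions to adapt):

```conf
# /etc/logrotate.d/dmp-files — hypothetical example
/var/log/*.dmp {
    daily
    rotate 3
    compress
    missingok
    notifempty
}
```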

@Barabba11
Author

Hi mate, thanks for all the info! What do you mean by "rotate"? Do you mean backed up onto the SD every X hours, like log2ram is supposed to do? Do you mean then that only default files without subdirectories are backed up? Strange, because I see subdirectories there..

About atop: once I tried iotop and my SD card burned out in about a month. My RPi3 runs really quietly; there shouldn't be processes that require many resources. There is only Node-RED, and verbose logs are deactivated. Anyway, these tools write their logs to the SD and that is pretty dangerous: they trace everything and generate lots of writes when not configured properly. I suppose I'll avoid it and focus on what I can slim down.

I have another question: do you know if the Node-RED file module can always write to and read from /var/log? I want to implement an internal log, so I did a chmod 777 /var/log to be able to drop a file there from the terminal. Will this setting persist, or after a reboot will I have 755 again?
Thank you

@xtvdata

xtvdata commented Aug 20, 2021

Hi,

> Hi mate, thanks for all the info! What do you mean by "rotate"? Do you mean backed up onto the SD every X hours, like log2ram is supposed to do? Do you mean then that only default files without subdirectories are backed up? Strange, because I see subdirectories there..

logrotate is a standard Linux tool, installed in every distribution I know of, which takes care of rotating logs. For each managed log file, logrotate periodically (e.g. every night or every week) renames the file by adding ".1" to its name; if a .1 file already exists, logrotate renames it to .2, and so on. Depending on the logrotate configuration, it can also gzip the copies and delete old files (you can decide how many rotated files to keep before deletion).
Concerning subdirectories: the standard logrotate config rotates only the standard log files and some files ending with ".log" (see the logrotate config for more details). Usually you need to configure logrotate to manage additional logs, which can also appear in subdirectories of /var/log. Moreover, you can define different archive directories for older logs to keep the main log directory more readable.

See here for some more details (Ubuntu and Debian are very similar): https://www.digitalocean.com/community/tutorials/how-to-manage-logfiles-with-logrotate-on-ubuntu-16-04

> About atop: once I tried iotop and my SD card burned out in about a month. My RPi3 runs really quietly; there shouldn't be processes that require many resources. There is only Node-RED, and verbose logs are deactivated. Anyway, these tools write their logs to the SD and that is pretty dangerous: they trace everything and generate lots of writes when not configured properly. I suppose I'll avoid it and focus on what I can slim down.

“atop” and “iotop” are not tools to run continuously; they should be activated ONLY to track down the origin of an issue. Besides killing the SD card, they also eat a fair amount of resources…
Node-RED is fantastic and has a solid engine; however, many modules are not that good, and some of them (even some of the most used ones) are still in beta (or even alpha) stage and can produce huge amounts of logs (debug level hard-coded or enabled by default). I use Node-RED myself and I’ve removed any module not mature enough, preferring less feature-rich but more mature modules, or even rewriting by myself what I really need that is not included in a module of decent quality (after all, it’s just Node.js JavaScript).

> I have another question: do you know if the Node-RED file module can always write to and read from /var/log? I want to implement an internal log, so I did a chmod 777 /var/log to be able to drop a file there from the terminal. Will this setting persist, or after a reboot will I have 755 again?

Node-RED file access is constrained only by file permissions. You can write or read anywhere in the file system, as long as the user running Node-RED can access the location.
I’d advise against doing chmod 777 on /var/log since it’s a big security hole (/var/log contains a lot of sensitive information). If you need to, I’d suggest using a subdirectory like /var/log/my-nodered-logs and chowning that directory to the Node-RED user only. After that you should configure a logrotate rule to “rotate” the logs in /var/log/my-nodered-logs in order to avoid ever-growing logs.
Concerning permissions: all permissions in /var/log should be kept during sync, and thus should survive a reboot. However, if you change the permissions or ownership of /var/log itself (the mount point of the RAM disk), that could cause some issues. But, again, you should NEVER change the standard permissions/ownership of standard Linux directories, as that could lead to security issues and/or break standard functionality.
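A safe way to try the restricted-subdirectory idea (the directory name and the "nodered" user below are assumptions; the demo uses a temp dir so it can be run without root):

```shell
# Demo: create a restricted log subdirectory and check its mode
logdir=$(mktemp -d)/my-nodered-logs
mkdir -p "$logdir"
chmod 750 "$logdir"              # owner rwx, group r-x, others none
stat -c '%a' "$logdir"           # → 750

# On the real system you would instead run, as root (names are assumptions):
#   mkdir /var/log/my-nodered-logs
#   chown nodered:adm /var/log/my-nodered-logs
#   chmod 750 /var/log/my-nodered-logs
```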

P.S.: if you want to implement some kind of customized log in Node-RED, you should also consider sending the messages to syslog via a TCP/UDP port or even a Unix socket. That way the custom log entries would end up in /var/log/syslog, but you can also split them out to a dedicated file with a one-line rule in the rsyslog configuration. This would give you a truly standard logging flow that could even be rerouted easily to a separate server used to store logs (you just need to add a forward rule to the rsyslog config and all entries will flow to the log server over the local network, though the details may vary a lot depending on the kind of log server you use).
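Such an rsyslog split rule might look like this, assuming the Node-RED messages arrive with the syslog tag nodered (the tag, file name, and path are assumptions):

```conf
# /etc/rsyslog.d/30-nodered.conf — hypothetical drop-in
if $programname == 'nodered' then /var/log/nodered.log
& stop
```

The `& stop` line prevents the same entries from also being duplicated into /var/log/syslog.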

P.P.S.: burning an SD card in one month seems a bit extreme… all of my SD cards have lasted several years (I still have one RPi1 with its first SD card working…), though I’m moving to SSDs as much as possible. If you write so much to the SD card that you burn it, the system load is probably quite high just managing the I/O (see top’s iowait).
