NetBSD irqstats plugin produces faulty output, causing storage blowout on server #1537

Open
als-git opened this issue May 11, 2023 · 1 comment

Comments

@als-git

als-git commented May 11, 2023

Describe the bug
The bundled plugins/node.d.netbsd/irqstats.in for munin-2.0.69 produces incorrect output that includes the value
of an entry in the name of the entry, so large numbers of unique entries appear. I found this while
investigating why munin would (rapidly) consume 8+ GB for fewer than 100 nodes. Because of this bug, 10k+ files
are easily created within a few days. The collected data is also useless, for the same reason.

The faulty output looks like this (short excerpt only):

intr_msix2_vec_0_____45173650.value 451736504
intr_msix2_vec_1______9604523.value 96045232
intr_msix2_vec_2_____10494045.value 104940458
intr_msix2_vec_3______6304586.value 63045862

The expected output would be this:

msix2_vec_0.value 451984907
msix2_vec_1.value 96049563
msix2_vec_2.value 105022424
msix2_vec_3.value 63064578
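
To spot affected fields in existing plugin output, a pattern match on the field name may be enough. The sketch below assumes that polluted names always end in a run of two or more underscores followed by digits, as in the excerpt above. The function name polluted_fields is made up for illustration, and a legitimate interrupt name that happens to end the same way would be flagged too.

```shell
#!/bin/sh
# Hypothetical helper: reads "name.value number" lines on stdin and prints
# only the names that look polluted, i.e. end in "__<digits>" or longer runs.
# The pattern is an assumption derived from the excerpt in this issue.
polluted_fields() {
    # Strip the ".value <number>" part, then keep names matching the pattern.
    sed 's/\.value .*$//' | grep -E '_{2,}[0-9]+$'
}

# Sample lines copied from the excerpts above.
printf '%s\n' \
    'intr_msix2_vec_0_____45173650.value 451736504' \
    'intr_msix2_vec_1______9604523.value 96045232' \
    'msix2_vec_0.value 451984907' | polluted_fields
```

Only the two intr_..._____ names are printed; msix2_vec_0 ends in a single underscore plus digit and is left alone.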

To Reproduce
Steps to reproduce the behavior:
Invoke the irqstats plugin on a sufficiently recent NetBSD machine (observed with at least NetBSD 9.0, but I'm
pretty sure I saw it on earlier ones too; reproduced on i386, amd64, sparc64) and check the output:

intr_msix2_vec_0_____45173650.value 451736504
intr_msix2_vec_1______9604523.value 96045232
intr_msix2_vec_2_____10494045.value 104940458
intr_msix2_vec_3______6304586.value 63045862

Expected behavior
Expected output would look like this:

msix2_vec_0.value 451984907
msix2_vec_1.value 96049563
msix2_vec_2.value 105022424
msix2_vec_3.value 63064578
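
One way to get names like these without depending on column widths is to count fields from the right. This is a sketch only: it assumes every data line of vmstat -i has the form "name words, total, rate", with the last two columns numeric. The function name irq_values and the sample rate figures are made up for illustration; the names and totals come from the expected output above.

```shell
#!/bin/sh
# Sketch: turn "vmstat -i"-style text on stdin into Munin "name.value total" lines.
# Assumed layout per data line: <name words...> <total> <rate>
irq_values() {
    awk '
        # Skip the header and the summary line.
        $1 == "interrupt" || $1 == "Total" { next }
        # Require at least one name word plus two numeric columns.
        NF >= 3 && $(NF-1) ~ /^[0-9]+$/ && $NF ~ /^[0-9]+$/ {
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name "_" $i
            # Replace anything that is not a letter, digit or underscore.
            gsub(/[^A-Za-z0-9_]/, "_", name)
            print name ".value " $(NF-1)
        }'
}

# Made-up sample input; the column layout and the rates are assumptions.
irq_values <<'EOF'
interrupt                                     total     rate
msix2 vec 0                               451984907       45
msix2 vec 1                                96049563        9
Total                                     548034470       54
EOF
```

Because the total is always taken as the second-to-last field, a wide counter cannot spill into the name, whatever the padding looks like.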

Desktop (please complete the following information):
OS: NetBSD 9.1, NetBSD 9.2, NetBSD 9.3
Munin version: 2.0.69

Additional context

I've been running a rewritten version of the irqstats plugin for NetBSD for at least a few months now. It
works and produces the expected output:

#! /bin/sh
# 
# Plugin to monitor the individual interrupt sources.
#
# Usage: Link or copy into /etc/munin/node.d/
#
# $Log: irqstats.in,v $
# Revision 1.1.1.1  2006/06/04 20:53:57  he
# Import the client version of the Munin system monitoring/graphing
# tool -- project homepage is at http://munin.sourceforge.net/
#
# This package has added support for NetBSD, via a number of new plugin
# scripts where specific steps needs to be taken to collect information.
#
# I also modified the ntp_ plugin script to make it possible to not
# plot the NTP poll delay, leaving just jitter and offset, which IMO
# produces a more telling graph.
#
#
#
# Magic markers (optional - only used by munin-config and some
# installation scripts):
#
#%# family=auto
#%# capabilities=autoconf

if [ "$1" = "autoconf" ]; then
    if [ -x /usr/bin/vmstat ]; then
        echo yes
        exit 0
    else
        echo no
        exit 1
    fi
fi

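# Interrupt source names: collapse column padding to "|", turn the remaining
# spaces into "_", keep lines containing an alphanumeric, take the name column.
intr_sources=$(/usr/bin/vmstat -i|grep -v Total|grep -v 'total rate'|sed -E 's/ {2,}/|/g'|sed 's/ /_/g'|grep -e '[[:alnum:]]'|cut -d\| -f1)
# Debug output, disabled so that it does not end up in the plugin's output:
# echo "intr_sources = |$intr_sources|"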

# If run with the "config"-parameter, give out information on how the
# graphs should look.

if [ "$1" = "config" ]; then

    echo 'graph_title Individual interrupts'
    echo 'graph_args --base 1000 -l 0'
    echo 'graph_vlabel interrupts / ${graph_period}'
    echo 'graph_category system'
    echo -n 'graph_order '
    for i in $intr_sources; do
        echo -n ' intr_'${i}
    done
    echo

    for i in $intr_sources; do
#       echo 'intr_'${i}'.draw LINE'
        echo 'intr_'${i}'.label' `echo $i | sed -e 's/_/ /g'`
        echo 'intr_'${i}'.info Interrupt' `echo $i | sed -e 's/_/ /g'`
        echo 'intr_'${i}'.type DERIVE'
        echo 'intr_'${i}'.min 0'
    done
    exit 0
fi

# Otherwise print the current counters as "<name>.value <total>".
/usr/bin/vmstat -i|grep -v Total|grep -v 'total rate'|sed -E 's/ {2,}/|/g'|sed 's/ /_/g'|sed 's/|/ /g'|grep -E '[[:alnum:]]'|awk '{print $1 ".value " $2}'
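
Munin matches fetched values to configured fields by name. In the script above, the config section declares names with an intr_ prefix while the final pipeline prints them without one, so it may be worth checking that the two agree. The sketch below is a hypothetical check, not part of the plugin: undeclared_fields and the temporary file names are made up, and it only looks at .label and .value lines.

```shell
#!/bin/sh
# Hypothetical check: print field names that appear in fetch output
# but were never declared (via a .label line) in config output.
# $1 = file holding config output, $2 = file holding fetch output.
undeclared_fields() {
    cfg=$(sed -n 's/^\([A-Za-z0-9_]*\)\.label .*$/\1/p' "$1" | sort -u)
    sed -n 's/^\([A-Za-z0-9_]*\)\.value .*$/\1/p' "$2" | sort -u |
    while read -r f; do
        # -x: whole-line match, -F: fixed string, -q: quiet
        printf '%s\n' "$cfg" | grep -qxF "$f" || echo "$f"
    done
}

# Made-up sample data mirroring the naming used in this issue.
tmp=$(mktemp -d) || exit 1
printf '%s\n' 'intr_msix2_vec_0.label msix2 vec 0' > "$tmp/config"
printf '%s\n' 'msix2_vec_0.value 451984907' > "$tmp/fetch"
undeclared_fields "$tmp/config" "$tmp/fetch"
rm -rf "$tmp"
```

On a real node the two files could be filled from the plugin itself, for example by redirecting the output of a config run and a normal run. Any name this prints is one the master would not have a definition for.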
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this issue May 11, 2023
Copy the current state of sysutils/munin-node from pkgsrc as
prep for working on the package to locally fix the irqstats bug,
upstream: munin-monitoring/munin#1537

Signed-off-by: Alexander Schreiber <[email protected]>
wip-sync pushed a commit to NetBSD/pkgsrc-wip that referenced this issue May 11, 2023
The munin-node irqstats plugin produces invalid data by incorporating
the field value into the field name, leading to all kinds of annoyances

upstream bug: munin-monitoring/munin#1537

Signed-off-by: Alexander Schreiber <[email protected]>
@kenyon
Member

kenyon commented Nov 23, 2023

If someone submits a pull request, eventually the downstream patch in NetBSD can be eliminated.
