Nagios Linux Performance Graphs

jbaileyicw
New Member

I'm not getting anything for Memory Usage. What should the plugin output look like? I think that's my problem.

mkeys
New Member

Luke,

You 'da man! Both pnp4nagios and your app are working in conjunction now. The linux performance graphs I spoke of earlier are still empty but I'm sure it's something simple I'm missing.

On a related note, I noticed a new bug while re-checking everything. When I click "Livestatus Dashboard", the top line (host up-down-unreachable) populates correctly but the next line (services ok-warning-critical-unknown) shows 0s for everything. If I click the Livestatus Dashboard tab again to reload the dashboard, it then populates the total service numbers (1092) correctly. I'm not sure if it's just a delay in mk or what.

lukeh
Contributor

Hi Matt 🙂

The nagios.log file contains alerts and notifications etc but the performance data is logged to a separate file, either:

"service-perfdata" to be ingested into splunk with a sourcetype of "nagiosserviceperf"
 or
"splunk-nagios-perfdata" to be ingested into splunk with a sourcetype of "nagiosperfdata"

Use the latter if using pnp4nagios 🙂

You'll need to update the pnp4nagios script to output the performance data to an additional log file and ingest the new file into splunk with a new sourcetype. This way the performance data log format does not need to change and your rrd graphs will continue to work.

1/ Update the 'Bulk Mode' section within "process_perfdata.pl" as follows:

        print_log( "reading $pdfile for bulk update", 2 );
        open (SPLUNK, '>>/opt/nagios/var/splunk-nagios-perfdata');
        open( PDFILE, "< $pdfile" );
        my $count = 0;
        while (<PDFILE>) {
            $count++;
            print_log( "Processing Line $count", 2 );
            my @LINE = split(/\t/);
            %ENV = ();    # cleaning ENV
            foreach my $k (@LINE) {
                $k =~ /([A-Z 0-9_]+)::(.*)$/;
                $ENV{ 'NAGIOS_' . $1 } = $2 if ($2);
            }
            print SPLUNK "$_\n";
            if ( $ENV{NAGIOS_SERVICEPERFDATA} || $ENV{NAGIOS_HOSTPERFDATA} ) {
                parse_env();
                process_perfdata();
            }
            else {
                print_log( "No Perfdata. Skipping line $count", 2 );
            }
        }

        print_log( "$count Lines processed", 1 );

        if ( unlink("$pdfile") == 1 ) {
            print_log( "$pdfile deleted", 1 );
        }
        else {
            print_log( "Could not delete $pdfile:$!", 1 );
        }

    }
    else {
        print_log( "ERROR: File $opt_b not found", 1 );
    }
close (SPLUNK);
}

Note: only the following three new lines should be added to your existing script:

open (SPLUNK, '>>/opt/nagios/var/splunk-nagios-perfdata');
print SPLUNK "$_\n";
close (SPLUNK);

Replace /opt/nagios with the relevant path for your installation 🙂
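
For readers who don't read Perl, here is a rough Python sketch of what the modified bulk loop does: every line of the bulk file is appended unchanged to a second file for Splunk, and is also split into KEY::VALUE pairs. The function names, the temp-file paths and the sample line are made up for illustration and are not part of pnp4nagios. One difference: inside `while (<PDFILE>)` the line in `$_` appears to keep its trailing newline, so `print SPLUNK "$_\n"` would leave a blank line after each event; the sketch writes exactly one newline instead.

```python
import os
import tempfile


def parse_bulk_line(line):
    """Split one tab-separated bulk line into a dict of NAGIOS_<KEY> -> value."""
    fields = {}
    for item in line.rstrip("\n").split("\t"):
        key, sep, value = item.partition("::")
        if sep and value:
            fields["NAGIOS_" + key] = value
    return fields


def tee_perfdata(src_path, copy_path):
    """Append every line of src_path to copy_path and return the parsed lines."""
    parsed = []
    with open(src_path) as src, open(copy_path, "a") as copy:
        for line in src:
            copy.write(line.rstrip("\n") + "\n")  # exactly one newline per event
            parsed.append(parse_bulk_line(line))
    return parsed


# Hypothetical sample line, invented for this sketch (not taken from a real server).
SAMPLE = (
    "DATATYPE::SERVICEPERFDATA\tTIMET::1323234000\tHOSTNAME::somehost\t"
    "SERVICEDESC::Total Processes\tSERVICEPERFDATA::procs=167;;;;\n"
)

workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "perfdata.bulk")            # stand-in for $pdfile
copy_file = os.path.join(workdir, "splunk-nagios-perfdata")  # stand-in for the Splunk copy
with open(src_file, "w") as handle:
    handle.write(SAMPLE)

rows = tee_perfdata(src_file, copy_file)
with open(copy_file) as handle:
    copied = handle.read()
```

The copy is opened in append mode, like the `>>` in the Perl version, so repeated runs keep adding to the same file.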

2/ Update "$SPLUNK_HOME/etc/apps/SplunkForNagios/default/props.conf" with the following new sourcetype:

[nagiosperfdata]
EXTRACT-datatype = DATATYPE::(?P<datatype>[^\t]*)
EXTRACT-src_host = HOSTNAME::(?P<src_host>[^\t]*)
EXTRACT-name = SERVICEDESC::(?P<name>[^\t]*)
EXTRACT-result = SERVICEPERFDATA::(?P<result>[^\t]*)
EXTRACT-process = SERVICECHECKCOMMAND::(?P<process>[^\t]*)
EXTRACT-hoststate = HOSTSTATE::(?P<hoststate>[^\t]*)
EXTRACT-hoststatetype = HOSTSTATETYPE::(?P<hoststatetype>[^\t]*)
EXTRACT-state = SERVICESTATE::(?P<state>[^\t]*)
EXTRACT-statetype = SERVICESTATETYPE::(?P<statetype>\w+)
SHOULD_LINEMERGE = false
TIME_PREFIX = TIMET::
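
To sanity-check these extractions before restarting Splunk, the same patterns can be run against a sample line in Python. This is only an approximation: Python's `re` module stands in for Splunk's regex engine, and the sample event below is invented, so substitute a real line from your own "splunk-nagios-perfdata" file.

```python
import re

# Field name -> regex, copied from the [nagiosperfdata] stanza above.
EXTRACTIONS = {
    "datatype": r"DATATYPE::(?P<datatype>[^\t]*)",
    "src_host": r"HOSTNAME::(?P<src_host>[^\t]*)",
    "name": r"SERVICEDESC::(?P<name>[^\t]*)",
    "result": r"SERVICEPERFDATA::(?P<result>[^\t]*)",
    "process": r"SERVICECHECKCOMMAND::(?P<process>[^\t]*)",
    "hoststate": r"HOSTSTATE::(?P<hoststate>[^\t]*)",
    "hoststatetype": r"HOSTSTATETYPE::(?P<hoststatetype>[^\t]*)",
    "state": r"SERVICESTATE::(?P<state>[^\t]*)",
    "statetype": r"SERVICESTATETYPE::(?P<statetype>\w+)",
}


def extract_fields(event):
    """Apply each pattern to the event and collect the named groups that match."""
    fields = {}
    for name, pattern in EXTRACTIONS.items():
        match = re.search(pattern, event)
        if match:
            fields[name] = match.group(name)
    return fields


# Hypothetical event, invented for this sketch.
SAMPLE_EVENT = (
    "DATATYPE::SERVICEPERFDATA\tTIMET::1323234000\tHOSTNAME::somehost\t"
    "SERVICEDESC::Total Processes\tSERVICEPERFDATA::procs=167;;;;\t"
    "SERVICECHECKCOMMAND::check_procs\tHOSTSTATE::UP\tHOSTSTATETYPE::HARD\t"
    "SERVICESTATE::OK\tSERVICESTATETYPE::HARD"
)

fields = extract_fields(SAMPLE_EVENT)
for key in sorted(fields):
    print(key, "=", fields[key])
```

If a field is missing from the printed list, the matching KEY:: token is probably absent from your perfdata template.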

3/ Add the new file "splunk-nagios-perfdata" to be ingested into splunk with a sourcetype of "nagiosperfdata"

4/ Update the dashboards in "$SPLUNK_HOME/etc/apps/SplunkForNagios/default/data/ui/views" and change any occurrence of sourcetype="nagiosserviceperf" to sourcetype="nagiosperfdata"
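
If there are many dashboards, step 4 can be scripted. The sketch below is one possible way, not part of the app: it assumes the sourcetype appears in the XML exactly as sourcetype="nagiosserviceperf" (quoted, not XML-escaped), and the function name and demo file are made up. Back up the views directory first.

```python
import glob
import os
import tempfile

OLD = 'sourcetype="nagiosserviceperf"'
NEW = 'sourcetype="nagiosperfdata"'


def retarget_dashboards(views_dir):
    """Rewrite OLD to NEW in every *.xml file in views_dir; return the changed paths."""
    changed = []
    for path in sorted(glob.glob(os.path.join(views_dir, "*.xml"))):
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        if OLD in text:
            with open(path, "w", encoding="utf-8") as handle:
                handle.write(text.replace(OLD, NEW))
            changed.append(path)
    return changed


# Demo on a throwaway directory with a made-up dashboard fragment.
demo_dir = tempfile.mkdtemp()
demo_file = os.path.join(demo_dir, "Example.xml")
with open(demo_file, "w", encoding="utf-8") as handle:
    handle.write('<param name="search">sourcetype="nagiosserviceperf" | timechart count</param>\n')

changed_files = retarget_dashboards(demo_dir)
with open(demo_file, encoding="utf-8") as handle:
    updated = handle.read()
```

On a real installation you would pass the views directory from step 4 instead of the demo directory.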

All the best,

Luke 🙂

P.S. The 'CURRENT SERVICE STATE' events are written to nagios.log only once per day, at midnight, so they cannot be used for creating performance graphs. That is why the performance data has to be ingested from its own log file.
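
The point can be checked against the nagios.log line quoted in this thread. The short Python sketch below (function name made up for illustration) splits that line apart: the epoch timestamp decodes to 05:00 UTC, which is midnight on a server running at UTC-5 (the thread does not state the server's timezone), and the plugin output contains no "|" section, ie. no performance data.

```python
from datetime import datetime, timezone

# Log line as quoted in this thread.
LOG_LINE = (
    "[1323234000] CURRENT SERVICE STATE: somehost;Zombie Processes;"
    "OK;HARD;1;PROCS OK: 0 processes with STATE = Z"
)


def parse_state_line(line):
    """Split a 'CURRENT SERVICE STATE' line from nagios.log into its parts."""
    stamp, rest = line.split("] ", 1)
    kind, payload = rest.split(": ", 1)
    host, service, state, state_type, attempt, output = payload.split(";", 5)
    return {
        "time": datetime.fromtimestamp(int(stamp.lstrip("[")), tz=timezone.utc),
        "kind": kind,
        "host": host,
        "service": service,
        "state": state,
        "state_type": state_type,
        "attempt": int(attempt),
        "output": output,
        # Nagios plugins separate perfdata from the text output with "|".
        "has_perfdata": "|" in output,
    }


parsed = parse_state_line(LOG_LINE)
print(parsed["time"].isoformat(), parsed["service"], parsed["has_perfdata"])
```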

mkeys
New Member

Luke,

Thanks for the pointers, at least I'm looking in the right spot now lol. I'm using the official plugins for the most part. Let's use Zombie Processes for an example. In the NagiosLinuxPerformanceGraphs.xml I've got:

        <module name="HiddenPostProcess" layoutPanel="panel_row4_col2" group="Zombie Processes" autoRun="False">
          <param name="search">timechart span=5m last(Zombies) as Zombies</param>
          <param name="groupLabel">Zombie Processes</param>

The check is outputting the following (snip from /usr/local/nagios/var/nagios.log) :

[1323234000] CURRENT SERVICE STATE: somehost;Zombie Processes;OK;HARD;1;PROCS OK: 0 processes with STATE = Z

From your reply I'm thinking I need to change last(Zombies) but I don't know what it should be. Since there's usually 0 of them, this may be a bad example. 🙂 Moving on to total processes the xml has:

        <module name="HiddenPostProcess" layoutPanel="panel_row4_col1" group="Total Processes" autoRun="False">
          <param name="search">timechart span=5m avg(Processes) as Processes</param>
          <param name="groupLabel">Total Processes</param>

The check outputs :

[1323234000] CURRENT SERVICE STATE: pvirtuadb;Total Processes;OK;HARD;1;PROCS OK: 167 processes

Thanks again,
Matt

lukeh
Contributor

Hi Matt 🙂

I have updated the doco because it was too vague and misleading. Here are the new instructions:
Using your favourite xml editor, update these dashboards with the relevant label/key names for the specific performance data that are in use in your nagios environment. eg. change avg(CpuSystem) to avg(cpu_system) if your performance data for CPU Usage is labelled differently.
Dashboard location: $SPLUNK_HOME/etc/apps/SplunkForNagios/default/data/ui/views

ie. you don't have to change the service_description for any of your nagios checks, you just need to update the dashboards in Splunk for Nagios with your specific performance data label/key names.
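
One way to spot mismatched names is to compare the fields a dashboard search aggregates with the labels your plugin actually emits. The Python sketch below is a rough helper under stated assumptions: the search string is hypothetical, the perfdata string is the check_mem.pl example quoted in this thread, the function names are made up, and quoted labels containing spaces are not handled. Field names in Splunk are case-sensitive, so "Used" and "USED" count as different.

```python
import re


def fields_in_search(search):
    """Return field names used inside avg(), last(), max(), min() or sum()."""
    return re.findall(r"\b(?:avg|last|max|min|sum)\((\w+)\)", search)


def labels_in_perfdata(perfdata):
    """Return the label of every label=value item in a perfdata string."""
    return [item.split("=", 1)[0].strip("'") for item in perfdata.split() if "=" in item]


def missing_fields(search, perfdata):
    """List fields the search expects that the perfdata does not provide."""
    labels = set(labels_in_perfdata(perfdata))
    return [name for name in fields_in_search(search) if name not in labels]


# Hypothetical dashboard search, invented for this sketch.
SEARCH = "timechart span=5m avg(FREE) as Free, avg(Used) as Used"
# check_mem.pl output as quoted in this thread.
PERFDATA = "TOTAL=12167908KB;;;; USED=2661344KB;;;; FREE=9506564KB;;;; CACHES=9207732KB;;;;"

missing = missing_fields(SEARCH, PERFDATA)
print("fields to rename in the dashboard:", missing)
```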

Feel free to provide examples of your nagios performance data if you would like further assistance and I can recommend the relevant updates to make to your dashboards.

All the best,

Luke 🙂

mkeys
New Member

Luke,

I'm having the same issue with CPU Usage, Network Utilization, Zombie Processes, Total Processes, Swap Usage, and Memory Free/total in the "Nagios Linux Performance Graphs" (I haven't started the Windows side yet). You suggest above to "rename the "name" value in the dashboard to the relevant name of your nagios plugin". Would this be in $SPLUNK_HOME/etc/apps/SplunkForNagios/default/props.conf? My props.conf for CPU and Zombie processes for example has:

[nagiosserviceperf]

EXTRACT-Processes = PROCS \w+: (?P<Processes>\d+) \w+\"

EXTRACT-Zombies = PROCS \w+: (?P<Zombies>\d+) \w+ with STATE = Z

But our Nagios service_description for these are "Total Processes" and "Zombie Processes". I read in a separate answer that you mentioned future releases will be CIM compliant. Are there existing standards for these names? If so and I change the service_description on the Nagios side to these CIM compliant names, will it break the historical data from that point on?

Thanks,
Matt

jbaileyicw
New Member

Hi Luke,

I am using the Nagios Linux Performance Graphs dashboard, and will check the plugin for its perfdata output and see if it matches yours. Thanks.

lukeh
Contributor

Please ensure that you rename the "name" value in the dashboard to the relevant name of your nagios plugin, ie. it should be the same as your service_description in nagios.

Example plugin output from "check_mem.pl" is as follows:

TOTAL=12167908KB;;;; USED=2661344KB;;;; FREE=9506564KB;;;; CACHES=9207732KB;;;;

Note: the Splunk for Nagios compatible Memory plugin is available here:

check_mem.pl: http://exchange.nagios.org/directory/Plugins/System-Metrics/Memory/check_mem-2Epl/details
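
To compare your own plugin's output with the example above, a short Python sketch like this one splits a perfdata string into label, value and unit. It is a simplified parser, not the one Splunk or Nagios uses: it ignores the warn/crit/min/max slots after the value and does not handle quoted labels, and the names are made up for illustration.

```python
import re

# check_mem.pl output as quoted above.
PERFDATA = "TOTAL=12167908KB;;;; USED=2661344KB;;;; FREE=9506564KB;;;; CACHES=9207732KB;;;;"

# label=value[unit]; the thresholds after the first ";" are ignored.
ITEM = re.compile(r"(?P<label>[^\s=]+)=(?P<value>-?\d+(?:\.\d+)?)(?P<unit>[A-Za-z%]*)")


def parse_perfdata(text):
    """Return {label: (value, unit)} for every item in a perfdata string."""
    metrics = {}
    for match in ITEM.finditer(text):
        metrics[match.group("label")] = (float(match.group("value")), match.group("unit"))
    return metrics


metrics = parse_perfdata(PERFDATA)
for label, (value, unit) in metrics.items():
    print(label, value, unit)
```

The printed labels (TOTAL, USED, FREE, CACHES) are the names to look for when your Memory Usage panel stays empty.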

All the best,

Luke 🙂
