Getting Data In

Does Splunk support forwarding log data from an IBM GPFS mount point?

sseekamp
Explorer

We are running a small GPFS cluster on AIX. I am seeing high CPU usage running a universal forwarder pointed at log files on the GPFS mount point.

1 Solution

dwaddle
SplunkTrust

Is Splunk itself running on the GPFS, or is it on a JFS/JFS2 and simply reading files from GPFS? Splunk has specific requirements for filesystem types related to its own data (index) storage - but I don't think there is a specific support policy about what filesystems Splunk can monitor.

I can see how Splunk's file monitoring could have a negative impact on GPFS. The monitor issues a high volume of stat(2) system calls, and (by default) it recurses through the directory structure. Depending on how big your GPFS is, and how high in the tree you have Splunk configured to monitor, the number of stat(2) calls could be substantial. And, of course, stat(2) is a filesystem metadata operation, which on GPFS could require additional processing such as communicating with the other GPFS servers to get updated metadata.

This developerWorks document mentions various GPFS tuning options. At first glance, more than one of them could affect Splunk's interaction with GPFS.

My advice would be to work out exactly how much of the GPFS you're asking Splunk to monitor, and to ask IBM GPFS support for tuning advice. Their defaults may not be appropriate for software like Splunk.
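
To get a rough feel for how much of the tree a monitor input would have to touch, something like the following Python sketch could be run against the monitored path. It is not part of Splunk; the function name `count_tree_entries` is made up for illustration, and the entry count is only a proxy for the metadata lookups a recursive monitor might generate on each pass.

```python
import os


def count_tree_entries(root):
    """Return (directories, files) found under root by a full recursive walk.

    Each entry is something a recursive file monitor would have to
    enumerate, so the totals give a rough lower bound on the metadata
    work per scan. This is an estimate, not a measurement of Splunk.
    """
    dir_count = 0
    file_count = 0
    # os.walk descends into every subdirectory it can read.
    for _dirpath, dirnames, filenames in os.walk(root):
        dir_count += len(dirnames)
        file_count += len(filenames)
    return dir_count, file_count
```

Calling `count_tree_entries("/gpfs/logs")` (a hypothetical mount point) and comparing the result with the number of files you actually want indexed shows how much of the walk is wasted effort. Bear in mind that the walk itself generates metadata traffic on GPFS, so it is best run once, off-peak.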

dwaddle
SplunkTrust

And as always, if the answer is useful please upvote/accept - thanks!

halr9000
Motivator

I just manually accepted this old answer for ya @dwaddle

dwaddle
SplunkTrust

Also, make sure you are not recursing more deeply than necessary. From what I understand, even if you blacklist a directory, Splunk 4.2 will still recursively readdir() and stat() down through it. It will exclude the files, but not without at least enumerating them first. Depending on how you have your monitor:// stanzas defined, they could be doing much more I/O than you expected.
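
The effect described here can be illustrated with a small Python sketch. It is not Splunk's code, only a model of "enumerate everything, then filter", and the function name `enumerate_with_blacklist` is hypothetical. It assumes the blacklist is a regular expression matched against the full file path.

```python
import os
import re


def enumerate_with_blacklist(root, blacklist_pattern):
    """Walk all of root, then drop files whose path matches the blacklist.

    Returns (entries_visited, kept_paths). The visited count does not
    depend on the blacklist, because filtering happens only after each
    entry has already been enumerated.
    """
    blacklist = re.compile(blacklist_pattern)
    entries_visited = 0
    kept_paths = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Every directory and file is counted, blacklisted or not.
        entries_visited += len(dirnames) + len(filenames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not blacklist.search(path):
                kept_paths.append(path)
    return entries_visited, kept_paths
```

If the visited count is far larger than the number of kept paths, pointing the monitor:// stanza at a narrower directory is likely to cut more I/O than adding further blacklist entries.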

sseekamp
Explorer

Thanks dwaddle - we are running the forwarder off of JFS2 on AIX and just watching GPFS mount points. That's good to know about the stat calls. I will look into tuning that area. Good info!
