All Apps and Add-ons

Splunk TA for Solaris 11: How to get the Solaris ldoms.sh script to send data to indexer?

hartfoml
Motivator

I installed the Splunk TA for Solaris 11 on my UF (Universal Forwarder) and left the default collection settings in inputs.conf.

The stanza:

[script://./bin/ldoms.sh]
disabled=0
index = ia
interval=600
source=solaris:ldoms
sourcetype=solaris:ldoms

is the default, but no data is being collected. When I run ldoms.sh as root, it outputs the expected results. I do not see any errors associated with the script in splunkd.log.

Any help in troubleshooting this issue would be great
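One general way to troubleshoot a scripted input like this is to look for ExecProcessor errors in splunkd.log. The sketch below writes a fake log fragment so the grep has something to match; the log lines and the "Authorization failed" message are illustrative only, not taken from this TA.

```shell
# Illustrative only: write a fake splunkd.log fragment so the grep below
# has something to match. On a real forwarder you would grep
# $SPLUNK_HOME/var/log/splunk/splunkd.log instead.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
01-05-2017 10:00:00.000 +0000 INFO  ExecProcessor - New scheduled exec process: ./bin/ldoms.sh
01-05-2017 10:00:01.000 +0000 ERROR ExecProcessor - message from "./bin/ldoms.sh" ldm: Authorization failed
EOF

# Scripted-input failures normally surface as ERROR ExecProcessor lines:
grep 'ERROR ExecProcessor' "$LOG"
rm -f "$LOG"
```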


cmeo
Contributor

The only other time I've seen something like this is where you have a combination of INDEXED_EXTRACTIONS (which I use) and routing as documented here:
http://docs.splunk.com/Documentation/Splunk/6.5.1/Forwarding/Routeandfilterdatad

There is unfortunately an open bug, SPL-98594, about this, which Support tell me is very difficult to fix and has low priority. It's been open for three years now, but I didn't know about it until after I'd written the TA.

If you are using INDEXED_EXTRACTIONS in conjunction with routing, unexpected things may happen. I cannot suggest a workaround if this is the issue; we had to turn off routing, which may or may not be an option for you. The symptoms were much the same: some, but not all, sourcetypes being silently dropped on the floor.
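For reference, the kind of forwarder-side routing that can interact badly with INDEXED_EXTRACTIONS looks roughly like this. The stanza and group names here are illustrative, not from the TA:

```
# props.conf -- illustrative only
[solaris:ldoms]
TRANSFORMS-routing = route_solaris

# transforms.conf -- illustrative only
[route_solaris]
REGEX = .
DEST_KEY = _TCP_ROUTING
FORMAT = indexer_group_a

# outputs.conf -- illustrative only
[tcpout:indexer_group_a]
server = receiver1:9997
```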

If this isn't what's happening, please discuss with me further by email as suggested above.


cmeo
Contributor

This is the author. Have you set up forwarding and receiving? Test by

splunk list forward-server

on the forwarder.

It should show an active forward to your-receiver-host:9997, or whatever port you're using. If the script is producing the expected results, forwarding is working, and you're still not seeing anything, try running the search

index=solaris host=your-ldom-host

and see if the host you're expecting to see is there. The sourcetype would be solaris:ldoms.

Note that the input runs every 10 minutes by default, so you can shorten the interval while debugging.

It just may be that your settings do not show or search this index by default. Please let me know if you're still having issues with it after this and I'll investigate further.
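While debugging, a temporary override like the following avoids waiting ten minutes between attempts. This is a sketch of a local/inputs.conf override with a shortened interval; revert it when you're done:

```
# local/inputs.conf -- temporary debugging values (revert afterwards)
[script://./bin/ldoms.sh]
disabled = 0
interval = 60
index = solaris
sourcetype = solaris:ldoms
```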


cmeo
Contributor

Let's take this offline until I've worked out what's going on. Please use the email in my profile.
I will post the solution/new version when sorted.


hartfoml
Motivator

Hi @cmeo Thank you so much for creating this TA and the App too. Since I am using an index for chargeback purposes, I had to change the index to the one used by the group consuming the data (index=ia). I did that by modifying inputs.conf.

I am seeing most of the sourcetypes and have been refining the dashboards. The sourcetypes that are working are:
1 solaris:cpu
2 solaris:iostat
3 solaris:df
4 solaris:ps
5 solaris:version
6 solaris:Uptime
7 solaris:protocol
8 solaris:hardware

I have run the scripts for the other sourcetypes at the command line, and they output the data to the shell in the proper manner, yet they don't seem to execute on the inputs.conf interval like the scripts above do. Or, for some reason, the data is not being sent by the UF or received by the HF properly.

How can I tell whether the scripts are firing but the data is not being sent, or whether the scripts are not firing at all?

Thanks so much for your help with this.
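One way to answer exactly that question from the search head (a sketch, assuming the forwarder's _internal logs are forwarded, which is the default, and writing <your-forwarder> as a placeholder) is to search the forwarder's own ExecProcessor events:

```
index=_internal sourcetype=splunkd component=ExecProcessor host=<your-forwarder> ldoms.sh
```

If the script name appears here on schedule but your data index has no events, the script is firing and the problem is downstream; if nothing appears, the script is not being launched at all.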


cmeo
Contributor

So if I understand correctly, you've changed the index used from solaris to ia. The searches used in the dashboards deliberately do not specify an index (so that you can use any index), but you do need to make sure every index you use is accessible and searchable by default for the role you are using to access the events. So check permissions on all the objects.

Also check whether you missed some inputs and there is an index called solaris with some contents; your missing events might be there. Any errors resulting from script execution should show up in splunkd.log: search for ERROR ExecProcessor. I've kept the scripts pretty simple, and there isn't a whole lot that can go wrong with them, assuming SPLUNK_USER can execute the underlying commands in the first place. And that can itself be a problem: for reasons best known to themselves, the people who wrote many of the OS commands I've used only let root access the information, even just to display it. So unless splunkd is running as root, or you do a whole lot of messing around with ACLs, it won't work properly. LDOMs is one example: only root can run ldm at all. I am well aware that this is not an ideal situation, but I would suggest that if anyone can mess around with the scripts to do bad things, you already have a much larger problem than anything the TA might be able to cause.

If you are making enhancements to any of it, please let me know and I'll include them if you want. I wrote this App and TA because (a) I think the official *nix app has lost its way, and (b) in trying to be cross-platform, the official stuff doesn't use any of the cool things that are Solaris-only and which tell you about virtual networks, zones/LDOMs, zpool, zfs, etc. For that matter, it's all out of date with respect to modern Linux distros as well; for all the annoying but shiny new UI in the *nix app, it is still collecting data with methods ten years old.
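As a quick sanity check on the execute-permission point above, a small helper can report any TA scripts the current user cannot execute. This is a sketch; the function name and the forwarder path in the comment are assumptions, not part of the TA:

```shell
# check_exec DIR: report any *.sh under DIR that the current user cannot
# execute. Run it as the user splunkd runs as (SPLUNK_USER).
check_exec() {
    for f in "$1"/*.sh; do
        [ -e "$f" ] || continue          # glob matched nothing
        [ -x "$f" ] || echo "not executable: $f"
    done
}

# On a forwarder you might run (path is an assumption for a default
# UF install; adjust for your deployment):
#   check_exec /opt/splunkforwarder/etc/apps/<your-TA-dir>/bin
```

Note this only checks file permissions; it cannot detect the root-only OS commands (such as ldm) discussed above.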


hartfoml
Motivator

I checked and splunkd is running as root. I have admin rights on the search head and ran a search like index=_* OR index=* sourcetype=solaris*, which should get everything, but it only returns the sourcetypes I mentioned. Thanks for creating the app; I will look into the permissions some more.


cmeo
Contributor

I would also add that I have several other reasons for posting a root-only TA:
1. Modern data centres use automation (jumpstart, ansible, puppet, whatever) for everything so the question of user access to a solaris box should not come up.
2. I cannot even begin to guess at the policy framework in use for sites that are using the ACL mechanism. It may be that the list of required permissions might still fall foul of your policies. Not a discussion I want to engage in.
3. Matching the commands used to the required ACLs and creating the procedures to apply and test them is in itself not a lump of work I can justify the time for at the moment.
4. Running things setuid was out of the question.
