python script as unarchive_cmd in props.conf?

robinson
Path Finder

Hi, I've been combing through the documentation and the answers site, but so far I haven't been able to find a solution to the problem in the title. I'd appreciate any advice!

I'm working with archived data from remote systems that includes the output of the unix/linux style "iptables -L" command. We want to search the info by ACCEPTs, src addresses, etc.

Individual lines in the data don't have date/time info or "chain" names, so I wrote a python script that reads stdin and outputs lines with a date/time and a series of name=value pairs. I hoped to get this working from props.conf with a stanza that looks roughly thus:

[source::.../iptables-log*]
sourcetype = iptables-trafficlog
[iptables-trafficlog]
invalid_cause = archive
unarchive_cmd = python interpret-iptables-eventlog.py

That didn't seem to work much 😞 My hypothesis right now is that input processing isn't finding either the python interpreter or my script. My questions are: (1) Is what I'm attempting supposed to work? (2) Where do I deploy my script, and how do I specify its invocation within props.conf? (3) Is there a much simpler or more obvious solution that I've overlooked?
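To make the idea concrete, a stdin-to-stdout filter of the kind described above might look like the following sketch. It is not the actual interpret-iptables-eventlog.py: the function names, the output field names, and the use of the conversion time as the event timestamp are all assumptions made for illustration.

```python
import sys
import time


def convert(lines, timestamp):
    """Yield one 'timestamp name=value ...' line per iptables -L rule line."""
    chain = "unknown"
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # blank separator between chains
        if fields[0] == "Chain" and len(fields) > 1:
            chain = fields[1]  # e.g. "Chain INPUT (policy ACCEPT)"
            continue
        if fields[0] == "target" or len(fields) < 5:
            continue  # column header, or a line this sketch cannot parse
        # Limitation: a rule with an empty target column shifts the fields.
        target, prot, opt, src, dst = fields[:5]
        event = "%s chain=%s target=%s prot=%s opt=%s src=%s dst=%s" % (
            timestamp, chain, target, prot, opt, src, dst)
        extra = " ".join(fields[5:])
        if extra:
            event += ' extra="%s"' % extra
        yield event


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Assumption: the listing carries no time of its own, so every event
    # is stamped with the time of conversion.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    for event in convert(stdin, stamp):
        stdout.write(event + "\n")
```

Saved as a script, it would end with a call to main() under the usual `if __name__ == "__main__":` guard so that it can sit in a pipeline.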

thanks so much for your time and attention! --A Newbie

tibevilaqua
Engager

You also need to declare some settings in the inputs.conf file.
Example: https://answers.splunk.com/answers/143771/whats-the-trick-to-get-unarchive-cmd-to-work-for-a-custom-...

jjensenyahoo
Explorer

I think you may need to use priority to override the default unarchiving processors. See my answer in this thread; it may be helpful.

jjensenyahoo
Explorer

I struggled with this for far too long as well.
I have a custom access log format that is gzipped. It needs to be gunzipped and then piped through a custom converter to get to NCSA format (access_combined). No matter what I did, my logs seemed to get unarchived but were never passed through my converter (even though Splunk seemed to be honoring my source:: spec).

I had to do this in local/props.conf:

[source::/path/to/my/special/logs/.../*]
unarchive_cmd = gunzip | my_custom_converter
unarchive_sourcetype = access_combined
NO_BINARY_CHECK = true
priority = 10

The key here seems to be that I had to use the priority keyword. I believe this was necessary to override the default gzip unarchiver, which seemed to take precedence over whatever custom sourcetype I defined.
I'm sure there is a way to see how this is getting parsed and processed, but it's not really obvious. Full disclosure: I am a complete Splunk newbie.

@lowell:
I believe they did have .gz extensions.

Lowell
Super Champion

jjensenyahoo, I assume your log files ended with .gz, is that correct?

gkanapathy
Splunk Employee

I think you may need to specify the command under the [source::] stanza rather than the sourcetype stanza.

John_Mark
Splunk Employee

I don't see much in the way of debugging output here. How do you know it isn't working? What warnings/errors are you seeing? Did you enable the script in your management console? Did you put it in etc/apps/search/bin?

robinson
Path Finder

I didn't enable the script in the management console. Where would that be done?

I fully specified the path name. Does it need to be in etc/apps/search/bin to be invoked?

robinson
Path Finder

Thanks for your attention John.

I know it isn't working because (1) the data don't get indexed and (2) I added a statement to my script that writes a diagnostic line to a file at a fully-qualified path whenever the script is invoked, and that line never appears.
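A diagnostic of that kind can be very small. The sketch below is illustrative only: the function name and the log path passed to it are hypothetical, not the ones from the actual script.

```python
import os
import sys
import time


def log_invocation(path):
    """Append one line recording that the script was started."""
    with open(path, "a") as fh:
        fh.write("%s pid=%d argv=%r cwd=%s\n" % (
            time.strftime("%Y-%m-%d %H:%M:%S"),
            os.getpid(),
            sys.argv,
            os.getcwd(),
        ))
```

Calling it first thing, before any input is read, with an absolute path that the Splunk user can write to, separates "the script never ran" from "the script ran but produced nothing". Recording the working directory also shows where a relative script path would have been resolved from.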

I'm not getting warnings or errors. Indexing works as expected, but the data in sources identified as iptables-log* in props.conf are ignored, which suggests the "invalid_cause" spec is working. Formerly the data in question had been indexed into one large multi-line event (unusable).

I didn't enable the script in the management console. Where would that be done?

robinson
Path Finder

Hello everybody! Actually, I'm not clear whether anyone but me has looked at this question. Is anybody out there?

Intuition suggests that running a little bespoke preprocessing on data at input time is something managers of real system deployments would very commonly want. So it's hard for me to believe that there isn't some sort of "standard" answer to the problem I've posed. But about two weeks after posting this I've seen no response at all. Is my situation so unusual?

thanks!

robinson
Path Finder

So sorry about the terrible formatting of the code in that comment :-( It's just two lines naming the source (file name) and assigning the sourcetype.

When that's in my props.conf, I am able to search for the relevant sourcetype, but the result that comes back is one very big event containing the entire (hundred-line plus) listing from iptables -L. Not what I was hoping to get. Any suggestions on an approach are welcome!

robinson
Path Finder

Since posting this query I've had the chance to try a number of variations on the setup: "wrapping" the python command in an executable script, prepending full path specs to ensure the files can be found, and so on. No joy: it appears that the specified "unarchive_cmd" is never invoked, which suggests I may be taking the wrong approach. I "know" that my monitored data files contain the iptables-log* contents (within some layers of zip/Z/tgz and so on) because this configuration:

[source::.../iptables-log*]
sourcetype = iptables-trafficlog

results in many records
