Splunk Dev

Can I parse my log with a python script before indexing?

drebai
Explorer

Hi!
I'm reading the scripted input documentation, but I don't understand whether scripted inputs can help with what I'd like to do.
I would like to store several different types of logs in the same format.
Is it possible to use a Python script to receive logs and parse them?
The logs are complex, and to get a single dashboard I currently have to extract all the fields for each format and then use a custom search command (which I have already written with intersplunk) to create new fields.
I would prefer to do an initial parsing step that extracts the same fields from all sources, creates the new fields, and saves the result.

Example: (it's just a simplified example of my situation)
format1: ###EXPECTED### {"field1":"value1"} ###ACTUAL### {"field1":"value2","field2":"value1"}
format2: timestamp \n exp_field: {"field1":"value1"}\n act_field {"field1":"value2","field2":"value1"}

In my dashboard I would like a count of the fields that differ between the two JSON objects.
At the moment I need to extract the fields with two different regular expressions and then use a custom command that finds the fields that differ between the two JSON objects.
I would like to do all of this before indexing. Is that possible?

Thanks,
Deb

1 Solution

richgalloway
SplunkTrust

Yes, it's possible. I've written a number of Python scripts that read events from a source and transform them before handing them to Splunk for indexing. In a scripted input, your script does the work of reading the source data; there is nothing to "receive". The script opens the file, makes a REST request, or does something else to get its input, then transforms the data and writes the results to stdout. Whatever goes to stdout is what Splunk will index.
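
A minimal sketch of that read, transform, and print pattern, assuming the source is a plain text file. The path LOG_PATH and the helper names transform and run are hypothetical, and the whitespace cleanup only stands in for whatever transformation your data needs.

```python
import os
import sys

# Hypothetical location of the raw log; a real script might call a REST API instead.
LOG_PATH = "/var/log/myapp/app.log"


def transform(raw_line):
    """Placeholder transformation: collapse runs of whitespace in one raw line."""
    return " ".join(raw_line.split())


def run(source, out=sys.stdout):
    """Read lines from `source`, transform them, and write one line per event."""
    for raw_line in source:
        if raw_line.strip():  # skip blank lines
            out.write(transform(raw_line) + "\n")
    out.flush()


if __name__ == "__main__":
    # Nothing is emitted if the (hypothetical) file does not exist.
    if os.path.exists(LOG_PATH):
        with open(LOG_PATH, encoding="utf-8") as handle:
            run(handle)
```

Only what the script writes to stdout is indexed. How Splunk splits that output into events still depends on the line-breaking settings of the sourcetype assigned to the input.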

---
If this reply helps you, Karma would be appreciated.

drebai
Explorer

Thank you!
Are there any guides or examples?
I can only find material about excluding fields directly in inputs.conf.

Yunagi
Communicator

Here is an example:
https://docs.splunk.com/Documentation/SplunkCloud/6.6.3/AdvancedDev/ScriptExample
As @richgalloway said, whatever goes to stdout (via "print") is what Splunk will index. So add a few lines in your Python script to format the output as needed.
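
For the two simplified formats in the question, the formatting step could look roughly like the sketch below. It is only an illustration under assumptions: the regular expressions are guessed from the simplified examples, the function names and output field names (source_format, diff_fields, diff_count) are made up, the timestamp line of format2 is ignored, and a field counts as "different" when it is missing on one side or has a different value.

```python
import json
import re

# Patterns guessed from the simplified examples in the question (assumptions).
FORMAT1 = re.compile(r"###EXPECTED###\s*(\{.*?\})\s*###ACTUAL###\s*(\{.*\})", re.S)
FORMAT2 = re.compile(r"exp_field:?\s*(\{.*?\})\s*act_field:?\s*(\{.*\})", re.S)

_MISSING = object()  # sentinel so a missing key never equals a real value


def differing_fields(expected, actual):
    """Return the sorted keys that are missing on one side or hold different values."""
    keys = set(expected) | set(actual)
    return sorted(
        k for k in keys if expected.get(k, _MISSING) != actual.get(k, _MISSING)
    )


def normalize(raw_event):
    """Turn one raw event of either format into a single JSON string, or None."""
    for name, pattern in (("format1", FORMAT1), ("format2", FORMAT2)):
        match = pattern.search(raw_event)
        if not match:
            continue
        try:
            expected = json.loads(match.group(1))
            actual = json.loads(match.group(2))
        except ValueError:  # malformed JSON in the raw event
            return None
        diff = differing_fields(expected, actual)
        return json.dumps(
            {
                "source_format": name,
                "expected": expected,
                "actual": actual,
                "diff_fields": diff,
                "diff_count": len(diff),
            },
            sort_keys=True,
        )
    return None  # unknown format
```

In a scripted input, each non-None result would be passed to print so that both formats reach the index in the same JSON shape. The fields can then be read at search time, for example with spath, and the dashboard can chart diff_count without a custom command.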
