Getting Data In

How to index less data?

sc0tt
Builder

I would like to index less data into Splunk by modifying several XML sources so that I only include certain fields and format them as key-value pairs. I believe I can do this by creating a scripted input. I've looked at the documentation here, but I'm still unsure whether this is what I need and how to implement it.

Also, when using a scripted input, how do you prevent duplicate data from being indexed? Does Splunk have an internal mechanism for this, or do I need to include this logic in my script?

Can somebody help point me in the right direction?
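
On the duplicate question: as far as I know, Splunk indexes whatever a scripted input writes to stdout on each run, so the sketch below assumes the script has to remember what it already emitted. It keeps a small checkpoint file of record hashes between runs. The function names and the checkpoint path are hypothetical, not part of any Splunk API.

```python
import hashlib
import json
import os
import sys


def load_seen(path):
    """Return the set of record hashes saved by earlier runs (empty if none)."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as fh:
        return set(json.load(fh))


def save_seen(path, seen):
    """Persist the hashes so the next scheduled run can skip them."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(sorted(seen), fh)


def emit_new(records, checkpoint_path, out=sys.stdout):
    """Write only records not emitted before; return how many were written."""
    seen = load_seen(checkpoint_path)
    written = 0
    for record in records:
        digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # already emitted in this run or an earlier one
        out.write(record + "\n")
        seen.add(digest)
        written += 1
    save_seen(checkpoint_path, seen)
    return written
```

Hashing whole records keeps the example short. If the source has a natural key such as a timestamp or record ID, storing the last one seen would keep the checkpoint file much smaller.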

1 Solution

MuS
Legend

Hi sc0tt

Scripted inputs are one approach; another would be to use props and transforms to send unwanted data to the null queue. Have a look at the docs about filter and route for more details on this topic.

hope this helps ...

cheers, MuS

sc0tt
Builder

In the end I used the filter and route method that you referenced and used a sed script. This works perfectly. Thanks again.

sc0tt
Builder

After searching around Splunk Answers some more, I came across several posts about indexing XML files and field extraction. I believe this is what I need. I'm going to give those suggestions a shot and see if that works.

sc0tt
Builder

Please correct me if I'm wrong, but field extraction will just create fields at index time based on the raw data; it will not change the amount of data that is being indexed, correct? My goal is to transform the raw XML data into a new, slimmed-down, Splunk-friendly format. For this, I believe that creating a scripted input may be the best solution.
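
To make the scripted input idea concrete, here is a minimal sketch that parses an XML document, keeps only a few chosen child elements per record, and produces one key="value" line per record. The `Record` tag and the field names in `WANTED_FIELDS` are made up for illustration, since the thread does not show the actual XML layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; replace with the elements you want to keep.
WANTED_FIELDS = ("Host", "Status", "Duration")


def record_to_kv(record, wanted=WANTED_FIELDS):
    """Build one 'key="value"' line from the wanted direct children of a record."""
    pairs = []
    for name in wanted:
        child = record.find(name)
        if child is not None and child.text is not None:
            # Swap embedded double quotes so the quoted value stays intact.
            value = child.text.strip().replace('"', "'")
            pairs.append('{}="{}"'.format(name, value))
    return " ".join(pairs)


def xml_to_kv_lines(xml_text, record_tag="Record"):
    """Yield one key-value line per record element (the tag name is assumed)."""
    root = ET.fromstring(xml_text)
    for record in root.iter(record_tag):
        line = record_to_kv(record)
        if line:  # skip records that contain none of the wanted fields
            yield line
```

A scripted input built on this would read the XML source, loop over `xml_to_kv_lines(...)`, and print each line to stdout, so only the slimmed-down lines get indexed.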

MuS
Legend

This should be possible, but that is field extraction, which is covered in the docs here: http://docs.splunk.com/Documentation/Splunk/6.0/Data/Aboutindexedfieldextraction

sc0tt
Builder

Thanks. Would I be able to change the data format to create new fields with the filter and routes option? For example, could I use a regular expression to filter an XML file for something like <Field>Value</Field> and create field = value? This way I get rid of a lot of extra data that I don't need and only keep a simple key-value pair?
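
As a rough illustration of the regular expression being asked about, the Python sketch below turns simple leaf elements like <Field>Value</Field> into Field=Value pairs. It only demonstrates the pattern and is not Splunk configuration; inside Splunk, a similar sed-style substitution would go in props.conf (SEDCMD). The function name is hypothetical, and the pattern assumes elements without attributes or nested children.

```python
import re

# Matches a simple leaf element such as <Field>Value</Field>.
# The backreference \1 requires the closing tag to match the opening one.
LEAF = re.compile(r"<(\w+)>([^<]*)</\1>")


def leaf_elements_to_kv(xml_text):
    """Return 'name=value' pairs for every simple leaf element, space separated."""
    return " ".join(
        "{}={}".format(name, value.strip())
        for name, value in LEAF.findall(xml_text)
    )
```

Wrapper elements that only contain other elements are skipped, because `[^<]*` cannot cross a nested tag. For anything beyond flat XML, a real parser is the safer choice.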
