Getting Data In

How to index less data?

sc0tt
Builder

I would like to index less data into Splunk by modifying several XML sources so that I include only certain fields, formatted as key-value pairs. I believe I can do this by creating a scripted input. I've looked at the documentation, but I'm still unsure whether this is what I need and how to implement it.

Also, when using a scripted input, how do you prevent duplicate data from being indexed? Does Splunk have an internal mechanism for this, or do I need to include this logic in my script?

Can somebody help point me in the right direction?
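
On the duplicate question: as far as I know, whatever a scripted input writes to stdout is indexed as-is, so the script has to remember what it has already emitted. Below is a minimal sketch of that idea, assuming each event is a single text line and that a small JSON checkpoint file is acceptable. The function names and the checkpoint format are hypothetical and not part of any Splunk API.

```python
import hashlib
import json
import os


def load_seen(path):
    """Return the set of record hashes emitted by earlier runs."""
    if not os.path.exists(path):
        return set()
    with open(path, "r", encoding="utf-8") as fh:
        return set(json.load(fh))


def save_seen(path, seen):
    """Persist the hashes so the next run can skip them."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(sorted(seen), fh)


def emit_new(records, checkpoint_path):
    """Return only records not seen before, and update the checkpoint."""
    seen = load_seen(checkpoint_path)
    fresh = []
    for record in records:
        digest = hashlib.sha1(record.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(record)
    save_seen(checkpoint_path, seen)
    return fresh
```

A scripted input built on this would print each line returned by `emit_new` and nothing else. Hashing whole records keeps the checkpoint small, but the file still grows with every new event, so a real script would probably want to expire old entries.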

1 Solution

MuS
SplunkTrust

Hi sc0tt

Scripted inputs are one approach. Another is to use props and transforms to send unwanted data to the null queue. Have a look at the docs on filtering and routing data for more details about this topic.

hope this helps ...

cheers, MuS

sc0tt
Builder

In the end I used the filter and route method that you referenced and used a sed script. This works perfectly. Thanks again.

sc0tt
Builder

After searching around Splunk Answers some more, I came across several posts about indexing XML files and field extraction. I believe this is what I need. I'm going to give those suggestions a shot and see if that works.

sc0tt
Builder

Please correct me if I'm wrong, but field extraction will just create fields at index time based on the raw data; it will not change the amount of data that is being indexed, correct? My goal is to transform the raw XML data into a new, slimmed-down, Splunk-friendly format. For this, I believe that creating a scripted input may be the best solution.
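
For illustration, the kind of transformation described above could look roughly like this in a Python scripted input. It is only a sketch: the element names in `KEEP` are hypothetical placeholders, and it assumes each XML record is small enough to parse in memory.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names; replace with the elements you want to keep.
KEEP = ("Host", "Status", "Duration")


def slim_event(xml_text, keep=KEEP):
    """Turn one XML record into a key=value line with only the `keep` fields."""
    root = ET.fromstring(xml_text)
    pairs = []
    for name in keep:
        # Look for the element anywhere below the record's root element.
        node = root.find(".//" + name)
        if node is not None and node.text is not None:
            # Swap double quotes so the quoted value stays well-formed.
            value = node.text.strip().replace('"', "'")
            pairs.append('{}="{}"'.format(name, value))
    return " ".join(pairs)
```

The script would then print `slim_event(record)` for each record it reads, so only the slimmed line reaches the index. Elements that are not listed in `KEEP` are simply never written out.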

MuS
SplunkTrust

This should be possible, but it is field extraction, which is covered in the docs here: http://docs.splunk.com/Documentation/Splunk/6.0/Data/Aboutindexedfieldextraction

sc0tt
Builder

Thanks. Would I be able to change the data format to create new fields with the filter and route option? For example, could I use a regular expression to match something like <Field>Value</Field> in an XML file and turn it into field = value? That way I would get rid of a lot of extra data that I don't need and keep only simple key-value pairs.
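
To make the regular-expression idea concrete, here is a small Python sketch of the substitution itself. The thread does not show the sed expression that was eventually used, so the pattern below is an assumption. It only handles simple elements with no attributes and no nested tags.

```python
import re

# Matches <Name>value</Name> where the value contains no nested tags.
TAG_PAIR = re.compile(r"<(\w+)>([^<]*)</\1>")


def tags_to_kv(text):
    """Rewrite every simple <Name>value</Name> element as Name=value."""
    return TAG_PAIR.sub(r"\1=\2", text)
```

The same capture-and-backreference pattern carries over to a sed-style `s/.../.../g` expression, although the exact escaping rules differ between regex flavours.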
