
Is it possible to filter parts of XML events before indexing?

kenferguson
New Member

I'm pretty sure this isn't possible from reading around, but want to check with some experts. I've looked around the Splunk site but the only similar question I found was an ancient unanswered one about a non-XML feed.

I've got an incoming stream of data in XML form where each packet is pretty large. I'd obviously like to only index the data that I actually need, otherwise the cost of the Splunk license becomes prohibitive, so can anyone tell me if there's a way of processing the data before it gets indexed and filtering out a subset of the XML?

I had a look at Heavy Forwarders, but I think they're only useful if I was looking to filter out events in their entirety - I want to keep all the events, but throw away part of the data.

Example input:

<xmlblob>
 <comment>Lorem ipsum dolor sit amet, consectetur adipiscing elit
   ...
 </comment>
 <interestingdata>55</interestingdata>
</xmlblob>

which I'd want to convert to

<xmlblob>
 <interestingdata>55</interestingdata>
</xmlblob>

Thanks in advance for any help or thoughts!


jkat54
SplunkTrust

For this I would use SEDCMD-<class> in my props.conf.

For example, in props.conf on the indexers or heavy forwarders (wherever the data is first parsed):

[myXMLDataSourceType]
...
SHOULD_LINEMERGE = True
# [\s\S]*? lets the match span newlines and stop at the first </comment>
SEDCMD-aaa_removesComments = s/\s*<comment>[\s\S]*?<\/comment>//g

I put aaa on the class because the SEDCMDs are applied in ASCII order of their class names. If you're modifying the data in several steps, you probably want them to run in a very specific sequence, and the prefix lets you control that.
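
If it helps, the substitution can be sanity-checked offline before touching props.conf. The following is only a rough sketch in Python, not Splunk itself: it assumes Python's re module treats this pattern the same way as the regex engine behind SEDCMD, and strip_comments is a made-up helper name. It uses a variant of the pattern with [\s\S]*? rather than .* because, by default, . does not match newlines and the comment in the example input spans several lines.

```python
import re

# Variant of the SEDCMD pattern: [\s\S]*? matches across newlines and is
# non-greedy, so each match ends at the nearest closing </comment> tag.
COMMENT_RE = re.compile(r"\s*<comment>[\s\S]*?</comment>")


def strip_comments(event: str) -> str:
    """Return the event text with every <comment>...</comment> element removed."""
    return COMMENT_RE.sub("", event)


# The example input from the question.
sample = (
    "<xmlblob>\n"
    " <comment>Lorem ipsum dolor sit amet, consectetur adipiscing elit\n"
    "   ...\n"
    " </comment>\n"
    " <interestingdata>55</interestingdata>\n"
    "</xmlblob>"
)

print(strip_comments(sample))
```

On the example input this leaves only the xmlblob wrapper and the interestingdata element. It's still worth confirming on a test index that the indexed events come out the way you expect.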


kenferguson
New Member

Thanks - I haven't had a chance to verify because I've been pulled on to another part of the project. Will take a look when I have a moment.

Thanks!
