Dashboards & Visualizations

Need help choosing most performant data input format

halr9000
Motivator

I've got a system based on an XML API that will be spitting out a good amount of data (100k's of events an hour?) in XML format. We'll be using scripted inputs to retrieve the data, so the format can be changed prior to indexing. My question is this: how much work, if any, should be spent on munging the data, given the impact that can have on search and index performance? Is there a benefit to converting it to JSON, for example, or flattening it into tables or KV pairs? Or should I not bother and just do that work at search time?

halr9000
Motivator

I don't yet know how much baggage the XML will come with for a given event type. Obviously if a single event is 50% larger that's got to be a part of the equation. Let's assume for the sake of argument that the sizes are roughly similar.


RicoSuave
Builder

Hal, it depends on how deeply nested the XML is. I don't think there is much difference in terms of parsing XML versus JSON. However, I have seen with other customers that XML events several thousand lines long severely impact search performance. I would write it out to key-value pairs if it were up to me, but if the events are small it shouldn't cause too much trouble. My $.02.
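A minimal sketch of what "write it out to key-value pairs" could look like inside a scripted input, assuming the XML has a simple nested structure. The `<event>` sample below is hypothetical, not the actual API output; the idea is to emit one `key="value"` line per event so Splunk's automatic KV extraction handles it at search time with no extra parsing config:

```python
# Hypothetical example: flatten a nested XML event into key="value" pairs
# before it is indexed. Stdlib only; no Splunk-specific libraries needed.
import xml.etree.ElementTree as ET


def flatten_xml(xml_text):
    """Walk the XML tree and return {dotted.path: leaf_text} for every
    leaf element that contains non-whitespace text."""
    root = ET.fromstring(xml_text)
    pairs = {}

    def walk(elem, path):
        children = list(elem)
        if not children and elem.text and elem.text.strip():
            pairs[path] = elem.text.strip()
        for child in children:
            walk(child, f"{path}.{child.tag}" if path else child.tag)

    walk(root, "")  # root tag itself is dropped from the key paths
    return pairs


def to_kv_line(pairs):
    """Render the pairs as one space-delimited key="value" line, which
    Splunk's default search-time KV extraction picks up automatically."""
    return " ".join(f'{k}="{v}"' for k, v in sorted(pairs.items()))


# Hypothetical sample event for illustration.
sample = """<event>
  <host>web01</host>
  <status>200</status>
  <request><method>GET</method><uri>/index.html</uri></request>
</event>"""

print(to_kv_line(flatten_xml(sample)))
# host="web01" request.method="GET" request.uri="/index.html" status="200"
```

The trade-off is that the flattening cost is paid once at input time instead of on every search, which matters at hundreds of thousands of events an hour; the downside is losing the original structure, so deeply nested or repeated elements may need a smarter key scheme than simple dotted paths.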
