All,
I am trying to tune performance on a set of data. Basically I have narrowed it down to search time extractions being the issue but I really don’t see any resource limits on the search heads that indicate that it’s working all that hard on the data set. That is, low CPU ~2% and yet I am waiting minutes for fields to extract. Any recommendation on how I might get more performance from this search?
Here are my notes -
My search
index=akamai over 1 hour
Fast Mode I get - slow but workable
--9,525,499 events in 39.804 seconds
However, this set of data normally needs the fields. So in smart mode
--9,571,210 events in 243.244 seconds
This means we're talking about 4 minutes per hour of data, which is not acceptable performance for our users. As it's "smart mode", this implies it's a search tier issue. Correct? I went ahead and set up batch mode search parallelization, and here are my new results.
Fast mode
--9,440,668 events in 17.909 seconds
Smart Mode
--9,453,800 events in 134.911 seconds
While this improvement is great, we're still looking at Smart mode taking over 2 minutes per hour of data. We need raw data searches to be more performant; search-time extractions are just taking too long. Is there any way to tell which extraction is taking too long? The data is well-formed, cooked JSON coming from a heavy forwarder, which pulls the data down from Akamai. Perhaps I need to convert some fields to index time? There are common ones like "site" that are usually used.
So I went one step further and upped it to 3 pipelines.
Fast Mode
--9,510,314 events in 15.77 seconds
Smart Mode
--9,498,542 events in 130.875 seconds
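To put the runs side by side, here is a quick sketch that turns the numbers above into events per second and a smart-to-fast slowdown ratio. The event counts and timings are copied from this post; the run labels ("default", "parallelized", "3 pipelines") and the function names are just my own shorthand, not anything Splunk reports.

```python
# Event counts and run times copied from the measurements in this post.
# The labels are informal shorthand for the three configurations tested.
RUNS = {
    "default": {"fast": (9_525_499, 39.804), "smart": (9_571_210, 243.244)},
    "parallelized": {"fast": (9_440_668, 17.909), "smart": (9_453_800, 134.911)},
    "3 pipelines": {"fast": (9_510_314, 15.77), "smart": (9_498_542, 130.875)},
}


def events_per_second(events: int, seconds: float) -> float:
    """Raw search throughput for a single run."""
    return events / seconds


def smart_slowdown(config: str) -> float:
    """How many times longer smart mode took than fast mode for a configuration."""
    fast_seconds = RUNS[config]["fast"][1]
    smart_seconds = RUNS[config]["smart"][1]
    return smart_seconds / fast_seconds


for name, modes in RUNS.items():
    fast_eps = events_per_second(*modes["fast"])
    smart_eps = events_per_second(*modes["smart"])
    print(
        f"{name:>12}: fast {fast_eps:>10,.0f} ev/s | "
        f"smart {smart_eps:>10,.0f} ev/s | "
        f"smart is {smart_slowdown(name):.1f}x slower"
    )
```

Smart mode comes out roughly 6x to 8x slower than fast mode in every configuration, and the gap widens as pipelines are added, which is why I keep pointing at the extractions rather than at raw event retrieval.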
Overall, we're still exceeding 2 minutes of search time per hour of Akamai data with extractions. I'd like to get that down closer to 1 minute per hour of data. That is asking our user to wait 24 minutes for just a day of data. Some reading suggested I might want to try INDEXED_EXTRACTIONS = json, so I applied that to one of my two heavy forwarders that process the Akamai data and let it bake for an hour. Fast Mode time went up to 17 seconds and Smart Mode time went up to 160 seconds.
With the clear overall decrease in performance from index-time extractions, I disabled that immediately. I am using the Splunk Add-on for Akamai from Splunkbase for props.conf:
https://splunkbase.splunk.com/app/3030/
Overall, I am not seeing much CPU usage on the search head: 1.9% to 3% during the search. So I am not sure how to get the field extraction process on the search head to use all the idle resources.