Facing terribly slow search time performance. Same worked fine with Splunk 5.
Job inspector shows the following:
Execution costs
Duration (seconds) Component Invocations Input count Output count
0.078 command.fields 78 911,212 911,212
184.017 command.search 78 - 911,212
0.423 command.search.index 78 - -
0.078 command.search.calcfields 78 911,212 911,212
0.078 command.search.fieldalias 78 911,212 911,212
0 command.search.index.usec_64_512 4 - -
0 command.search.index.usec_8_64 127 - -
182.176 command.search.rawdata 78 - -
0.083 command.search.kv 78 - -
0.078 command.search.lookups 78 911,212 911,212
0.078 command.search.tags 78 911,212 911,212
0.078 command.search.typer 78 911,212 911,212
0.045 command.search.summary 78 - -
0.018 dispatch.check_disk_usage 18 - -
0.023 dispatch.createProviderQueue 1 - -
0.085 dispatch.evaluate 1 - -
0.085 dispatch.evaluate.search 1 - -
184.037 dispatch.fetch 78 - -
0.04 dispatch.preview 40 - -
0.008 dispatch.readEventsInResults 1 - -
184.023 dispatch.stream.local 78 - -
2.303 dispatch.timeline 78 - -
0.269 dispatch.writeStatus 119 - -
0.039 startup.handoff
search log is showing thousands of these lines and search never completed so, finalized the search after x minutes.
11-dd-2013 09:15:07.587 INFO DispatchThread - Generating results preview took 1 ms
11-dd-2013 09:15:10.181 INFO DispatchThread - Generating results preview took 1 ms
11-dd-2013 09:15:12.591 INFO DispatchThread - Generating results preview took 1 ms
11-dd-2013 09:15:15.216 INFO DispatchThread - Generating results preview took 1 ms
From the line
182.176 command.search.rawdata 78
The 183 seconds not totaling up to the individual times spent.
Appreciate if anyone could help / shed light on this.?
Thanks.
Added more:
My indexers are version 6 as well. I've debugged this a bit further. There is not much Disc I/O but, yes the CPU spins crazy. Actually, I let the search job run in the background. This is just 30days data of size 70GB. Each day has couple of files with records more than of size 2GB. The search job completed after 45 minutes. There could something wrong with my indexing.
Here is the completed job inspector shows:
703.171 command.prestats 23,639 295,181,393 122,367
1,725.204 command.search 23,677 - 295,181,829
1,021.454 command.search.rawdata 23,638 - -
524.957 command.search.kv 23,638 - -
23.639 command.search.lookups 23,638 295,181,393 295,181,393
23.638 command.search.calcfields 23,638 295,181,393 295,181,393
23.638 command.search.fieldalias 23,638 295,181,393 295,181,393
10.732 command.search.summary 23,677 - -
0 command.search.index.usec_1_8 26 -
search log has:
11-dd-2013 hh:33:01.347 INFO loader - Detected 24 (virtual) CPUs and 193800MB RAM
11-dd-2013 hh:33:01.347 INFO loader - Maximum number of threads (approximate): 12288
11-dd-2013 hh:33:01.347 INFO loader - Arguments are: "search" "--
id=scheduler_adminlauncher_a3_at_1383438780_5" "--maxbuckets=0" "--ttl=172800" "--maxout=500000" "--maxtime=8640000" "--lookups=1" "--reduce_freq=10" "--user=admin" "--pro" "--roles=admin:can_delete:power:user"
..
..
11-dd-2013 hh:33:01.672 INFO BatchSearch - Adding bucket:mytestindex~58~0867F605-4D1D-4B32-84CD-
BBB2474E0F1B to batch search with min seek addr offset:0 and max seek addr offset:0 and bucket has time range set:0
We upgraded to version 6 last week. We noticed the same thing. Pre upgrade things were running well. Post upgrade CPU is maxed out and performance is terrible. No other changes were made other than the version upgrade, no modified queries, no additional data to index, no new queries. Just the version upgrade and followed by a HUGE spike in CPU.
I tried something different. Basically, I've pre-loaded the indexed files into file-cache, so there is zero latency in IO of the indexed files. But, unfortunately, it did not succeed, as the indexed files are gzip compressed. gzip is good but slow while uncompressing.
Looking for options on how to disable compression during post-index operation.
For now, using summary indexing - sistats etc. We are ok with this approach.
Actually, command.search.rawdata
means that the search is spending that time in CPU uncompressing rawdata. The Job Inspector numbers indicates that this is a dense search. Are your indexers version 6 as well or are they still in 5?
Yes, those job inspector reports differ as they are from 2 different search jobs.
First one did not complete. Second did.
Just wanted to highlight something interesting. My h/w has 192GB memory. And, none of other system processes are consuming this memory. This search is being done on single box with search head and indexer on the same box. I'm just indexing 70GB data for a total of 30 Days. Avg size per day 2.GB
If I use Cassandra for 70GB of data, even for that matter memcache, I can keep entire 70GB in memory. And hence, I would get split second perf.
Pls. tips on. Splunk 6 perf?
It would help if you (1) showed what your search looks like and (2) its entire Job Inspector output. The ones you've pasted above are clearly from two different searches.
Thanks for your hint on where the time is spent. The indexing is also on version 6.
If you haven't already I highly recommend you to download the Splunk on Splunk app (https://apps.splunk.com/app/748/ ), it gives you lots of visibility into the health of your Splunk setup.
Uh well, first of all it would be nice to know the actual search you're running?
The search string is just of the form -> index=myIndex
thats it.
This myIndex has 70GB of data indexed as per normal practices. No errors etc while indexing. You may be right ..as the time spent is not being shown any of the other activities of searching.. it may be just waiting to read or write etc..
I'm trying sistats etc. to see if that can help. Amazed that we see this after upgrade 😞
I might be missing something in the UI that changed significantly in ver 6.
command.search.rawdata
is what is spent getting the raw text of the identified events, so if this takes a long time that could imply I/O performance issues.
Uh well, first of all it would be nice to know the actual search you're running?