Splunk Search

How to achieve optimized search time in Splunk

disha
Contributor

This has become a serious issue, and I need some expert advice.
Scenario:
Splunk 5.0.2
Data Input : TCP
License: Splunk Enterprise 5,120 MB
Over TCP we are getting events every 15 seconds for:
1. App usage (EventID=3)
2. CPU usage (EventID=4)
For app usage, the JSON events look like:

{"BoxID":"222333","EID":3,"TS":"Fri May 10 02:49:36 2013", "MU":969632}

For CPU usage, the JSON events look like:

{"BoxID":"111222","EID":4,"TS":"Fri May 10 02:16:00 2013","CPUusage":4.5}

We have been collecting data for the last 6 months, so by now there is a lot of it (one event every 15 seconds for 6+ months). On the UI we display charts, and for those my searches are as follows (since MU and CPUusage are in bytes, I need to convert them to MB):

sourcetype="myagent" 
  | spath path="EID" output=EventID | search EventID=3
  | spath path="BoxID" output=UID
  |spath path="MU" output=mu |eval mu=(mu/1024)|eval mu=round(mu,2)|fields mu,UID
| timechart  limit=0 first(mu) by UID 

Now on the UI, even if I select just the last 24 hours, it takes forever!
I am not very experienced with Splunk; I am still learning as the requirements come up, and I believe Splunk is very powerful for processing data. So I want to ask:
1. What solutions can I apply to display the charts on the UI more quickly?
2. Is there any way I can cache the result of this search every few minutes, so that when a user selects Last 24 hours I can just fetch the result, with the calculated fields, from the cache and display the chart? That would be very fast.
Please suggest and show me the way. This is critical.

1 Solution

bmacias84
Champion

I must say this is a tough question to answer and a big topic. Keep in mind this is my understanding. I would also recommend reading Exploring Splunk SPL.

First, I would start by adding KV_MODE = json to my props.conf so Splunk automatically knows it's JSON (personal preference). After that I would look at my base search: to maximize search performance you want to be as specific as possible to limit the number of results being returned. Always specify the index, source, and/or sourcetype, and if possible keywords within your data. Low-cardinality fields always result in quicker searches. Filter unnecessary fields as soon as possible. Do stats or eval only after unnecessary events and fields have been discarded. Use bucket or span where possible.
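A minimal props.conf sketch of that KV_MODE change, assuming your sourcetype is named myagent (adjust the stanza name to match yours; KV_MODE is a search-time setting, so it belongs on the search head):

# props.conf
[myagent]
# Treat events as JSON so fields like EID, BoxID, and MU are
# extracted automatically at search time (no spath needed)
KV_MODE = json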


#base search, keyword-filtering the raw JSON
index=someindex sourcetype=myagent "\"EID\":3"
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| rename BoxID as UID
| timechart span=1h limit=0 first(mu) by UID

#base search, using the extracted EID field (requires KV_MODE = json)
index=someindex sourcetype=myagent EID=3
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| rename BoxID as UID
| timechart span=1h limit=0 first(mu) by UID

Also consider using summary indexing and report acceleration. I highly recommend doing this.
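As a rough sketch of the summary-indexing pattern (the index name summary and the saved-search name si_app_usage_hourly are illustrative, not from this thread): schedule a saved search that pre-aggregates with an si- command and has summary indexing enabled, then point the dashboard at the summary index instead of the raw data.

# Saved search "si_app_usage_hourly", scheduled (e.g. hourly) with
# summary indexing enabled, so its results land in index=summary
index=someindex sourcetype=myagent EID=3
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| sitimechart span=1h first(mu) by BoxID

# Dashboard search that reads the much smaller summary index
index=summary source="si_app_usage_hourly"
| timechart span=1h first(mu) by BoxID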

About summary indexing

  • Make searches as specific as possible.
  • Limit the time range if possible.
  • Filter out unneeded fields.
  • Do as much filtering as possible before using eval or doing calculations.
  • Use advanced charting, not the timeline view.
  • Turn off field discovery.
  • Use a summary index for large data sets that span days or months.
  • Refrain from doing sparse or rare searches.
  • Use the Search Job Inspector to find where your search is taking the longest.

Hope this helps or gets you started. Don’t forget to vote and accept answers that help.

Cheers,


bmacias84
Champion

If you want to do the calculations automagically, consider using EVAL- in your props.conf. With regard to caching, report acceleration and summary indexing are going to be the best answers; you could also try building a lookup table. @disha, sorry if that doesn't help.
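A minimal sketch of that EVAL- approach (the calculated field name mu_mb is just illustrative; the stanza assumes the myagent sourcetype from the question):

# props.conf
[myagent]
# mu_mb is computed at search time for every event, so dashboards
# can chart it directly without an explicit eval in the SPL
EVAL-mu_mb = round(MU/1024, 2)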

disha
Contributor

Thank you. This will help; I am working on it. Can you please tell me how I can calculate mu in MB in advance and store it somewhere? I am asking specifically about point #2 of my question.
