Splunk Search

How to achieve optimized search time in Splunk

disha
Contributor

This has become a serious issue, and I need some expert advice.
Scenario:
Splunk 5.0.2
Data Input : TCP
License: Splunk Enterprise 5,120 MB
Over TCP we are getting events every 15 seconds for:
1. App usage (EventID=3)
2. CPU usage (EventID=4)
For app usage, the JSON events look like:

{"BoxID":"222333","EID":3,"TS":"Fri May 10 02:49:36 2013", "MU":969632}

For CPU usage, the JSON events look like:

{"BoxID":"111222","EID":4,"TS":"Fri May 10 02:16:00 2013","CPUusage":4.5}

We have been collecting data for the last 6 months, so by now there is a lot of it (one event every 15 seconds for 6+ months). On the UI we display charts, and for those my searches are as follows (since MU and CPUusage are in bytes, I need to convert them to MB):

sourcetype="myagent" 
  | spath path="EID" output=EventID | search EventID=3
  | spath path="BoxID" output=UID
  |spath path="MU" output=mu |eval mu=(mu/1024)|eval mu=round(mu,2)|fields mu,UID
| timechart  limit=0 first(mu) by UID 

Now on the UI, even if I select just the last 24 hours, it takes forever!
I am not very experienced with Splunk; I am still learning as the requirements come up, and I believe Splunk is very powerful for processing data. So I want to ask:
1. What solutions can I apply to display the charts on the UI more quickly?
2. Is there any way I can cache the result of this search every few minutes, so that when a user selects Last 24 hours I can just fetch the result, with the calculated fields, from the cache and display the chart? That would be very fast.
Please suggest and show me the way. This is critical.

1 Solution

bmacias84
Champion

I must say this is a tough question to answer and a big topic. Keep in mind this is my understanding. I would also recommend reading Exploring Splunk SPL.

First, I would start by adding KV_MODE = json to my props.conf so Splunk automatically knows it's JSON (personal preference). After that I would look at my base search: to maximize search performance you want to be as specific as possible to limit the number of results being returned. Always specify the index, source, and/or sourcetype, and if possible keywords within your data. Low-cardinality fields always result in quicker searches. Filter unnecessary fields as soon as possible. Do stats or eval only after unnecessary events and fields have been discarded. Use bucket or span where possible.
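A minimal props.conf sketch of that KV_MODE change, assuming your sourcetype is named myagent (adjust the stanza name to match yours; KV_MODE is a search-time setting, so it belongs on the search head):

# props.conf
[myagent]
# Treat events as JSON so fields like EID, BoxID, and MU are
# extracted automatically at search time (no spath needed)
KV_MODE = json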


#base search, keyword-filtering the raw JSON
index=someindex sourcetype=myagent "\"EID\":3"
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| rename BoxID as UID
| timechart span=1h limit=0 first(mu) by UID

#base search, using the extracted EID field (requires KV_MODE = json)
index=someindex sourcetype=myagent EID=3
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| rename BoxID as UID
| timechart span=1h limit=0 first(mu) by UID

Also consider using summary indexing and report acceleration. I highly recommend doing this.
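As a rough sketch of the summary-indexing pattern (the index name summary and the saved-search name si_app_usage_hourly are illustrative, not from this thread): schedule a saved search that pre-aggregates with an si- command and has summary indexing enabled, then point the dashboard at the summary index instead of the raw data.

# Saved search "si_app_usage_hourly", scheduled (e.g. hourly) with
# summary indexing enabled, so its results land in index=summary
index=someindex sourcetype=myagent EID=3
| fields MU, BoxID, _time
| eval mu=round(MU/1024, 2)
| sitimechart span=1h first(mu) by BoxID

# Dashboard search that reads the much smaller summary index
index=summary source="si_app_usage_hourly"
| timechart span=1h first(mu) by BoxID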

About summary indexing

  • Make searches as specific as possible.
  • Limit the time range if possible.
  • Filter out unneeded fields.
  • Do as much filtering as possible before using eval or doing calculations.
  • Use advanced charting, not the timeline view.
  • Turn off field discovery.
  • Use a summary index for large data sets that span days or months.
  • Refrain from doing sparse or rare searches.
  • Use the Search Job Inspector to find where your search is taking the longest.

Hope this helps or gets you started. Don’t forget to vote and accept answers that help.

Cheers,


bmacias84
Champion

If you want to do the calculations automagically, consider using EVAL- in your props.conf. With regard to caching, report acceleration and summary indexing are going to be the best answers; you could also try building a lookup table. @disha, sorry if that doesn't help.
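A minimal sketch of that EVAL- approach (the calculated field name mu_mb is just illustrative; the stanza assumes the myagent sourcetype from the question):

# props.conf
[myagent]
# mu_mb is computed at search time for every event, so dashboards
# can chart it directly without an explicit eval in the SPL
EVAL-mu_mb = round(MU/1024, 2)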

disha
Contributor

Thank you. This will help; I am working on it. Can you please tell me how I can calculate mu in MB in advance and store it somewhere? I am asking specifically about point #2 of my question.
