Splunk Search

Is there a way to speed up my search through millions of Windows security logs?

packet_hunter
Contributor

I am doing some long-tail analysis in Fast Mode, but the query over 24 hours is taking a long time.

Please let me know if there is a way to speed this up. I'm not that familiar with tstats, but if that is an option, please let me know.

index=wineventlog sourcetype="WinEventLog" | stats count by EventCode TaskCategory | where count<10 | sort count

Thanks

1 Solution

puneethgowda
Communicator

We did the following things after some R&D:

1. Changed the date range from real-time to today.
2. Set the dashboard refresh interval to every 5 minutes.
3. Summary indexing (sketched in the conf snippet below).
4. Report acceleration.
5. Scheduled this search every 5 minutes so the results are cached.
6. Search query optimization.
7. Auto-restart Splunk daily at 2:00 AM UTC so that memory is released.
8. Set high priority on this dashboard.
9. Set high priority on this scheduled search.
10. Render the stats tables first, then start the charts.
11. Changed the delimiter handling of raw data from the old text-file method to the new one, which reduces the time spent splitting raw data into fields.
12. Reduced the number of indexes and sourcetypes.

After all this, my dashboards' loading time dropped from 3 minutes to less than 10 seconds.

Super fast
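For steps 3, 5, and 9, here is a minimal savedsearches.conf sketch of a scheduled, summary-indexing saved search. The stanza name, cron schedule, and summary index name below are placeholders, not details from this post; step 4 (report acceleration) is normally an alternative to summary indexing, enabled with auto_summarize = 1 on a transforming report instead.

[wineventlog long tail - summary]
enableSched = 1
cron_schedule = */5 * * * *
schedule_priority = higher
search = index=wineventlog sourcetype="WinEventLog" | sistats count by EventCode TaskCategory
action.summary_index = 1
action.summary_index._name = summary_wineventlog

The dashboard panel then reads from summary_wineventlog instead of scanning the raw events each time it loads.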


puneethgowda
Communicator

You need to sit for long hours to implement all of these steps, but it is worth doing.

packet_hunter
Contributor

Thank you for these steps. I will look into them.

DalJeanis
Legend

Hmmm. Interesting. If the two fields you care about are extracted at index time, then use tstats. Other than that, it's a matter of learning how to finesse your data.

I don't know your data, but there have to be certain combinations of EventCode and TaskCategory that make up the bulk of your data. If you throw out those common transactions, then the rare transactions should stand out better, and the rest of the calculations should be much faster.

What you could do is run your search over, say, a fifteen-minute period, select all the common combinations that turn up more than ten times, and write those to a lookup table.

The second search would format that input lookup table as

(EventCode=X1 AND NOT (TaskCategory=Y11 OR TaskCategory=Y12 OR TaskCategory=Y13)) OR
(EventCode=X2 AND NOT (TaskCategory=Y21 OR TaskCategory=Y22 OR TaskCategory=Y23)) OR

and so on. EventCodes that are rare in themselves would probably get a different search (off the top of my head).
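A rough sketch of that two-search approach, assuming a lookup named common_combos.csv (the name is made up here); the fifteen-minute window and the count thresholds come from the description above.

First search, run over a short window to capture the common combinations:

index=wineventlog sourcetype="WinEventLog" earliest=-15m
| stats count by EventCode TaskCategory
| where count>10
| fields EventCode TaskCategory
| outputlookup common_combos.csv

Second search, which excludes those pairs before doing the long-tail stats (the subsearch expands into an EventCode/TaskCategory filter much like the one written out above):

index=wineventlog sourcetype="WinEventLog" NOT [| inputlookup common_combos.csv | fields EventCode TaskCategory]
| stats count by EventCode TaskCategory
| where count<10
| sort count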

skoelpin
SplunkTrust

Another possibility is that @packet_hunter is putting all his data into the same index, so he has to sort through 100M+ events to pick out the security logs. We need more info on his setup before making a good recommendation.

packet_hunter
Contributor

Thank you both for your comments.
Yes, in a 24-hour period I have about 200 million security events.
Dropping the highest-count combinations and looking for the specific event codes of interest is an option I have considered, but there are times we need to see the totals of all codes within that day. For example, if there are 10,000 EventCode 4624 or 4625 events from a specific security_id in an hour, then there is something worth looking at...

FYI, the index is dedicated but contains application events and system events as well as Security events. The majority of events are Security events. I probably would have set this up differently, but I have to deal with what is in place.

@DalJeanis would you mind sharing a tstats example I might use for this scenario? I will concurrently try my hand at the syntax.

I can see counts of events with

| tstats count where index=wineventlog by sourcetype

but I'm having trouble grabbing the EventCode field values...

Thank you
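For what it is worth, a sketch of two tstats variants. tstats can only split by indexed fields, and EventCode is normally a search-time extraction for WinEventLog data, so the first form only works if EventCode and TaskCategory happen to be index-time extracted in your environment; the second assumes a hypothetical accelerated data model named Windows_Security.

If the fields are indexed:

| tstats count where index=wineventlog sourcetype="WinEventLog" by EventCode TaskCategory
| where count<10
| sort count

Through an accelerated data model:

| tstats summariesonly=true count from datamodel=Windows_Security by Windows_Security.EventCode Windows_Security.TaskCategory
| rename "Windows_Security.*" as *
| where count<10
| sort count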

skoelpin
SplunkTrust

You could either add more indexers or use a summary index to spread the search cost across time (see the sketch below).

How many events do you have in a 24-hour period? What does your Splunk setup look like? Are your indexers on physical blades or VMs?

Lots of people virtualize indexers, which can get ugly. Splunk is all about IOPS.
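To illustrate the summary-index route: a scheduled search does the heavy stats over each small window, and the 24-hour report then only aggregates the pre-computed rows. The index name summary_wineventlog and the hourly schedule are assumptions (the conf wiring for this kind of search is sketched earlier in the thread).

Scheduled hourly over the previous hour:

index=wineventlog sourcetype="WinEventLog" earliest=-1h@h latest=@h
| sistats count by EventCode TaskCategory

Report over the last 24 hours, reading only the summary:

index=summary_wineventlog
| stats count by EventCode TaskCategory
| where count<10
| sort count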

packet_hunter
Contributor

Thank you for the suggestion. Yes, we are adding more indexers and looking into summary indexes too.
