So here's my workflow.
My problem is that I'm searching dense data over 180 days, and it takes a few hours to run a search, AND the values in the .csv change weekly.
Currently I'm attempting to use a data model, though data models don't allow | where filters.
The search I have is pretty easy:
index=network sourcetype=fgt_traffic
| where dest_ip==match_address
| stats count by match_address
All I'm looking for is the number of hits and the address that is matching, but I need to be able to get results much faster than a few hours.
I mentioned I'm trying to use a data model, but I was wondering if tstats/tscollect would be a better way to summarize this information, or maybe a KV Store somehow?
I'm thinking I need a summary index of just the dest_ip values per event within my index, so that I can quickly do a comparison against a lookup file that will change weekly. Any thoughts would be helpful.
Thank you!
Did you try the TERM() keyword? The magic behind it is described, for example, in this session: Indexed Tokens and You.
I was able to greatly improve searches for single IP addresses with TERM(). It works even better when the logs you are searching do not have quotes (") around the value of dest_ip and have field=value pairs in the _raw data, because you can then do something like TERM(dest_ip=192.168.1.1).
You can even use the inputlookup that @masonmorales suggested, once you know how to format the output of a subsearch:
index=network sourcetype=fgt_traffic [|inputlookup match_address
| eval search="TERM(".match_address.") OR "
| stats values(search) as search
| nomv search
| eval search="(".rtrim(search, " OR ").")"]
Edit: Insert " OR " between terms.
Maybe something like this?
index=network sourcetype=fgt_traffic [|inputlookup match_address | rename match_address as dest_ip]
| stats count by dest_ip
Using an inputlookup as a subsearch creates a boolean AND between the base search and the contents of the subsearch. The key is to match the field names.
This will give me the results, I agree, but it's not just the results I'm looking for; it's the 3-hour timespan it takes to get results from 180 days that I'm attempting to reduce.
Is match_address just one IP?
match_address will be a series of IP addresses in the lookup file.
The lookup file looks like this:
match_address
10.10.10.1
10.43.56.22
157.46.8.2
etc..
Since data model searches don't allow filters and your lookup changes weekly, I would set up a summary index to calculate the count more frequently (hourly/daily) and then run your report against the summary index.
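A minimal sketch of that setup, assuming a summary index named summary_fgt (a hypothetical name, substitute your own). First, a scheduled search that runs once a day over the previous day and writes per-address counts into the summary index:

index=network sourcetype=fgt_traffic earliest=-1d@d latest=@d
| stats count by dest_ip
| collect index=summary_fgt

Then the report only has to sum the pre-aggregated daily counts over 180 days, which is far less data than the raw events:

index=summary_fgt [|inputlookup match_address | rename match_address as dest_ip]
| stats sum(count) as count by dest_ip

Because the lookup is applied only at report time, it can keep changing weekly without invalidating the summary data you've already collected.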