I am trying to create a search to return the source name for applications that have not been restarted in the last 30 days. A restart may be seen in the .out logs of an application by the phrase "Starting WebLogic Server with Oracle". How do I return sources that do not contain this phrase in the past 30 days?
This is the simple search I am using to return the applications that do have restarts associated with them, although I can't figure out how to make it return sources with a zero count.
source=*out "Starting WebLogic Server with Oracle" | stats count by source
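In essence, this gets the most recent event from each source with that verbiage, calculates that event's age, and keeps only those sources where the age is greater than 30 days. (dedup is going to be cheaper than stats, since you don't care how many times the restart has occurred.) Note that the search time range has to reach back further than 30 days, and a source with no restart event anywhere in that range will not show up at all.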
source=*out "Starting WebLogic Server with Oracle"
| dedup source
| eval age=floor((now()-_time)/86400)
| where age>30
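As a rough sketch of the same idea with the time range made explicit, the variant below also shows when the last restart happened. The 90-day window and the last_restart field name are assumptions chosen for illustration, not part of the original search.

```spl
earliest=-90d source=*out "Starting WebLogic Server with Oracle"
| dedup source
| eval age=floor((now()-_time)/86400)
| where age>30
| eval last_restart=strftime(_time, "%Y-%m-%d %H:%M:%S")
| table source last_restart age
```

dedup keeps the first event it sees per source, which is the most recent one in the default search order, so last_restart should be the latest restart inside the window.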
On the other hand, this is probably a faster choice. Use tstats to identify all the potential sources, then combine that list with the most recent restart record from each source in the last 30 days. Every source found by tstats gets a count of 0, and every source with a restart gets a count of 1. If a source has a 0 record and no 1 record, then you have a non-restarted source.
earliest=-30d source=*out "Starting WebLogic Server with Oracle"
| dedup source
| table source
| eval count=1
| append
[| tstats count WHERE index=* OR index=_* by source
| table source
| where like(source,"%out")
| eval count=0
]
| stats max(count) as count by source
| where (count == 0)
Updated "like" to use %
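For comparison, here is a minimal single-search sketch of the same 0/1 flag idea without append or tstats. It assumes every source of interest logged at least one event of any kind in the last 30 days, and it will likely be slower because it reads every event from those sources. The restarted field name is made up for the example.

```spl
earliest=-30d source=*out
| eval restarted=if(searchmatch("Starting WebLogic Server with Oracle"), 1, 0)
| stats max(restarted) as restarted by source
| where restarted == 0
```

The append version above avoids scanning raw events for the full source list, which is why it is probably the better choice on large indexes.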
Splunk won't find something that doesn't exist. If an application has not logged a restart then it won't show up in a search.
You need a list of all expected applications in a lookup. Then subtract the applications in your search from the expected list and you have those that have not restarted.
See https://answers.splunk.com/answers/406103/how-to-create-a-search-to-find-expected-hosts-that.html
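A minimal sketch of the lookup approach, assuming a hypothetical lookup file named expected_sources.csv that has one row per expected application and a field called source:

```spl
| inputlookup expected_sources.csv
| fields source
| eval count=0
| append
    [ search earliest=-30d source=*out "Starting WebLogic Server with Oracle"
    | dedup source
    | table source
    | eval count=1 ]
| stats max(count) as count by source
| where count == 0
```

Every expected source starts with a count of 0, and any source with a restart in the last 30 days contributes a 1, so only the sources that never restarted survive the final where.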
I like this solution; it is way faster than what I came up with while waiting. I have never used tstats before and will need to do some looking into it. I also changed the * to a % in the where like(...) command to get that section to work. Thanks, very helpful! My final search, with the index specified, is below.
earliest=-30d index=wls_prd source=*out "Starting WebLogic Server with Oracle"
| dedup source
| table source
| eval count=1
| append
[| tstats count WHERE index=wls_prd by source
| table source
| where like(source,"%out")
| eval count=0
]
| stats max(count) as count by source
| where (count == 0)
Nice, thanks for posting your solution. Fixed the code. Yeah, I always have to check my where clauses for capital letters on OR and AND, double == on equals, and % versus *. I know too darn many languages...
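A small throwaway sketch of those three habits in one where clause. The source paths are invented for the example, and makeresults just generates a single test row to run it against.

```spl
| makeresults
| eval source="/logs/app1.out"
| where like(source, "%out") AND (source == "/logs/app1.out" OR source == "/logs/app2.out")
```

Inside like(), % matches any run of characters and _ matches a single character, whereas the * wildcard belongs to the search command (as in source=*out).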