I'm going crazy trying to figure this out. Splunk is not my primary job function, so I am no good at time manipulation.
I've been messing with eventstats and streamstats, but can't get it. I think streamstats is what I want.
Here's my problem.
I have thousands of units out in the field, each reporting multiple fields of data every hour (same fields per unit, just updated data every hour for health monitoring). For each unit reporting, I need to find days where a certain field was the same at the beginning of the day and at the end. Then I need to do this for every day over the past few weeks and sum the number of times each unit had this occurrence happen. I can't even get the search to work for one day, much less summarize them.
I instantly return results for every unit, when I know that in my entire fleet of units I should only see a few.
The fields of interest:
serial number (SN)
number of connections (conn)
index=myindex class=polling | bin _time span=1d | eventstats min(conn) as minCONN max(conn) as maxCONN by _time SN | eval deltaCONN=maxCONN-minCONN |where deltaCONN=0|table _time mi deltaCONN
Can anyone help me with this? I am very stuck I think the hardest part for me is getting the time divisions right. Ideally I would only want to look at the field data from 9AM - 5PM each day - then I want to do this every day for every unit.
Try this:
index=myindex class=polling | date_hourmin=strptime(_time, "%H%M") | search date_hourmin>=900 AND date_hourmin<1700 | bin _time span=1d | stats min(conn) AS minCONN max(conn) AS maxCONN BY _time SN | eval deltaCONN=maxCONN-minCONN |where deltaCONN=0|table _time deltaCONN SN
And then add on:
| stats count AS daysNoChange BY SN
The "date_hourmin" section errored for me as an invalid command, so I removed the pipe before and made it part of the search. This returns no results for me which is hard to believe. It seems I either get no results or all results haha. I appreciate your help. Perhaps I need to figure out my time division problem.
our data is in epoch/UTC time, and I am in CST. One of my concerns is that when binning data by day, I am grabbing part of 2 different days, rather than a perfect midnight to midnight or 3AM-3AM. This would 100% ruin my chance of finding the few units I am actually after, as the symptom I am seeking is connections 1 day, no connections the next, then connections again on day 3. If I catch part of day 1 and 2, I will see a delta in connections, and even though none of them were in day 2, it would look like there were.
Can you provide some of that awesome data you speak of?
🙂
Trust me I would love to, but I can't share 😞
its so easy to anonymize data...
I just wrote a script to generate random arcsight events and it took about 15 minutes...
You dont even need a script... just put random values in that make sense.