Hey all, we are having a bit of trouble with the streamstats command, as the title indicates. The following search returns an uninterrupted running count (1 through 20) in the field "test" when the reset_before parameter checks for a value that can never exist ("bar"):
| from datamodel:"cdnlogic" | streamstats global=f window=2 current=t latest(_time) AS latest_time earliest(_time) AS earliest_time BY logic_uid | eval logic_time_difference = 'latest_time' - 'earliest_time' | eval logic_time_difference = if('logic_time_difference' >= 1200, "foo", 'logic_time_difference') | streamstats global=f window=0 current=t reset_before=("logic_time_difference == \"bar\"") count AS test BY logic_uid | table logic_time_difference,test
returning:
diff test
0 1
961 2
5 3
418 4
407 5
20 6
23 7
1 8
foo 9
1 10
1 11
8 12
1 13
3 14
3 15
1 16
3 17
1 18
6 19
1 20
However, when reset_before is changed to look for "foo" instead of "bar", it resets the count incorrectly twice in the dataset I am testing with:
| from datamodel:"cdnlogic" | streamstats global=f window=2 current=t latest(_time) AS latest_time earliest(_time) AS earliest_time BY logic_uid | eval logic_time_difference = 'latest_time' - 'earliest_time' | eval logic_time_difference = if('logic_time_difference' >= 1200, "foo", 'logic_time_difference') | streamstats global=f window=0 current=t reset_before=("logic_time_difference == \"foo\"") count AS test BY logic_uid | table logic_time_difference,test
returns:
diff test
0 1
961 1 <-- ?
5 2
418 3
407 1 <-- ?
20 2
23 3
1 4
foo 1 <-- good
1 2
1 3
8 4
1 5
3 6
3 7
1 8
3 9
1 10
6 11
1 12
Any guidance on this issue, as well as consolidating the statement, is welcome.
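One possible reading of the two stray resets, offered as an assumption and not confirmed anywhere in this thread: when the reset_before expression matches, streamstats may clear the accumulated statistics for every BY group, not only for the group of the matching event. A "foo" in a different logic_uid would then restart the count for the logic_uid shown in the table. The Python sketch below models both reset scopes on made-up events. The function name, the short field names, and the sample data are all hypothetical, and nothing here calls Splunk.

```python
from collections import defaultdict


def running_count(events, reset_all_groups):
    """Model `streamstats reset_before=(diff == "foo") count BY uid`.

    reset_all_groups=True  -> a match clears every group's counter
                              (the ASSUMED Splunk behavior).
    reset_all_groups=False -> a match clears only the matching event's group
                              (what the question expects).
    """
    counts = defaultdict(int)
    out = []
    for uid, diff in events:
        if diff == "foo":
            if reset_all_groups:
                counts.clear()   # every logic_uid starts over
            else:
                counts[uid] = 0  # only this logic_uid starts over
        counts[uid] += 1         # current=t: the event itself is counted
        out.append((uid, diff, counts[uid]))
    return out


# Hypothetical interleaved events for two logic_uid values, "a" and "b".
events = [
    ("a", 0), ("a", 961), ("b", "foo"), ("a", 5),
    ("b", 7), ("a", "foo"), ("a", 1), ("b", 2),
]

shared = running_count(events, reset_all_groups=True)
scoped = running_count(events, reset_all_groups=False)

# Keep only logic_uid "a", the way a table filtered to one uid would look.
shared_a = [count for uid, _, count in shared if uid == "a"]
scoped_a = [count for uid, _, count in scoped if uid == "a"]
print(shared_a)
print(scoped_a)
```

In the shared-scope run, the count for "a" drops back to 1 on a row whose own value is not "foo", which is the same pattern as the rows marked "<-- ?" above. If that assumption holds, the results would need to be checked for "foo" rows belonging to other logic_uid values just before each unexpected reset.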
It seems the count is being reset abruptly because of the BY logic_uid clause, so try the second streamstats without the BY clause:
| from datamodel:"cdnlogic" | streamstats global=f window=2 current=t latest(_time) AS latest_time earliest(_time) AS earliest_time BY logic_uid | eval logic_time_difference = 'latest_time' - 'earliest_time' | eval logic_time_difference = if('logic_time_difference' >= 1200, "foo", 'logic_time_difference') | streamstats global=f window=0 current=t reset_before=("logic_time_difference == \"foo\"") count AS test | table logic_time_difference,test
Unfortunately, that causes it to count the events of ALL logic_uid values together:
diff test
0 3
961 13
5 14
418 22
407 4
20 5
23 6
1 7
foo 1
1 2
1 3
8 4
1 5
3 6
3 7
1 8
3 10
1 11
6 12
1 13
Ideally, a single streamstats statement would make the most sense. Sadly, we haven't been able to figure out a way.
Have a look at this answer: https://answers.splunk.com/answers/516142/can-streamstats-reset-before-or-reset-after-be-use.html
Okay, that answer seems specific to counts. The actual logic we are attempting to apply to the dataset is not as simple as a count. To make the question easier to illustrate, I replaced the more complex (and, I assumed, irrelevant) logic with a count so the table output would make sense. In actuality, we are taking sum('logic_time_difference') AS logic_time_sum up to the reset point, and taking latest('logic_nid') AS logic_nid, replacing the existing 'logic_nid' field of the events inside the series.
We did take a look at that before posting this. I'll personally take a closer look at it now/tomorrow and mark as answer if that's the key. Thanks!
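The sum-and-latest logic described above can be sketched without relying on reset_before at all, by giving each logic_uid its own segment number that advances at every "foo" and then aggregating per (logic_uid, segment). This is a minimal Python model under stated assumptions, not the poster's search: the function name and sample events are hypothetical, "latest" is approximated as the last value seen in stream order (not by _time), and a "foo" row is assumed to contribute nothing to the sum.

```python
from collections import defaultdict


def segment_stats(events):
    """Per logic_uid, start a new segment at each "foo" and aggregate within it.

    events: iterable of (logic_uid, logic_time_difference, logic_nid)
            in stream order.
    Returns {(logic_uid, segment): {"logic_time_sum": ..., "logic_nid": ...}}.
    """
    segment_of = defaultdict(int)  # current segment number per logic_uid
    stats = {}
    for uid, diff, nid in events:
        if diff == "foo":
            segment_of[uid] += 1   # reset point: only this uid advances
        key = (uid, segment_of[uid])
        entry = stats.setdefault(key, {"logic_time_sum": 0, "logic_nid": None})
        if diff != "foo":
            entry["logic_time_sum"] += diff  # assumed: "foo" adds nothing
        entry["logic_nid"] = nid   # last value seen in this segment wins
    return stats


# Hypothetical events: (logic_uid, logic_time_difference, logic_nid).
events = [
    ("a", 0, "n1"), ("a", 961, "n2"), ("b", "foo", "n9"),
    ("a", 5, "n3"), ("a", "foo", "n4"), ("a", 1, "n5"),
]

result = segment_stats(events)
for key in sorted(result):
    print(key, result[key])
```

Because the segment number is tracked per logic_uid, the "foo" in "b" does not disturb the running sum for "a". In SPL the same idea might be expressed by deriving a segment field from a per-logic_uid running count of "foo" events and adding that field to the BY clause of the aggregating streamstats; that variant is untested here.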