Splunk Search

How to limit streamstats sum to a range of values?

nmaiorana
Explorer

We are running a CUSUM function where we do not want the value to run away either too high or too low (negative). Ideally we would like the bottom end of this one sided CUSUM to be 0 and the top side to be the value of the decision limit. I have tried using the technique where you accum a variable, but I can only get that to work on the low side.

Tags (3)
0 Karma

somesoni2
SplunkTrust
SplunkTrust

Do you have any sample reference output you desire?

0 Karma

nmaiorana
Explorer

Here is what I'm getting:

READING_DATETIME SENSOR FREQUENCY AVGFREQ K H ZBAR ACCUMCUSUM CUSUM INCONTROL
8/15/2015 0:00 STAT 95 95 0.243851 2.43851 93.539915 93.539915 2.43851 FALSE
8/15/2015 0:05 STAT 87 91 0.243851 2.43851 89.539915 183.07983 2.43851 FALSE
8/15/2015 0:10 STAT 94 92 0.243851 2.43851 90.539915 273.619745 2.43851 FALSE
8/15/2015 0:15 STAT 89 91.25 0.243851 2.43851 89.789915 363.40966 2.43851 FALSE
8/15/2015 0:20 STAT 94 91.8 0.243851 2.43851 90.339915 453.749575 2.43851 FALSE
8/15/2015 0:25 STAT 92 91.833333 0.243851 2.43851 90.373248 544.122823 2.43851 FALSE
8/15/2015 0:30 STAT 89 91.428571 0.243851 2.43851 89.968486 634.091309 2.43851 FALSE
8/15/2015 0:35 STAT 84 90.5 0.243851 2.43851 89.039915 723.131224 2.43851 FALSE
8/15/2015 0:40 STAT 84 89.777778 0.243851 2.43851 88.317693 811.448917 2.43851 FALSE

Here is what I want to see:
READING_DATETIME SENSOR FREQUENCY AVGFREQ K H ZBAR ACCUMCUSUM CUSUM INCONTROL
8/15/2015 0:00 STAT 95 95 0.243851 2.43851 93.539915 2.43851 2.43851 FALSE
8/15/2015 0:05 STAT 87 91 0.243851 2.43851 89.539915 2.43851 2.43851 FALSE
8/15/2015 0:10 STAT 94 92 0.243851 2.43851 90.539915 2.43851 2.43851 FALSE
8/15/2015 0:15 STAT 89 91.25 0.243851 2.43851 89.789915 2.43851 2.43851 FALSE
8/15/2015 0:20 STAT 94 91.8 0.243851 2.43851 90.339915 2.43851 2.43851 FALSE
8/15/2015 0:25 STAT 92 91.833333 0.243851 2.43851 90.373248 2.43851 2.43851 FALSE
8/15/2015 0:30 STAT 89 91.428571 0.243851 2.43851 89.968486 2.43851 2.43851 FALSE
8/15/2015 0:35 STAT 84 90.5 0.243851 2.43851 89.039915 2.43851 2.43851 FALSE
8/15/2015 0:40 STAT 84 89.777778 0.243851 2.43851 88.317693 2.43851 2.43851 FALSE

Similar if the ZBAR was a negative value, limit the accumulation of CUSUM to 0.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

Maybe you could use eval to make a second field that is the filter of your first. I'm also not sure exactly what you mean by decision limit.

| eval my_new_field = if(field > 0 OR field < decision_limit, field, "")

So the if function returns an empty string if its not within the range we specify. Then you can just use streamstats on the new field.

0 Karma

nmaiorana
Explorer

I tried that, but the variable used in the streamstats is not one you can alter.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

Use comments (not answers) to reply.

What do you mean? The field that you're normally using with streamstats can definitely be altered

0 Karma

nmaiorana
Explorer

Here is what I did:
| eval ZBAR = blah blah blah | streamstats sum(ZBAR) as CUSUM |
eval CUSUM = if(CUSUM < 0, 0, if(CUSUM > DECLIMIT, DECLIMIT, CUSUM))

CUSUM never got reset, in most cases it get getting more negative or positive.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

Can you try putting the eval CUSUM BEFORE and then putting it in this order:

   | eval ZBAR = blah blah
   | eval CUSUM =  if(CUSUM < 0, 0, if(CUSUM > DECLIMIT, DECLIMIT, CUSUM))
   |  streamstats sum(CUSUM) as CUSUM

?

0 Karma

nmaiorana
Explorer

Thanks. I tried:

search blah | eval CUSUM = max(0.000000, CUSUM)| eval CUSUM = min(H, CUSUM) | eval ZBAR = AVGFREQ - (CONTROLMEAN + K ) | streamstats sum(ZBAR) as CUSUM | table blah blah blah

I also tried:
search blah | eval CUSUM = if(CUSUM < 0, 0, if(CUSUM > H, H, CUSUM)) | eval ZBAR = AVGFREQ - (CONTROLMEAN + K ) | streamstats sum(ZBAR) as CUSUM | table blah

It appears as though there are 2 variables with the name CUSUM. One is scoped to the streamstats level and the other is scoped to the event level. Just my guess.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

No its because you're overwriting them in that example. They don't have scope backwards - only forwards.

So if you do

search blah
| eval CUSUM
| eval ZBAR
| streamstats sum(ZBAR) as CUSUM_SUM

then you won't write over the earlier CUSUM. In the examples you gave, you first eval CUSUM but lose it completely when you make sum(ZBAR) as CUSUM. You can verify this yourself by doing table after each part of the command to see what is in the field (we don't call them variables) CUSUM

0 Karma

nmaiorana
Explorer

This is what was suggested to me in the original answer. The problem is this way, CUSUM_SUM will go way outside the range of 0 to H. I'm starting to feel that there is no solution to this problem.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

I'm not sure what you're trying to do:

For example,

search blah | eval CUSUM = if(CUSUM < 0, 0, if(CUSUM > H, H, CUSUM)) | eval ZBAR = AVGFREQ - (CONTROLMEAN + K ) | streamstats sum(ZBAR) as CUSUM | table blah

Is totally different from what you were doing earlier - like you don't even use the eval CUSUM.

Did you try something like this? It should be pretty simple...

base search
| eval DECLIMIT = 100
| eval ZBAR = AVGFREQ - (CONTROLMEAN + K ) 
| eval ZBAR_FILTER = case(ZBAR < 0, 0,  ZBAR > DECLIMIT, DECLIMIT, 1==1, ZBAR)
| streamstats sum(ZBAR_FILTER) as CUSUM

The point I was trying to make is that you need to add the filter log to the field that you are putting into the sum() function BEFORE you actually sum up the field values.

0 Karma

nmaiorana
Explorer

The value CUSUM is used later (omitted for brevity) to determine if a function is out of control. Anything CUSUM > than H is out of control. When the value continues to increase and gets very large, even after the function gets back into control, the variation from the mean is too small to register. For this reason I need CUSUM = min(H, max(CUSUM, 0). However, since it's getting accumulated using the streamstats, there is no way to update that value to stay within the range.

Perhaps I just need to use a different method than streamstats to accumulate the ZBAR values.

0 Karma

aljohnson_splun
Splunk Employee
Splunk Employee

Did you know that streamstats has a window option ?

window
Syntax: window=<int>
Description: The window option specifies the number of events to use when computing the statistics.
Default: 0, which means that all previous (plus current) events are used.\

that way you could make the window smaller ... | streamstats sum(foo) as sum window=10 so that smaller deviations can be noticed.

the stats functions also have standard deviation (stdev) as well as variance (var) and a few others you might be able to use ?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...