Splunk Dev

Some of the values are not visible when I use the table command.

meia
Engager
index=** sourcetype=**** location=00000 
| bin _time span=1d 
| rex "\[Id=(?<IDValue>[^\,]*?),[\s ].*?,[\s ]percentage=(?<percentageValue>[^\,]*?),[\s ].*?,[\s ]location=(?<locationValue>[^\,]*?)," max_match=0
| fields * 
| stats avg(percentageValue) AS avgpred, stdevp(percentageValue) AS lstdev , var(percentageValue) AS varpf by locationValue,IDValue, _time
| table _time locationValue,IDValue,percentageValue

But I am not able to see percentageValue and locationValue in the table.


DalJeanis
Legend

The reason those fields do not appear is that you renamed them in the stats command, and after a stats command only the aggregated and by fields exist.

Replace the last line with the second of these two lines...

| stats avg(percentageValue) AS avgpred, 
    stdevp(percentageValue) AS lstdev , 
    var(percentageValue) AS varpf 
    by locationValue, IDValue, _time

| table _time, locationValue, IDValue, avgpred, lstdev, varpf

This is an important fact to remember about the stats command... all individual events are consumed by the stats command, and only the listed fields exist afterwards.

Stats has two "cousins" that operate differently.

Eventstats will aggregate the data in exactly the same way as stats, but then add the appropriate result fields to every individual event. This is useful if you want to know how many stdevs above or below average an event might be.

Streamstats calculates aggregate values over the data it has already seen, in order, and adds the aggregate fields to each event as of that point in the data stream.


Suppose these are your three result events...

 record    price   
    1       3.50
    2       6.75    
    3       9.75

Here are the results of the three commands:


| stats count as count1 sum(price) as total1

count1  total1
  3      20.00

| eventstats count as count2 sum(price) as total2

 record    price   count2   total2  
    1       3.50      3      20.00
    2       6.75      3      20.00    
    3       9.75      3      20.00

| streamstats count as count3 sum(price) as total3

 record    price   count3   total3  
    1       3.50      1       3.50
    2       6.75      2      10.25    
    3       9.75      3      20.00
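Applied to your original search, eventstats would keep the raw percentageValue on every event while adding the aggregates, so all of those fields can be tabled together. A sketch, reusing the field names and rex from your search:

```
index=** sourcetype=**** location=00000
| bin _time span=1d
| rex "\[Id=(?<IDValue>[^\,]*?),[\s ].*?,[\s ]percentage=(?<percentageValue>[^\,]*?),[\s ].*?,[\s ]location=(?<locationValue>[^\,]*?)," max_match=0
| eventstats avg(percentageValue) AS avgpred, stdevp(percentageValue) AS lstdev, var(percentageValue) AS varpf by locationValue, IDValue, _time
| table _time, locationValue, IDValue, percentageValue, avgpred, lstdev, varpf
```

Because eventstats does not consume the events, you could also follow it with an eval (for example, percentageValue minus avgpred divided by lstdev) to see how far each event sits from its group average.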


adonio
Ultra Champion

Why are you trying to table the percentageValue field? It is a value for each individual event, but you have already performed statistical functions on it and binned the events into single-day increments.

meia
Engager

For example, I want to show the original percentageValue because, based on that value, we can see the variance and standard deviation for that pf against the whole set of pf values.


adonio
Ultra Champion

Not exactly sure how it will work, as you are bucketing many events into one day.
Assuming you want to see all the values that are a component of your avg, stdevp, and var in each day bucket, you can try something like below:

    | gentimes start="08/01/2018:00:00:00" end="08/9/2018:10:00:00" increment=10s
    | eval _time = starttime
    | eval percentageValue = random()%100
    | eval ID = random()%5
    | eval IDValue = "id" . "-" . ID
    | eval location = random()%10
    | eval locationValue = "location" . "-" . location
    | fields _time IDValue locationValue percentageValue
    | rename COMMENT as "the above generates data, below is your solution"
    | bin _time span=1d
    | stats values(percentageValue) as percentageValue, avg(percentageValue) AS avgpred, stdevp(percentageValue) AS lstdev, var(percentageValue) AS varpf by locationValue, IDValue, _time
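Note that values() returns the distinct percentageValue entries as a multivalue field on a single row per group. If you want one row per original value with the aggregates repeated beside it, a sketch of one way to expand the result, appended after the stats above:

```
| mvexpand percentageValue
| table _time, locationValue, IDValue, percentageValue, avgpred, lstdev, varpf
```

mvexpand creates one row per value of the multivalue field, copying the other fields (including the aggregates) onto every expanded row.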

meia
Engager
       _time        id   name   score   avg   stdev
  2017-10-18        36   hel     51     56     32
                    37   Lel     62
                    38   my      27

I don't know why I am getting avg and stdev only for the first row,
but I am unable to see results for the other rows.


adonio
Ultra Champion

what is your query?


meia
Engager
index=*** sourcetype=*******
| bin _time span=1d
| rex "\[Id=(?<IdValue>[^\,]*?),[\s ].*?,[\s ]score=(?<scoreValue>[^\,]*?),[\s ].*?,[\s ]location=(?<locationValue>[^\,]*?)," max_match=0
| streamstats avg(scoreValue) AS avgpred, stdevp(scoreValue) AS lstdev, var(scoreValue) AS varpf by locationValue, IdValue, _time
| eval lowBound=avgpred-lstdev
| eval difference = if(scoreValue<lowBound,1,0)
| streamstats min(lowBound) as lowerBound min(difference) as diff by locationValue, IdValue, _time
| table _time locationValue IdValue lowerBound diff avgpred lstdev

adonio
Ultra Champion

@meia,
please elaborate on the results that you wish to see.
Did you try the search I provided?
It feels like you are shifting in another direction now.


meia
Engager

I was trying to print the id, location, and score, and for all the ids with scores I am planning to show the variance, avg, and lstdev.
But it is calculating them only for the first row.


adonio
Ultra Champion

So now scoreValue is what used to be percentageValue earlier?
please show an example of your desired output of the search, something like this:
field1, field2, field3, .... fieldN
value1, value2, value3, .... valueN
more1, more2, more3, .... moreN
...


meia
Engager
     _time       IdValue   locationValue   scoreValue   lstdev   avgpred   lowerbound   diff
  2017-10-18        1           001            4.5       some    values
  2017-10-18        2           001            1.5       some    -------
  2017-10-18        3           001            3.2       some    ------
  2017-10-18        4           001            6.1       some    ------
  2017-10-18        5           001            7.4       some    -------
  2017-10-18        6           001            8.3       some    -------
  ...

meia
Engager

The above format is what I am trying to produce.


adonio
Ultra Champion

Did you try the query I posted? Is scoreValue now what used to be percentageValue earlier?
When you are using streamstats here, there is no need for the by clause.
From the results you present, it doesn't seem like you want to consolidate/bin by time.
Sorry, I can't quite understand the requirement.
To understand better how streamstats works, try the following query:

| gentimes start="08/01/2018:00:00:00" end="08/9/2018:10:00:00" increment=1h
| eval _time = starttime 
| eval percentageValue = random()%100
| eval ID = random()%5
| eval IDValue = "id". "-" .ID
| eval location = random()%10
| eval locationValue = "location". "-" .location
| fields _time IDValue locationValue percentageValue
| rename COMMENT as "the above generates data"
| sort -_time
| streamstats time_window=4 avg(percentageValue) AS avgpred, stdevp(percentageValue) AS lstdev , var(percentageValue) AS varpf

You can play with the integer in time_window=<int> to see how it affects the scores,
then you can post-process to your needs.

hope it sets you in the right direction


meia
Engager

OK sure, let me try this too.


meia
Engager

     _time       IdValue   locationValue   scoreValue   lstdev   avgpred   lowerbound   diff
  2017-10-18        1           001            4.5       some    values
  2017-10-18        2           001            1.5       some    -------
  2017-10-18        3           001            3.2       some    ------
  2017-10-18        4           001            6.1       some    ------
  2017-10-18        5           001            7.4       some    -------
  2017-10-18        6           001            8.3       some    -------
  ...
