I have a log that tracks fruit names (OK, not really, but let's go with that) over the course of many log entries comprising a session. All valid sessions contain exactly 1 banana and 1 orange, and may also contain pineapple, kiwi, apple and kumquats. Sessions always end with orange, but they can start with any other fruit, and all sessions last less than 15 minutes. Edit: Sadly, sessions lack a session ID, so I'm using Transaction to infer when sessions start and end.
Occasionally there are invalid sessions (Rotten Fruit) which are indicated by more than 1 banana in the same transaction as an Orange.
How can I write a transaction to kick out transactions like this:
v.fruit_session=1 fruit=kumquat
v.fruit_session=1 fruit=banana
v.fruit_session=1 fruit=pineapple
v.fruit_session=1 fruit=banana
v.fruit_session=1 fruit=orange
but keep transactions like this?
v.fruit_session=2 fruit=kumquat
v.fruit_session=2 fruit=banana
v.fruit_session=2 fruit=pineapple
v.fruit_session=2 fruit=orange
v.fruit_session=3 fruit=banana
v.fruit_session=3 fruit=kumquat
v.fruit_session=3 fruit=pineapple
v.fruit_session=3 fruit=orange
Here's what I'm using today:
index=prod-fruit | transaction v.fruit_session endswith=v.fruit=orange maxspan=15m unifyends=true | table v.fruit_session_id, duration, _time, eventcount
I think the endswith makes it so I always end on Orange, the maxspan=15m keeps my transactions from growing too large, and unifyends makes it so I don't have orphans. But I think I could still get double bananas.
OK, because you do not have a sessionID, let's expand the answer and make one:
| makeresults
| eval fruit="kumquat,banana,pineapple,banana,orange::kumquat,banana,pineapple,orange::banana,kumquat,pineapple,orange"
| makemv delim="::" fruit
| mvexpand fruit
| streamstats count AS _count
| eval _time = _time - (1000 * _count)
| makemv delim="," fruit
| mvexpand fruit
| streamstats count AS _count2 BY _count
| eval _time = _time + (60 * _count2)
| sort 0 - _time
| rename COMMENT AS "Everything above generates sample events; everything below is your solution"
| streamstats count(eval(fruit="orange")) AS fruit_session
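To see why a running count of oranges works as a session ID, here is a small Python sketch of the same idea. It is an illustration of the logic only, not SPL; the list and variable names are made up for the example, and it assumes the events are already sorted newest first, so each session's orange is the first event seen for that session.

```python
from itertools import accumulate

# Hypothetical sample events, newest first (the ordering left by `sort 0 - _time`).
# Each session ends with orange, so in this ordering orange opens each group.
fruits_newest_first = [
    "orange", "banana", "pineapple", "banana", "kumquat",  # two bananas: rotten
    "orange", "pineapple", "banana", "kumquat",
    "orange", "pineapple", "kumquat", "banana",
]

# Running count of oranges seen so far, including the current event.
# Every event gets the number of the most recent orange, which acts as a session ID.
session_ids = list(accumulate(1 if f == "orange" else 0 for f in fruits_newest_first))

for sid, fruit in zip(session_ids, fruits_newest_first):
    print(sid, fruit)
```

Because the count only changes when an orange appears, repeated fruit and arbitrary orderings inside a session do not matter; nothing has to be enumerated up front.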
Now you have a session ID and have options; you can either finish like this:
| stats list(fruit) AS fruit BY fruit_session
| eval banana_count = mvcount(mvfilter(like(fruit, "banana")))
| where banana_count!=1
Or maybe you prefer like this (to keep all the raw events):
| eventstats count(eval(fruit="banana")) AS banana_count BY fruit_session
| where banana_count!=1
For the Eval statement - won't I have to list every possible fruit combination (ending in orange) to make this work? That seems prohibitive.
What I'd prefer to do is something like this (please excuse my horrible pseudo basic)
For all log entries containing FRUIT
IF Fruit=Orange then
Move to next fruit
Create new Fruit_session
Until Fruit=Orange (Add Fruit to Session)
END IF
Now I can find all sessions where the Banana count >1 and ignore those.
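For reference, the pseudocode above can be written as a short runnable Python sketch. This only illustrates the grouping logic outside Splunk; the function names and sample list are hypothetical, the fruits are assumed to arrive oldest first, and dropping trailing fruit that has no closing orange is an assumption of this sketch.

```python
def split_sessions(fruits):
    """Group fruits (oldest first) into sessions, closing a session at each orange."""
    sessions, current = [], []
    for fruit in fruits:
        current.append(fruit)
        if fruit == "orange":
            sessions.append(current)
            current = []
    # Assumption: trailing fruit with no closing orange is dropped.
    return sessions


def valid_sessions(fruits):
    """Keep only sessions that contain exactly one banana."""
    return [s for s in split_sessions(fruits) if s.count("banana") == 1]


# Hypothetical log: a double-banana session followed by a session with repeated fruit.
log = [
    "kumquat", "banana", "pineapple", "banana", "orange",
    "kumquat", "pineapple", "kumquat", "banana", "orange",
]
print(valid_sessions(log))
```

Repeated fruit such as kumquat,pineapple,kumquat is handled without listing any combinations, because only the position of orange and the banana count are inspected.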
That is exactly what mine does. It groups fruit together by breaking the groups at orange, then counts how many bananas are in each group. As written, the final where banana_count!=1 lists the rotten groups; flip it to banana_count=1 to keep only those groups with exactly 1 banana. It works, just try it.
First of all, get rid of transaction; try this:
| makeresults
| eval fruit="kumquat,banana,pineapple,banana,orange::kumquat,banana,pineapple,orange::banana,kumquat,pineapple,orange"
| makemv delim="::" fruit
| mvexpand fruit
| streamstats count AS v.fruit_session
| eval _time = _time - (1000 * 'v.fruit_session')
| makemv delim="," fruit
| mvexpand fruit
| streamstats count AS _count BY v.fruit_session
| eval _time = _time + (60 * _count)
| sort 0 - _time
| rename COMMENT AS "Everything above generates sample events; everything below is your solution"
| stats list(fruit) AS fruit BY v.fruit_session
| eval banana_count = mvcount(mvfilter(like(fruit, "banana")))
| where banana_count!=1
Interesting! It looks like you're suggesting creating a field fruit_session based on all possible combinations of fruit orderings, am I reading that right? This seems prohibitive since Fruit can be repeated in my situation (like "Kumquat,Pineapple,Kumquat,Pineapple,Kumquat,Pineapple,Banana,Orange"), but still an interesting direction.
No, I am assuming that v.fruit_session already exists. My creation of it was merely to fake what I am assuming you already have for free in your events.
Do you really have v.fruit_session? If so, then why in the world are you using transaction?
Logically there is a fruit session, but it's not encoded in the logs. So, I have to infer the session based on the log content.