Splunk Search

Can I check that a transaction does not contain more than 1 arbitrary field?

dreeck
Path Finder

I have a log that tracks fruit names (OK, not really, but let's go with that) over the course of many log entries comprising a session. All valid sessions contain exactly 1 banana and 1 orange, and may also contain pineapple, kiwi, apple and kumquats. Sessions always end with orange, but they can start with any other fruit, and all sessions last less than 15 minutes. Edit: Sadly, sessions lack a session ID, so I'm using transaction to infer when sessions start and end.

Occasionally there are invalid sessions (Rotten Fruit) which are indicated by more than 1 banana in the same transaction as an Orange.

How can I write a transaction to kick out transactions like this:

v.fruit_session=1 fruit=kumquat
v.fruit_session=1 fruit=banana
v.fruit_session=1 fruit=pineapple
v.fruit_session=1 fruit=banana
v.fruit_session=1 fruit=orange

but keep transactions like this?

v.fruit_session=2 fruit=kumquat
v.fruit_session=2 fruit=banana
v.fruit_session=2 fruit=pineapple
v.fruit_session=2 fruit=orange

v.fruit_session=3 fruit=banana
v.fruit_session=3 fruit=kumquat
v.fruit_session=3 fruit=pineapple
v.fruit_session=3 fruit=orange

Here's what I'm using today:
index=prod-fruit | transaction v.fruit_session endswith=v.fruit=orange maxspan=15m unifyends=true | table v.fruit_session_id, duration, _time, eventcount

I think endswith ensures that each transaction always ends on orange, maxspan=15m keeps my transactions from growing too large, and unifyends keeps me from getting orphans. But I think I could still get double bananas.


woodcock
Esteemed Legend

OK, because you do not have a sessionID, let's expand the answer and make one:

| makeresults
| eval fruit="kumquat,banana,pineapple,banana,orange::kumquat,banana,pineapple,orange::banana,kumquat,pineapple,orange"
| makemv delim="::" fruit
| mvexpand fruit
| streamstats count AS _count
| eval _time = _time - (1000 * _count)
| makemv delim="," fruit
| mvexpand fruit
| streamstats count AS _count2 BY _count
| eval _time = _time + (60 * _count2)
| sort 0 - _time

| rename COMMENT AS "Everything above generates sample events; everything below is your solution"

| streamstats count(eval(fruit="orange")) AS fruit_session

Now you have a session ID and have options; you can either finish like this:

| stats list(fruit) AS fruit BY fruit_session
| eval banana_count = mvcount(mvfilter(like(fruit, "banana")))
| where banana_count=1

Or maybe you prefer like this (to keep all the raw events):

| eventstats count(eval(fruit="banana")) AS banana_count BY fruit_session
| where banana_count=1
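
For readers who want to see the session-ID step outside of SPL, here is a minimal Python sketch of what the running orange count is doing. It is only an illustration under assumptions: the sample list and the function names (`assign_sessions`, `keep_single_banana_sessions`) are made up for this example, and the events are assumed to arrive newest first, as they do after the sort in the search.

```python
from collections import Counter

# Hypothetical sample events, newest first (same order as the sorted search results).
events_newest_first = [
    "orange", "pineapple", "kumquat", "banana",            # latest session
    "orange", "pineapple", "banana", "kumquat",            # middle session
    "orange", "banana", "pineapple", "banana", "kumquat",  # earliest session, two bananas
]

def assign_sessions(fruits):
    """Tag each fruit with a running count of the oranges seen so far.

    Because the events arrive newest first, each orange opens a new group
    and the fruits that follow it (earlier in time) share its session number.
    Fruits seen before the first orange get session 0 (a session still in progress).
    """
    session = 0
    tagged = []
    for fruit in fruits:
        if fruit == "orange":
            session += 1
        tagged.append((session, fruit))
    return tagged

def keep_single_banana_sessions(tagged):
    """Keep every raw event whose session contains exactly one banana."""
    bananas = Counter(s for s, f in tagged if f == "banana")
    return [(s, f) for s, f in tagged if bananas[s] == 1]

tagged = assign_sessions(events_newest_first)
valid = keep_single_banana_sessions(tagged)
print(sorted({s for s, _ in valid}))  # session numbers that survive the filter
```

Nothing in the sketch lists fruit combinations; the session number comes purely from counting oranges as they go by.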

dreeck
Path Finder

For the eval statement, won't I have to explicitly list every possible combination of fruit (ending in orange) to make this work? That seems prohibitive.

What I'd prefer to do is something like this (please excuse my horrible pseudo-BASIC):
FOR EACH log entry containing a fruit
    Add fruit to the current Fruit_session
    IF fruit = orange THEN
        Close the current Fruit_session
        Create a new Fruit_session starting at the next fruit
    END IF
NEXT

Now I can find all sessions where the banana count is greater than 1 and ignore those.


woodcock
Esteemed Legend

That is exactly what mine does. It groups fruit together by breaking the groups at orange, then counts how many bananas are in the group and keeps only those groups with exactly 1 banana. It works, just try it.
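
To make that concrete, here is a rough Python sketch of the same grouping, walked in chronological order. It is an illustration only: the log contents and the helper names (`split_sessions`, `valid_sessions`) are hypothetical, and it assumes a single stream of events with no overlapping sessions.

```python
def split_sessions(fruits):
    """Split a chronological list of fruits into sessions, each ending at an orange."""
    sessions, current = [], []
    for fruit in fruits:
        current.append(fruit)
        if fruit == "orange":
            sessions.append(current)
            current = []
    # Trailing fruits with no closing orange are an unfinished session and are dropped.
    return sessions

def valid_sessions(sessions):
    """Keep only the sessions that contain exactly one banana."""
    return [s for s in sessions if s.count("banana") == 1]

# Hypothetical log in chronological order; the second session repeats fruits.
log = [
    "kumquat", "banana", "pineapple", "banana", "orange",
    "kumquat", "pineapple", "kumquat", "pineapple", "banana", "orange",
    "banana", "kumquat", "pineapple", "orange",
]

sessions = split_sessions(log)
valid = valid_sessions(sessions)
print(len(sessions), len(valid))
```

The groups are discovered from the data as it streams past, so repeated fruits or new orderings need no special handling; only the orange boundary and the banana count matter.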


woodcock
Esteemed Legend

First of all, get rid of transaction; try this:

| makeresults
| eval fruit="kumquat,banana,pineapple,banana,orange::kumquat,banana,pineapple,orange::banana,kumquat,pineapple,orange"
| makemv delim="::" fruit
| mvexpand fruit
| streamstats count AS v.fruit_session
| eval _time = _time - (1000 * 'v.fruit_session')
| makemv delim="," fruit
| mvexpand fruit
| streamstats count AS _count BY v.fruit_session
| eval _time = _time + (60 * _count)
| sort 0 - _time

| rename COMMENT AS "Everything above generates sample events; everything below is your solution"

| stats list(fruit) AS fruit BY v.fruit_session
| eval banana_count = mvcount(mvfilter(like(fruit, "banana")))
| where banana_count=1

dreeck
Path Finder

Interesting! It looks like you're suggesting creating a field fruit_session based on all possible combinations of fruit orderings, am I reading that right? This seems prohibitive since Fruit can be repeated in my situation (like "Kumquat,Pineapple,Kumquat,Pineapple,Kumquat,Pineapple,Banana,Orange"), but still an interesting direction.


woodcock
Esteemed Legend

No, I am assuming that v.fruit_session already exists. My creation of it was merely to fake what I am assuming you already have for free in your events.


woodcock
Esteemed Legend

Do you really have v.fruit_session? If so, then why in the world are you using transaction?


dreeck
Path Finder

Logically there is a fruit session, but it's not encoded in the logs. So, I have to infer the session based on the log content.
