Splunk Search

Comparing/Merging/Grouping timestamp with timespan

Bastelhoff
Path Finder

Hey there!

I have logs from two different sources in one search. One source provides a time range, while the other provides a time stamp. I am wondering how these could best be matched (with a minimum of processing power).

So one source leads to:

startTime | endTime | userId | task
12:00     | 12:47   | 34     | Processing
12:10     | 13:11   | 22     | Initiating
12:50     | 12:55   | 34     | Cleaning
13:12     | 13:22   | 22     | Processing

The other leads to

timestamp | userId | actionStatus
12:05     | 34     | Error
12:20     | 22     | Finished
12:45     | 22     | Error
13:00     | 22     | Error

Both sources are searched together (fields that only exist in the other index are empty for a given event).

So the full table is:

|table startTime, endTime, userId, task, timestamp, userId,actionStatus

Each line of the second batch (the one with actionStatus) needs to be enriched with the task that was running for that userId during that timeframe.

So the final result should be:

timestamp | userId | actionStatus | task
12:05     | 34     | Error       | Processing
12:20     | 22     | Finished    | Initiating
12:45     | 22     | Error       | Initiating
13:00     | 22     | Error       | Initiating

Is there any good way to enrich the data like this? The major complication is that the userId and timestamp of one batch need to be compared against the userId and time range of the other. The only (rather ugly) approach that comes to mind is breaking the time spans and timestamps into hourly or minutely buckets and then grouping on them, but that seems quite messy.
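For reference, the bucketing idea described above could be sketched roughly like this (field names assumed from the tables above, with startTime/endTime as epoch seconds; mvexpand multiplies the event count by the span length, which is exactly why it gets messy and expensive):

index=your_first_set
| eval minute=mvrange(floor(startTime/60), floor(endTime/60)+1)
| mvexpand minute
| stats values(task) as task by minute, userId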


arjunpkishore5
Motivator

Try this

index=your_second_set
| join type=left usetime=true earlier=true userId [ search index=your_first_set | eval _time=startTime ]
| where timestamp >= startTime AND timestamp <= endTime
| stats values(task) as task by timestamp, userId, actionStatus


Bastelhoff
Path Finder

Thanks. Unfortunately the amount of data is too big for that approach: a subsearch caps out at 50k results, and this search will likely return a few hundred thousand if not millions of rows.
While it would be possible to reduce that by nesting a more compact version of the main search inside the join subsearch to limit the total volume, the resulting search would exceed the subsearch time limit.

This is why my idea is to run just two main searches, as in
(index=A fields=values) OR (index=B fields=values)
and then try to merge the results together.
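One common pattern for merging a combined search like this without join (a sketch, not from the thread; index names, sourcetypes, and epoch-seconds time fields are assumed) is to sort all events by user and time, then use streamstats to carry the most recent task forward onto each status event:

(index=A sourcetype=tasks) OR (index=B sourcetype=status)
| eval _time=coalesce(startTime, timestamp)
| sort 0 userId _time
| streamstats current=t last(task) as task, last(endTime) as endTime by userId
| where isnotnull(actionStatus) AND _time <= endTime
| table timestamp, userId, actionStatus, task

Because stats functions ignore null fields, last(task) per userId holds the task from the most recent task event at the point each status event streams by; the final where keeps only status events that fall inside the carried-forward window. This stays streaming (no subsearch, no row cap), at the cost of one sort over the combined result set.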
