Splunk Search

How to reassemble indexed multiline events that were not split properly using either join or another search-time option?

drodman29
Path Finder

I have multiline events that were split by the default 256-line limit (MAX_EVENTS). I have read up on how to fix the issue going forward, but is there a way to use the data I have already indexed? I have one event with linecount=257 and some number of additional "events" (artificially created by the default line limit) with the same timestamp. I would like to join them in a transaction, or some other search-time union, so that they are useful as the one "logical" event that was intended. However, I can't seem to find any field that would allow me to join them. Any ideas?

1 Solution

drodman29
Path Finder

I'm getting good results with the following:
"SSO authentication API authenticate response" OR "}, {" | transaction host,_time keeporphans=false maxevents=3 startswith="SSO authentication API authenticate response"

The assumption is that the events are sequential and that the split events get the same timestamp as the logical parent. I empirically determined that I wasn't getting more than 3 split events for my application. The additional implied assumption is that the same host is not logging another JSON-like event (see the OR clause) at exactly the same timestamp as the parent/logical event.
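
As a rough illustration of the grouping logic those assumptions imply, here is a minimal Python sketch. It is not how Splunk implements `transaction`; the function name `reassemble`, the tuple layout, and the constants are hypothetical and only loosely mirror `host,_time`, `startswith`, `maxevents=3`, and `keeporphans=false` from the search above.

```python
from itertools import groupby

# Hypothetical constants mirroring the search above.
START_MARKER = "SSO authentication API authenticate response"  # startswith=...
MAX_PIECES = 3  # maxevents=3


def reassemble(events):
    """Merge split events that share a host and timestamp.

    `events` is a list of (host, timestamp, raw) tuples in the order
    they were logged. Returns a list of (host, timestamp, merged_raw).
    """
    merged = []
    # sorted() is stable, so pieces with the same key keep their logged order.
    keyed = sorted(events, key=lambda e: (e[0], e[1]))
    for (host, ts), group in groupby(keyed, key=lambda e: (e[0], e[1])):
        pieces = [raw for _, _, raw in group]
        # Drop groups that do not begin with the parent event,
        # roughly what keeporphans=false does for unmatched events.
        if START_MARKER not in pieces[0]:
            continue
        merged.append((host, ts, "\n".join(pieces[:MAX_PIECES])))
    return merged
```

If two logical events from the same host ever shared a timestamp, this sketch (like the search) would merge them incorrectly, which is why the last assumption above matters.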

lguinn2
Legend

If there is no commonality in the fields, you can't "join" them by anything other than time proximity. (And you wouldn't need

This is unlikely to work accurately. But if you post some example data, we might think up a way to give it a try.

drodman29
Path Finder

Here is some cleaned-up example data. It is a log4j entry with a dumped JSON object embedded in it.
2015-06-09 12:17:58,169 INFO (mymodule.java:64) - SSO authentication API authenticate response:
{
"status" : "AUTHENTICATED",
"inactive" : false,
"login" : "here",
"domain" : "there",
"principal" : "someguid",
"otherPrincipal" : "someotherguid",
"method" : "mymethod",
"hashId" : "longstring",
"clientHost" : "1.1.1.1",
"subtoken" :{
"subsubtoken" :{
... 400 more lines of json blah... listing a variable number of group memberships
}
}
}

I get at least one event with a line count of 257, then one or more additional events depending on the number of lines in the JSON object, which is variable. The break seems to fall at a pseudo-random point within the data structure.
