Splunk Search

Configure separate log entries to be associated by a common field, such as a session_id

smwilli1
Explorer

I have logs that come in the following format:

Sep 1 2014 12:00:00 UTC [13defc34] Client connected on IP 193.18.20.15

Sep 1 2014 12:00:30 UTC [ac21bf43] Client connected on IP 162.74.10.24

Sep 1 2014 12:01:15 UTC [13defc34] username 'johnsmith'

Sep 1 2014 12:01:30 UTC [13defc34] Authentication Failed: invalid username/password

Sep 1 2014 12:01:40 UTC [ac21bf43] username 'billsmith'

Sep 1 2014 12:01:55 UTC [ac21bf43] Authentication Succeeded: Assigned internal IP: 10.0.0.100

Sep 1 2014 12:02:35 UTC [13defc34] Session terminated unsuccessfully: bytes_in=1231 bytes_out=2134

...
...

Sep 2 2014 20:30:55 UTC [ac21bf43] Session terminated successfully: bytes_in=12213211 bytes_out=21323334

As you can see, I don't have the necessary details for the session in every line. In the case of user johnsmith, session '13defc34', the first entry shows his external_ip, the second shows his username, the third is a failed authentication, and the last identifies the end of the session with byte statistics. This causes problems for me, since I have to use transaction or eventstats every time I want to look for a successful authentication, an ended session, authentication failures, etc.
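To make the field layout concrete, here is a minimal Python sketch of how the sample lines break down into fields. The regular expressions and the field names (session_id, external_ip, user, internal_ip, bytes_in, bytes_out) are assumptions based only on the sample lines above, not an existing Splunk configuration.

```python
import re

# Assumed layout: "<timestamp> UTC [<8 hex chars>] <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2} UTC) "
    r"\[(?P<session_id>[0-9a-f]{8})\] (?P<message>.*)$"
)

# Hypothetical per-field patterns, one for each detail seen in the samples.
FIELD_RES = {
    "external_ip": re.compile(r"Client connected on IP (\d{1,3}(?:\.\d{1,3}){3})"),
    "user": re.compile(r"username '([^']+)'"),
    "internal_ip": re.compile(r"Assigned internal IP: (\d{1,3}(?:\.\d{1,3}){3})"),
    "bytes_in": re.compile(r"bytes_in=(\d+)"),
    "bytes_out": re.compile(r"bytes_out=(\d+)"),
}


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Each line carries only some of the details, so add only what is present.
    for name, pattern in FIELD_RES.items():
        found = pattern.search(event["message"])
        if found:
            event[name] = found.group(1)
    return event
```

Every parsed line has a session_id, but any other field appears only on the one line that happens to mention it.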

I have preferred eventstats thus far, but I am starting to notice a significant flaw with it. For instance, let's say I run a 24-hour search for bytes data, using eventstats to associate the username and external IP address from earlier events. If I run this soon after billsmith's session is terminated, I won't be able to catch this result. Why? Because his authentication happened the previous day, outside the 24-hour search window, so the search cannot associate his username or external IP.
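The window problem can be reproduced outside Splunk. This is a rough Python sketch, not SPL: the events are hand-written stand-ins for billsmith's session, and enrich_within_window is a hypothetical helper that imitates copying session details across events, using only the events inside the search window.

```python
from datetime import datetime, timedelta

# Hypothetical, already-parsed events for session ac21bf43.
EVENTS = [
    {"time": datetime(2014, 9, 1, 12, 0, 30), "session_id": "ac21bf43",
     "external_ip": "162.74.10.24"},
    {"time": datetime(2014, 9, 1, 12, 1, 40), "session_id": "ac21bf43",
     "user": "billsmith"},
    {"time": datetime(2014, 9, 2, 20, 30, 55), "session_id": "ac21bf43",
     "bytes_in": 12213211, "bytes_out": 21323334},
]


def enrich_within_window(events, earliest, latest):
    """Copy user/external_ip onto each event, using only events in the window."""
    in_window = [e for e in events if earliest <= e["time"] <= latest]
    details = {}
    for event in in_window:
        known = details.setdefault(event["session_id"], {})
        for field in ("user", "external_ip"):
            if field in event:
                known[field] = event[field]
    return [{**details[e["session_id"]], **e} for e in in_window]


search_end = datetime(2014, 9, 2, 21, 0, 0)
narrow = enrich_within_window(EVENTS, search_end - timedelta(hours=24), search_end)
wide = enrich_within_window(EVENTS, search_end - timedelta(hours=48), search_end)
```

With the 24-hour window, narrow holds only the termination event and has no user or external_ip; widening to 48 hours recovers both, at the cost of scanning twice as much data.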

My question is this: Is there a way in Splunk to associate the main session details (user, externalIP, internalIP, etc.) with any subsequent event that comes in with the same sessionID? I'm fairly certain at this point that it cannot be done at search time, for the same reason eventstats will not work for me (again, some of the logs with these details could be outside of my search window).

I've looked into summary indexing, data models, and tscollect/tstats (I'm still not exactly clear how these work), and I cannot figure out a way to handle this data.

Bonus!: It would also be very helpful to add a field to every log of a session, identifying the status of the session at that instant. For example, once we have session details identifying the user and externalIP, the status would be set to "Pre-Authentication". Once Splunk sees a log containing an internalIP, this status would change to something such as "Active" or "In-Progress". And this would end with a status of "Terminated Successfully" or "Terminated Unsuccessfully" at the last line identifying the bytes details.
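The status rules above amount to a small state machine per session. Here is a Python sketch of that logic, assuming events arrive in time order and are already parsed into dicts with the hypothetical fields session_id, user, external_ip, internal_ip, and message. The "Unknown" status for events seen before both user and external IP are known is a placeholder of this sketch, not something defined above.

```python
def label_statuses(events):
    """Attach a 'status' field to each event; events must be in time order."""
    state = {}  # session_id -> {"seen": set of detail fields, "status": str}
    labeled = []
    for event in events:
        session = state.setdefault(
            event["session_id"], {"seen": set(), "status": "Unknown"}
        )
        session["seen"].update(
            k for k in ("user", "external_ip", "internal_ip") if k in event
        )
        message = event.get("message", "")
        if message.startswith("Session terminated successfully"):
            session["status"] = "Terminated Successfully"
        elif message.startswith("Session terminated unsuccessfully"):
            session["status"] = "Terminated Unsuccessfully"
        elif "internal_ip" in session["seen"]:
            session["status"] = "Active"
        elif {"user", "external_ip"} <= session["seen"]:
            session["status"] = "Pre-Authentication"
        labeled.append({**event, "status": session["status"]})
    return labeled


# johnsmith's session from the sample logs, hand-parsed for illustration.
sample = [
    {"session_id": "13defc34", "external_ip": "193.18.20.15",
     "message": "Client connected on IP 193.18.20.15"},
    {"session_id": "13defc34", "user": "johnsmith",
     "message": "username 'johnsmith'"},
    {"session_id": "13defc34",
     "message": "Authentication Failed: invalid username/password"},
    {"session_id": "13defc34",
     "message": "Session terminated unsuccessfully: bytes_in=1231 bytes_out=2134"},
]
statuses = [e["status"] for e in label_statuses(sample)]
```

Note that the status depends on everything seen so far for the session, so it runs into the same search-window limitation as the other session details.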

Thanks in advance for any help!


sowings
Splunk Employee

Assuming your field is called "session_id", then you could do:

<search> | transaction session_id

You then end up with a single "event" that is the combination of all of the raw events sharing the same session_id. You can then reference any of the fields from the combination as though they were part of the same event.
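Conceptually, the grouping works roughly like the Python sketch below. This is a simplification for illustration only: the real transaction command has further options and its own handling of repeated field values, and the event dicts and the "raw" key here are assumptions of the sketch.

```python
def transaction(events, key="session_id"):
    """Combine events sharing the same key into one merged record per key."""
    grouped = {}
    for event in events:
        combined = grouped.setdefault(event[key], {key: event[key], "raw": []})
        combined["raw"].append(event["raw"])
        # Simplification: a later value overwrites an earlier one.
        for name, value in event.items():
            if name not in (key, "raw"):
                combined[name] = value
    return list(grouped.values())


events = [
    {"session_id": "13defc34", "external_ip": "193.18.20.15",
     "raw": "Client connected on IP 193.18.20.15"},
    {"session_id": "ac21bf43", "external_ip": "162.74.10.24",
     "raw": "Client connected on IP 162.74.10.24"},
    {"session_id": "13defc34", "user": "johnsmith",
     "raw": "username 'johnsmith'"},
]
combined = transaction(events)
```

Here combined holds two records, and the one for 13defc34 carries both external_ip and user, even though no single raw event had both.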


sowings
Splunk Employee

What is the cardinality of the sessions, i.e. how many distinct sessions are there? You might consider a "growing lookup", where you accumulate the "interesting" fields for a given session in a lookup table and then purge that session's entry when the session is complete. If the number of sessions is not too high, you might be able to use this approach.
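The idea can be sketched in Python as follows. This is not SPL; in Splunk it would presumably be a scheduled search that reads the table with inputlookup and writes it back with outputlookup, and the exact search is not shown here. The field names and the rule that a message starting with "Session terminated" marks the end of a session are assumptions taken from the sample logs.

```python
DETAIL_FIELDS = ("user", "external_ip", "internal_ip")


def update_lookup(lookup, events):
    """Merge new events into the lookup, purge finished sessions,
    and return the events enriched with the details known so far."""
    enriched = []
    for event in events:
        session_id = event["session_id"]
        row = lookup.setdefault(session_id, {})
        # Grow the row with any "interesting" fields this event carries.
        row.update({f: event[f] for f in DETAIL_FIELDS if f in event})
        enriched.append({**row, **event})
        # Purge the row once the session is complete.
        if event.get("message", "").startswith("Session terminated"):
            del lookup[session_id]
    return enriched


lookup = {}  # stands in for the persisted lookup table

# First run: the events from Sep 1.
update_lookup(lookup, [
    {"session_id": "ac21bf43", "external_ip": "162.74.10.24",
     "message": "Client connected on IP 162.74.10.24"},
    {"session_id": "ac21bf43", "user": "billsmith",
     "message": "username 'billsmith'"},
    {"session_id": "ac21bf43", "internal_ip": "10.0.0.100",
     "message": "Authentication Succeeded: Assigned internal IP: 10.0.0.100"},
])
rows_after_first_run = dict(lookup)

# Later run: only the termination event is in the search window.
enriched = update_lookup(lookup, [
    {"session_id": "ac21bf43", "bytes_in": 12213211, "bytes_out": 21323334,
     "message": "Session terminated successfully: bytes_in=12213211 bytes_out=21323334"},
])
```

Because the details are carried in the table rather than recomputed from raw events, the later run sees billsmith's user and IPs even though the search window only covers the termination event. The table holds one row per open session, which is why the number of sessions matters.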


smwilli1
Explorer

Can you expand on how to build a "growing lookup" or point to some material that may be able to instruct me on how to do this?


smwilli1
Explorer

The problem with transaction is the same as the issue I was having with eventstats. Unless my search window is large enough to include events from the beginning of the session, it won't capture certain fields I need to be able to reference (such as user and external_ip, set at session initiation). Since I am searching over a large dataset, I need to limit the window I search in; otherwise the search times will be very slow. Also, in my environment there is no limit to how long a session can be open.


smwilli1
Explorer

I'm thinking this might be accomplished with a lookup table that I write certain fields to as events come in. All I've found on that is outputlookup, and I don't know how to specify certain fields when writing to a table.

Just a thought; I look forward to hearing other opinions.
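On writing only certain fields to a table: in a search, restricting the results with the table or fields command before outputlookup is one way to control which columns are written. The same idea in a Python sketch, where LOOKUP_FIELDS and the sample row are hypothetical:

```python
import csv
import io

# Only these columns are written; anything else on a row is dropped.
LOOKUP_FIELDS = ["session_id", "user", "external_ip"]


def write_lookup(rows, out):
    """Write rows to a CSV lookup, keeping only the chosen fields."""
    writer = csv.DictWriter(
        out,
        fieldnames=LOOKUP_FIELDS,
        extrasaction="ignore",  # silently skip fields not in LOOKUP_FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


buffer = io.StringIO()
write_lookup(
    [{"session_id": "13defc34", "user": "johnsmith",
      "external_ip": "193.18.20.15", "bytes_in": 1231}],
    buffer,
)
csv_text = buffer.getvalue()
```

The bytes_in value is on the input row but not in the written table, since it is not one of the selected fields.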
