Splunk Search

Joining two indexes, why am I not getting a complete set of events with my current search?

daniel_augustyn
Contributor

I've been trying to join two indexes, a Windows Security index and a proxy index, but when I run the search below I get back very few events, or almost nothing. The idea is to build a search that monitors proxy logs for a selected set of users. However, the user names are only in the Windows Security logs, so I need to join the two indexes to be able to search the proxy logs by user name.

index=wineventlog user_name  | join src  [ search index=proxy    | fields _time, category]|table category
0 Karma
1 Solution

maciep
Champion

A Splunk join works a lot like a SQL join. You're essentially combining the results of two searches on some common field between the two data sets. Is that what you're trying to do here? Does the src field from the wineventlog data match the category field from the proxy data? If that's the goal, then the field names need to match:

index=wineventlog user_name  | join src  [ search index=proxy   | rename category as src | fields _time, src]|table src

But that seems overall pointless, since you're not really using any elements of the join in your final results.

Or do you just want to dump all of the wineventlog events for a user into the proxy data for the same timeframe, to try to correlate users with proxy events? If that's what you want, then you can do an append. Or maybe just specify both indexes in your main search?

(index=wineventlog AND user_name) OR index=proxy
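
As a rough sketch of how that combined search could be taken one step further without a join, the events from both indexes can be grouped on the shared field with stats. This assumes that both indexes carry the same src value and that the Windows events expose the account in a field called user; the thread does not confirm either, and user_name is still a placeholder for the actual search term.

```spl
(index=wineventlog user_name) OR index=proxy
| stats values(user) as user values(category) as category by src
```

This groups on src alone and ignores time, so on a network where addresses are reassigned it can attribute proxy categories to the wrong user.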

0 Karma

daniel_augustyn
Contributor

I've been trying to build alerts and dashboards for, let's say, 10 user names. The proxy logs don't contain user names, so I thought I would need to join them with logs that do, the Windows Security logs in this case. But I am not completely sure how to approach this problem.

0 Karma

maciep
Champion

Oh, so you want to find out which users are logged on to which IPs in the Windows event log, and then correlate that with the proxy logs? Do your users move from PC to PC often? How often do IPs change?

It might be possible to run a search on a schedule for authentication events in wineventlog and create a lookup of user to IP, and then use that lookup with the proxy logs to tie each IP back to a user. But it would be important to understand where that logic might fail or be misleading.
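
A minimal sketch of that idea, under several assumptions: the field names user and src, the lookup file name user_by_src.csv, and the user name jdoe are all illustrative, and EventCode=4624 (the Windows successful-logon event) is used as the authentication event. The first search would run on a schedule and write the most recent user seen for each address to a lookup file:

```spl
index=wineventlog EventCode=4624
| stats latest(user) as user by src
| outputlookup user_by_src.csv
```

The second search enriches proxy events from that lookup and then filters by user:

```spl
index=proxy
| lookup user_by_src.csv src OUTPUT user
| search user="jdoe"
```

Because the lookup holds only the latest user per address, any proxy event from before the address changed hands is attributed to the wrong user, which is exactly the kind of failure worth checking for.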

0 Karma

daniel_augustyn
Contributor

DHCP leases are pretty short in our environment, so I think this might not work. VPN users would fail as well, since their IPs change even more often. So joining the two indexes would not work in this case?

0 Karma

maciep
Champion

Well, you have a technical problem right now. I mean, if you were tackling this problem manually, how would you go about it? If you had the event log data on one monitor and the proxy data on another, how would you determine which user to associate a proxy event with? It sounds like there's no way for you to logically make that connection, right?

If you can't, Splunk probably isn't going to help. Joining disparate groups of data together won't solve the problem.

If you can figure out a way to use those 2 data sets together to somehow make that connection, then Splunk might be able to help automate/expand that for you.

daniel_augustyn
Contributor

And to answer your question, if you have user data on one monitor and the proxy logs on the other, you would make connections by looking at the source IP address.

0 Karma

maciep
Champion

Oh ok, I thought you were saying the IP-to-User data was unreliable, but it sounds more like it just may change often.

That might be doable, but it's outside of my ability at this point. I'm not really sure how to put a search together to figure out which user was logged on to an IP address at the time of a corresponding proxy event.

The simpler case of finding the last user for an IP and comparing that to a proxy event might be a bit easier. Maybe not easy, but easier.

Something along these lines (which is really just pseudo code)

index=proxy | table ip src dest action | join ip [search index=wineventlog | stats latest(user) as user by ip]

And if you don't have the IP in the event log data (just a host name), then you can use Splunk's lookup command to resolve the name against your DNS to an IP.

I know that's probably not the result you were after, but it might be the best I can do.

0 Karma

maciep
Champion

Hmm, or maybe just an append might be a good start too?

index=proxy | append [search index=wineventlog] | sort _time | search ip=some_ip

Append just puts the results of the second search after the results of the first. Then you can sort by time so the events are in order and, if you want, filter down to just one IP. You might still have to sift through the results, but at least they'd be on one monitor.

And I feel like with the data together like that, there is probably a better way to transform it into exactly what you want. It's just not coming to me at the moment.
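
One candidate for that kind of transformation, offered only as an untested sketch: pull both indexes into one search, order the events per address by time, and carry the most recent logon user forward onto the proxy events that follow it. The field names user and src and the use of EventCode=4624 as the logon event are assumptions, not something confirmed in this thread.

```spl
(index=wineventlog EventCode=4624) OR index=proxy
| sort 0 src _time
| streamstats last(user) as last_user by src
| search index=proxy
```

The intent is that each proxy event ends up with a last_user field naming whoever most recently logged on from the same address. It only sees logons inside the search time range, so the range needs to start early enough to include the logon that precedes the proxy traffic of interest.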

0 Karma

daniel_augustyn
Contributor

Thank you, I will keep searching for the best solution. But your points might actually help me.

0 Karma

daniel_augustyn
Contributor

I really think it's not that hard to do manually. That's how all investigations are done. You look up a specific user's IP address and then search the proxy logs for that IP address. You also note the timeframe during which the user held that IP address, so that you don't pull in proxy logs for some other user who was on the same IP at a different time. How hard is that?

How can we automate it with Splunk? If I only know the user name, not his/her IP address, how can I search the proxy logs when they contain no user names, only IP addresses?

0 Karma