I'm trying to leverage my indexed DHCPD logs to provide additional information about internal IPs that show up in other events. Specifically, I want to look up the src_mac and src_host fields and add them to the event (for example, a firewall event). This seems pretty easy with an external dynamic lookup, but since I've already indexed the data, I'd like to leverage it.
What I think I want is something like a correlated subquery in SQL (have the subsearch look for the src_ip specific to an event), but it sounds like Splunk search doesn't work that way.
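To make the intent concrete, here is a minimal Python sketch (not Splunk) of the lookup I have in mind: for each event, take the most recent DHCP ACK for the same src_ip at or before the event's _time. The function names build_lease_index and enrich are hypothetical, and events are assumed to be plain dicts with epoch-second _time values.

```python
from bisect import bisect_right
from collections import defaultdict


def build_lease_index(dhcp_acks):
    """Group DHCP ACK records by src_ip, each group sorted by _time (oldest first)."""
    index = defaultdict(list)
    for ack in sorted(dhcp_acks, key=lambda a: a["_time"]):
        index[ack["src_ip"]].append(ack)
    return index


def enrich(event, index):
    """Return a copy of event with src_mac/src_host from the latest earlier ACK, if any."""
    acks = index.get(event["src_ip"], [])
    times = [a["_time"] for a in acks]
    # Position of the first ACK strictly later than the event.
    pos = bisect_right(times, event["_time"])
    if pos == 0:
        return dict(event)  # no ACK at or before this event: leave it unenriched
    ack = acks[pos - 1]
    return {**event, "src_mac": ack["src_mac"], "src_host": ack["src_host"]}
```

The point is that the match depends on both src_ip and time, so a device change on the same IP is picked up as long as the newer ACK precedes the event.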
I've tried a few different methods, but none seem to be quite right.
Join/Subsearch method (This is slow, and hits the subsearch limits, so doesn't seem to be the right way to do it):
sourcetype=someSourcetype | join src_ip usetime=true earlier=true [search eventtype="dhcpd_server_dhcpack" src_ip=* src_mac=* | fields _time src_ip src_mac src_host]
Appended search Transaction Method (Requires the IP being looked up to be specified in both searches, which doesn't work for what I'm trying to do):
sourcetype=someSourceType src_ip=192.168.1.1 | append [search eventtype="dhcpd_server_dhcpack" src_ip=192.168.1.1 | fields src_ip src_mac src_host] | transaction src_ip
Combined Transaction Method (If I don't specify the src_mac, it doesn't detect device changes on the IP. If I do, it doesn't seem to work correctly either):
(sourcetype=someSourceType) OR (eventtype="dhcpd_server_dhcpack") | transaction src_ip src_mac | table src_ip threat_id _time src_mac src_host
Any suggestions?
You want a join with a sub search, or a lookup (if you maintain a lookup table of ip->mac address)
It's interesting. This seems to work for very recent events (within the last 8 hours), but when I go outside of that window, there are no results. I'm guessing it has something to do with the bucketing. Any other ideas?
That seems to work, and reasonably fast! Thanks for the help. I'm working through comparing to my other systems to make sure that everything lines up.
The one tweak is that you have an extra ) after src_hosts.
Do you really have more than 10000 pairs of IP/MAC addresses?
To reduce that, you can bucket the time (per hour or, as in the search below, per day with span=1d) so that the join does not have to resolve every single timestamp.
sourcetype=someSourcetype | eval timerange=_time | bucket timerange span=1d | join src_ip timerange [search eventtype="dhcpd_server_dhcpack" src_ip=* src_mac=* | eval timerange=_time | bucket timerange span=1d | stats values(src_mac) AS list_src_mac first(src_mac) AS src_mac values(src_host) AS list_src_hosts first(src_host) AS src_hosts) by src_ip timerange ]
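As a rough illustration of why this shrinks the subsearch, here is a small Python sketch (not Splunk) of the same idea: the DHCP ACKs are reduced to one row per (src_ip, timerange) before the join. The names bucket, summarize and join_event are hypothetical, buckets are aligned to epoch time for simplicity, and taking the newest ACK as src_mac is my assumption about what first() returns in Splunk's default newest-first order.

```python
DAY = 86400  # bucket width in seconds, mirroring span=1d


def bucket(ts, span=DAY):
    """Round an epoch timestamp down to the start of its bucket."""
    return ts - (ts % span)


def summarize(dhcp_acks, span=DAY):
    """Collapse ACKs to one row per (src_ip, timerange), visiting the newest ACK first."""
    rows = {}
    for ack in sorted(dhcp_acks, key=lambda a: a["_time"], reverse=True):
        key = (ack["src_ip"], bucket(ack["_time"], span))
        # setdefault keeps the values from the first (newest) ACK seen for this key.
        row = rows.setdefault(key, {
            "src_mac": ack["src_mac"],
            "src_hosts": ack["src_host"],
            "list_src_mac": set(),
            "list_src_hosts": set(),
        })
        row["list_src_mac"].add(ack["src_mac"])
        row["list_src_hosts"].add(ack["src_host"])
    return rows


def join_event(event, rows, span=DAY):
    """Attach the summary row that shares the event's src_ip and time bucket, if any."""
    key = (event["src_ip"], bucket(event["_time"], span))
    return {**event, **rows.get(key, {})}
```

In this sketch an event only matches when an ACK for its src_ip falls in the same bucket, so a lease granted in an earlier bucket is not found. A wider span makes that less likely, but a device change inside the bucket then shows up only in list_src_mac.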
The join/subsearch is my first method. The volume of DHCPACK messages is high, so I quickly hit the maximum number of events.