Splunk Search

How to limit primary search based on time interval from subsearch -- dynamic time interval ; time delta

TobiasBoone
Communicator

In order to query an external firewall log that contains, say, "badwebsite.com", join those results back through the internal firewall's NAT translation, and ultimately tie the internal 10.x.x.x address back to DHCP, I need to limit the external search to _time plus or minus ~5 minutes.

In an ideal world, a feature enhancement to the main search / join commands would look like:

search sourcetype=dhcp -timedelta -+5m | join [search sourcetype=ext_fw badwebsite.com]

I have looked at using | eval to generate begin and end variables... but I cannot figure out how to do this elegantly. Help or votes for an enhancement would be greatly appreciated!

hexx
Splunk Employee

You can try the following method:

  • Have your subsearch find the latest event that you want to search around
  • Still in the subsearch, calculate the earliest and latest time boundaries you would like to use for the outer search based on the _time of that event
  • Have the subsearch return earliest and latest to the outer search

Here's an example showing how to retrieve a window of +/- 60s worth of events around the latest splunkd restart:

index=_internal source=*splunkd.log [
  search index=_internal "Splunkd starting"
  | head 1
  | eval earliest = _time - 60
  | eval latest = _time + 60
  | return earliest latest]
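For clarity on why this works: `return earliest latest` renders the subsearch results as field=value pairs, and `earliest`/`latest` are recognized time modifiers in the outer search. The outer search therefore effectively expands to something like this (epoch values illustrative only):

index=_internal source=*splunkd.log earliest=1355946305 latest=1355946425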

hexx
Splunk Employee

Another possibility might be to implement this as a view workflow:

  • A first panel would be driven by a search listing all anomalous events and present them as a table, one anomaly per row.
  • An in-view drilldown is available when the user clicks on one of the anomalous results, populating a secondary panel showing the events surrounding the anomaly.

We use such a workflow in the "Crashes" view of the S.o.S app, if you'd like to see an example.

TobiasBoone
Communicator

This is highly unfortunate. This type of iterative subsearch is necessary on so many levels for us, but writing Python isn't in the cards at the moment.

RIAA/MPAA complaints
Tracking down botnets
Usage statistics tied back to the actual internal user
Appropriate use investigations
Wireless Access Point Utilization
802.1x supplicant tracking by errored machines

All of these things require finding a set of results and then correlating each of them to their respective set of sub data on or about that moment in time.

Please let me know


hexx
Splunk Employee

The outer search will only take one value each for earliest and latest, so to fulfill your request, a different approach will be necessary. I'm not sure that is feasible with the search operators that are built in today. You may need to write your own Python search command to iterate over the results of the subsearch and restrict the events returned by the outer search to the corresponding pockets of time.
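The iteration such a custom command would perform can be sketched in plain Python. This is only an illustration of the logic (not Splunk's custom search command API): given anchor timestamps from the "subsearch", keep only the outer events that fall within a window around any anchor. All names here are hypothetical.

```python
def events_in_windows(event_times, anchor_times, delta_seconds=300):
    """Return the subset of event_times (epoch seconds) that fall
    within +/- delta_seconds of any anchor in anchor_times."""
    kept = []
    for e in event_times:
        # Keep the event if it lands inside any anchor's time pocket.
        if any(abs(e - a) <= delta_seconds for a in anchor_times):
            kept.append(e)
    return kept
```

For example, with a single anchor at t=50 and a 300-second window, events at t=0 and t=100 are kept while an event at t=1000 is dropped.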


TobiasBoone
Communicator

This example specifically aims to see who hit a site over the past 30 days, then, based on that time interval +/- 10 minutes, determine who held the IP lease in DHCP. Slightly modified, this example works great for NAT/PAT translations, because the permutations of IP and port don't overlap that much. On non-NATted networks, however, with 5-minute DHCP lease times and 22 thousand mobile devices, the same person rarely keeps the same IP for more than a few minutes. The time delta is a crucial element of the join.


TobiasBoone
Communicator

I see where your thought process is going; perhaps my question wasn't specific enough. Correct me if I'm wrong, but this looks like it would return results based on one time interval. What I'm looking to do is recurse through multiple items, as I would through piping, and get multiple time intervals returned, i.e.:

index=main sourcetype=dhcp | join dhcp_ip [
  search index=ext_fw badsite.com earliest=-30d
  | rename ext_fw_url_cip as dhcp_ip
  | head 1
  | eval earliest = _time - 600
  | eval latest = _time + 600
  | return earliest latest]

This is rather backwards logic because earliest and latest aren't being kicked back to the outside search in iterations. Am I missing something blatant?

I am so close to a working solution I just about can't stand it, but close only counts in horseshoes and hand grenades.
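For what it's worth, one built-in command worth trying for this per-result iteration is `map`, which runs its search template once per input row, substituting each row's field values for the `$field$` tokens. A hedged sketch, reusing the field names from the example above (the `maxsearches` value and field names are assumptions, and `map` can be expensive at scale):

index=ext_fw badsite.com earliest=-30d
| rename ext_fw_url_cip as dhcp_ip
| eval earliest = _time - 600
| eval latest = _time + 600
| map maxsearches=1000 search="search index=main sourcetype=dhcp earliest=$earliest$ latest=$latest$ dhcp_ip=$dhcp_ip$"

Each external-firewall hit then drives its own time-bounded DHCP lookup, which is the multiple-interval behavior the join/subsearch approach can't express.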
