How to efficiently query all indexes for a list of IPs

asearson
Explorer

BACKGROUND: My Disaster Recovery team is compiling a list of all IP endpoints and has asked me to query all of my Splunk events (in all indexes) for anything resembling an IP. I created the following search, which works in my smaller Staging Splunk Enterprise environment but times out when I run it in my larger Production Splunk Enterprise environment:

index="*" earliest=-1d@d latest=-0d@d
| rex field=_raw "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats values(ip)

As a workaround to avoid the timeout, I've split the Production search into multiple searches of each Index.

QUESTIONS:

  1. Is there a more efficient way to get the IPs my DR team wants?
  2. Is there an efficient way to join the results of the multiple index searches in Prod?

bowesmana
SplunkTrust

I'm assuming the regex is fine, as you seem happy with it. As for efficiency: if this is a one-off operation, does efficiency matter?

Your query searches only yesterday. Is the intention to search further back than that? Could you just run a backfill operation and let Splunk handle the scheduling?

If you're looking for a general solution, you could output each production index search to a CSV (outputlookup append=t) and then, after running all the searches, just inputlookup the CSV and run stats count on the data.
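
As a rough sketch of that approach, each per-index search could append its results to one shared lookup file. The index name "web" and the file name "dr_ip_inventory.csv" are placeholders rather than names from this thread, and the rex is the one from the original question:

```spl
index=web earliest=-1d@d latest=@d
| rex field=_raw "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats count by ip
| outputlookup append=true dr_ip_inventory.csv
```

Once every index has been processed, a final search can read the file back and combine the per-index counts:

```spl
| inputlookup dr_ip_inventory.csv
| stats sum(count) AS count by ip
```

Because append=true keeps adding rows, the file would need to be cleared or renamed before a rerun, otherwise the counts are doubled.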


gcusello
SplunkTrust

Hi asearson,
I can't check your regex because you didn't share a sample event, so I'll assume it's correct.
Anyway, to list all the IPs you should use the dedup and table commands:

index="*" earliest=-1d@d latest=-0d@d
| rex "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| dedup ip
| sort ip
| table ip

I have only one doubt: you want all the IPs from all indexes, but different sourcetypes usually have different log formats, so how do you plan to extract IPs from all sourcetypes with one regex?

Maybe you could use a different approach:
for servers, you could use the dnslookup external lookup to resolve IPs from DNS by passing in the hostnames, like this:

index=_internal
| dedup host
| lookup dnslookup clienthost AS host OUTPUT clientip
| sort host
| table host clientip

For appliances that send standard syslog, you can extract the IP with an appropriate regex because it's always in the same position.
Appliances that don't send standard syslog usually have the IP in the hostname.

Ciao.
Giuseppe

asearson
Explorer

Thanks for the reply, but not exactly the answer I'm looking for...

CLARIFICATION OF MY PROBLEM STATEMENT:
I need to capture every IP found in all logs, regardless of index/host/source/sourcetype. A single weblog from a busy webserver could yield thousands of IPs, one for each unique client requesting a popular webpage. I'm not concerned about hostnames.

CLARIFICATIONS TO YOUR QUESTIONS:
Example: any address between 0.0.0.0 and 255.255.255.255.
Regex taken from www.regular-expressions.info/ip.html and verified with regex101.com

The idea for "rex field=_raw" is taken from this:
https://answers.splunk.com/answers/656616/how-to-extract-ip-address-using-regex.html
It applies to every raw event, regardless of sourcetype or log format.
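
One caveat worth checking: by default rex keeps only the first match in each event (max_match=1), so an event containing several addresses would contribute just one of them. Below is a sketch of the same search with max_match=0, which asks rex to return every match as a multivalue field. It assumes that all addresses in an event are wanted:

```spl
index="*" earliest=-1d@d latest=@d
| rex field=_raw max_match=0 "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| stats values(ip) AS ip
```

stats values() still returns each distinct address once, so the shape of the output is unchanged. The likely trade-off is more regex work per event.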

TESTING:
I tested your pipeline "| dedup ip | sort ip | table ip", and the Job Inspector shows that it actually takes longer than the single "| stats values(ip)" pipe. They yield the same results, with a slightly different sort order (string rather than integer).


bowesmana
SplunkTrust

Sorting is a bad idea: 'sort' without '0' will truncate at the sort limit (default 10000).
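
If a sorted list is still wanted, a sketch of the earlier dedup pipeline with the limit lifted might look like this. The ip() wrapper is optional and asks sort to compare the values as IP addresses rather than as strings:

```spl
index="*" earliest=-1d@d latest=@d
| rex "(?<ip>\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b)"
| dedup ip
| sort 0 ip(ip)
| table ip
```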
