Splunk Search

How to match a list of URL strings from a CSV file against indexed data if there is no extracted URL field in my events?

ejwade
Contributor

I am trying to match a long list (2,000 records) of malicious URL strings (e.g., hereisavirus.com), stored in a CSV file, against my events. One caveat: I do not have a URL field in my events, so I cannot use inputlookup and match directly against an extracted field.

Is there a simple way to search the whole event in Splunk using a CSV file?

Thank you.


sundareshr
Legend

You could extract the URL into a field and then use lookup (or inputlookup) to compare. Here is a very generic way to extract the URL into a field:

your base search | rex field=_raw "(?<URL>https?:\/\/(?:www\.|(?!www))[^\s\.]+\.[^\s]{2,}|www\.[^\s]+\.[^\s]{2,})" | lookup viruslist.csv URL AS URL OUTPUT someotherfield

This is not guaranteed to catch all URL patterns. I would need to see sample events to improve the probability of a match.
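
If extracting a field is not practical, another option is to let a subsearch turn the CSV values into raw search terms, so each string is matched anywhere in the event. The sketch below is only an illustration: it assumes the CSV is named viruslist.csv and has a column named URL, as in the search above. It relies on the convention that a subsearch field named search is returned as bare terms rather than as field=value pairs.

```
your base search
    [| inputlookup viruslist.csv
     | fields URL
     | rename URL AS search
     | format]
```

The subsearch expands to one large OR expression over the listed strings. With around 2,000 entries this can be slow and is subject to the subsearch result and runtime limits, so test it on a narrow time range first.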


ejwade
Contributor

Thank you, sundareshr.

So, I created a custom field extraction using the wizard:

^[^/\n]*/\d+\s+\d+\s+\w+\s+(?P<destination_url>[^ ]+)

When I run my base search, the field shows up.

I can also list my lookup table with the following command:

| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url

However, when I put them together using this search string:

base search | [| inputlookup CCIC_URL.csv | rename Bad_URLs as destination_url | fields + destination_url] | table _time, destination_url

I get the following error:

Regex: invalid UTF-8 string

The search job has failed due to an error.

Any thoughts on this issue?


ejwade
Contributor

Never mind, I figured it out. My data had characters that weren't translating correctly when inputlookup matched them as literals.
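
For anyone landing here later: a subsearch used as a filter is normally placed directly after the base search terms, with no pipe before the opening bracket. This is a rough sketch using the lookup file and field names from this thread (CCIC_URL.csv, Bad_URLs, destination_url). It assumes the extracted destination_url values match the CSV entries exactly and that the CSV is saved as valid UTF-8.

```
base search
    [| inputlookup CCIC_URL.csv
     | rename Bad_URLs AS destination_url
     | fields destination_url]
| table _time, destination_url
```

The subsearch runs first and expands to destination_url="..." OR destination_url="..." terms, so only events whose extracted field equals one of the listed URLs are returned.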
