Splunk Search

How do I do lookups based on many forms of regex/event?

jblaine
Explorer

I'm having no success making sense of lookups. Some work, some don't, and I can't figure out why. Take an obvious example: sshd writes syslog lines in all sorts of formats, each of which includes the username. I want to extract the username field from those various forms, then look that username up in my external CSV file. I know how to get that working in basic form, and have done it for one form of sshd syslog line.

Specifically, we have sshd events like:

user usernameHere authenticated as blahblah
session opened for usernameHere
session closed for usernameHere
Accepted someAuthMethod for usernameHere

All of those (defined as field extractions) need to trigger a lookup of usernameHere in the CSV file, which is already defined in transforms.conf as 'employee'.

The following does not work completely (only the "authenticated as" part looks up):

[syslog]
LOOKUP-username1 = employee uid AS Username1
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
LOOKUP-username2 = employee uid AS Username2
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as

Nor does this ordering (a shot in the dark):

[syslog]
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as
LOOKUP-username2 = employee uid AS Username2
LOOKUP-username1 = employee uid AS Username1

If I remove the functioning "authenticated as" LOOKUP and EXTRACT, then the other one starts working.

I have also tried the following, fixing the case of my LOOKUP classes:

[syslog]
EXTRACT-Username1 = (?i) for (?P<Username1>[^ ]+)
LOOKUP-Username1 = employee uid AS Username1
EXTRACT-Username2 = (?i) (?P<Username2>[^ ]+) authenticated as
LOOKUP-Username2 = employee uid AS Username2

So clearly I am not understanding the relationship between the field extraction and the lookup.

Really what I want is:

my_sshd_extraction1 to store username
my_sshd_extraction2 to store username
my_sshd_extraction3 to store username
my_sshd_extraction4 to store username
lookup username for any of those!

Any help would be greatly appreciated.


lguinn2
Legend

My suggestion: edit props.conf and change Username1, Username2, etc. to just Username, like this:

[syslog]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username

Note that the extraction identifiers (U1, U2) must be unique, but the field name itself can be the same. This is good, because it really is the same field; it just appears in different places in different events.

Now you only need one lookup, on the Username field. Note that I also renamed the lookup to LOOKUP-username, although the lookup identifier really doesn't matter.

I think this solution will make searches and reporting easier overall, as well as simplifying your lookups.
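
For reference, here is a minimal sketch of how the pieces fit together. The CSV filename (employee.csv) and the output field (fullname) are assumptions for illustration; only the lookup name employee and the uid column come from the question.

```ini
# transforms.conf
[employee]
# Assumed filename; the CSV needs a uid column to match against.
filename = employee.csv

# props.conf
[syslog]
# Both extractions write to the same field, Username.
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
# One lookup: match the CSV's uid column against Username.
# fullname is a hypothetical CSV column; drop the OUTPUT clause
# to return every non-match column from the CSV instead.
LOOKUP-username = employee uid AS Username OUTPUT fullname
```

The " for " pattern should cover the "session opened for", "session closed for", and "Accepted ... for" lines, and the second pattern covers "user ... authenticated as", so all four sample events from the question end up populating Username.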

lguinn2
Legend

Stanza names in props.conf aren't normal regexes. Here are the rules:

When setting a [<spec>] stanza, you can use the following regex-type syntax:

... recurses through directories
* matches anything but / 0 or more times
| is equivalent to 'or'
( ) are used to limit scope of |

So [syslog|linux_secure] should work. This is either a bug in the code, or an error in the documentation.
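
As an illustration of that syntax, here is a sketch of a source-based stanza. To my knowledge this pattern syntax is documented for source:: stanzas; the paths below are made up, so substitute wherever your sshd logs actually land.

```ini
# props.conf
# Hypothetical paths: matches any source ending in
# /var/log/messages or /var/log/secure.
[source::.../var/log/(messages|secure)]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username
```

Note that this keys the settings on the source path rather than on the sourcetype, so it is an alternative to try, not a drop-in equivalent of [syslog|linux_secure].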

Question: where do you set the sourcetypes of syslog and linux_secure? In inputs.conf? If it's in props.conf, you need to look at the priority and ordering of stanzas in props.conf.


jblaine
Explorer

PS: Comment formatting controls here at Answers are greatly needed.


jblaine
Explorer

Awesome. That works. Now the problem is that [syslog|linux_secure] isn't working. If I break my stuff out (duplicate the field extraction definitions) into [syslog] and also [linux_secure], they all work. Combined with an 'or' pipe, they don't.
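
For anyone landing here later, the broken-out version that does work looks roughly like this. It is a sketch that reuses the extractions from the accepted answer, with the same settings repeated once per sourcetype:

```ini
# props.conf
[syslog]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username

# Same settings, duplicated for the second sourcetype.
[linux_secure]
EXTRACT-U1 = (?i) for (?P<Username>[^ ]+)
EXTRACT-U2 = (?i) (?P<Username>[^ ]+) authenticated as
LOOKUP-username = employee uid AS Username
```

The duplication is not pretty, but each stanza name is a plain sourcetype, so it does not depend on how the | syntax is interpreted.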
