Splunk Search

Regex a field into more fields

packet_hunter
Contributor

For some reason the built-in field extractor is not working for me, and I am unable to successfully create a .conf stanza to parse out some needed fields from ADFS logs. So I have an extracted field called Message that contains all the information needed to create the new fields.

Sample events are:

The following user account has been locked out due to too many bad password attempts. Additional Data Activity ID: 00000000-0000-0000-0000-000000000000 User: someone@ibm.com Client IP: 129.42.38.7,192.168.2.13 nBad Password Count: 6 nLast Bad Password Attempt: 1/8/2017 

The following user account has been locked out due to too many bad password attempts. Additional Data Activity ID: 00000000-0000-0000-0000-000000000000 User: ibm-9\1234 Client IP: 192.168.2.13 nBad Password Count: 6 nLast Bad Password Attempt: 1/9/2017 

The two events are similar except for User value and Client IP

What I would like to do is rex out all the information into

Msg = The following user account has been locked out due to too many bad password attempts.
Activity_ID= 00000000-0000-0000-0000-000000000000
Employee= someone
OR
Employee= 1234
Client_IP= 129.42.38.7,192.168.2.13
OR
Client_IP=192.168.2.13
Bad_Password_Count = 6
Last_Bad_Password = 1/8/2017

Here is my initial query

index=wineventlog sourcetype="WinEventLog:Security"  EventCode=516 | rex field=Message "(?<Employee>.+)@" | rex field=Message "(?<Msg>.+)." |table  Msg Employee _time

As you can see, I am using an already extracted field to get Msg and Employee. I just need a regex ninja to show me how to slice this up.

Thank you

BTW why do expressions in regex101 editor not work in the search app (and vice versa)?? Is there a tutorial on the differences?

1 Solution

DalJeanis
SplunkTrust
| rex "^(?<Msg>.+?)\s+Additional Data" 
| rex "Activity ID:\s+(?<Activity_ID>[-0-9]+)\s" 
| rex "User:\s+(?<Employee>.+?)\s+Client IP:\s+(?<Client_IP>[\.0-9,\s]+?)\s+nBad"
| rex field=Employee "^(?<Employee>.+)@" 
| rex field=Employee "\\\\(?<Employee>.+)$" 
| rex "Bad Password Count:\s+(?<Bad_Password_Count>\d+)" 
| rex "Last Bad Password Attempt:\s+(?<Last_Bad_Password>[0-9/]+)" 

To answer your question about Splunk vs. regex101: it takes a while to get used to what needs escaping. In general, regex101 does not require you to escape everything that you need to escape in Splunk.

So, as you can see above, I don't try to do everything in one pass, I break the whole message up into reasonable chunks. That is because if any one part of a regex fails it all fails, so I'd rather keep it local.
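
Here is a rough sketch of that point, assuming the second sample event is pasted into a Message field with makeresults. The field names one_pass_user, one_pass_ip and chunked_ip are made up for the illustration. The first rex tries to grab the user and the IP in one pattern and insists on an @ after the user name. This event has no @, so the whole pattern should fail and neither one_pass field gets extracted. The second rex only looks at the Client IP chunk and should still return 192.168.2.13.

```spl
| makeresults
| eval Message="The following user account has been locked out due to too many bad password attempts. Additional Data Activity ID: 00000000-0000-0000-0000-000000000000 User: ibm-9\\1234 Client IP: 192.168.2.13 nBad Password Count: 6 nLast Bad Password Attempt: 1/9/2017"
| rex field=Message "User:\s+(?<one_pass_user>[^@\s]+)@\S+\s+Client IP:\s+(?<one_pass_ip>[\.0-9,]+)"
| rex field=Message "Client IP:\s+(?<chunked_ip>[\.0-9,]+)"
| table one_pass_user one_pass_ip chunked_ip
```

The backslash in the user name is doubled only because it sits inside an eval string literal. Nothing in this sketch depends on it.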

I don't assume that there will always be only one space after the colon in the data, so that's why I have \s+ in various spots.

When pulling a chunk of data, if I know the data type well enough to make a list of what are valid characters, then I will do so, so that the regular expression can slurp them up and stop when it gets to the invalid ones. For example, Client_IP should consist of 0-9, period, comma, and maybe an occasional space if it came in with a space after the comma. I put a question mark after the plus so that it will be lazy; if the regex encounters a space that isn't part of the IP section, then the space will be left to the chunk after it.
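
To see what the lazy question mark buys you, here is a small sketch. The Message value is a shortened, invented variant of the sample with a space after the comma and three spaces before nBad, and the field names ip_lazy and ip_greedy are only for illustration. Both patterns use the same character class; the only difference is the ? after the plus.

```spl
| makeresults
| eval Message="Client IP: 129.42.38.7, 192.168.2.13   nBad Password Count: 6"
| rex field=Message "Client IP:\s+(?<ip_lazy>[\.0-9,\s]+?)\s+nBad"
| rex field=Message "Client IP:\s+(?<ip_greedy>[\.0-9,\s]+)\s+nBad"
| eval lazy_len=len(ip_lazy), greedy_len=len(ip_greedy)
| table ip_lazy lazy_len ip_greedy greedy_len
```

The lazy version should stop right after the last digit and leave all three trailing spaces to the \s+ that follows. The greedy version grabs the spaces too and only gives back the one that \s+ needs, so greedy_len should come out two characters larger than lazy_len. With a single space before nBad, as in the posted samples, both versions should return the same value.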

packet_hunter
Contributor

Thank you for the responses. I appreciate your explanation of regex101 and the rex examples.
Just fyi, I had to rework some of the rex expressions but your examples helped me trigger some memories.
Here is what I finally came up with if anyone is interested.

index=wineventlog sourcetype="WinEventLog:Security" EventCode=516
| rex field=Message "(?<employee>.+)@"
| rex field=Message "\\\\(?<employee>.+)"
| rex field=Message "^(?<Msg>.+)"
| rex field=Message "Activity ID:\s+(?<Activity_ID>[-0-9]+)\s"
| rex field=Message "Bad Password Count:\s+(?<Bad_Pswd_Count>\d+)"
| rex field=Message "Last Bad Password Attempt:\s+(?<Last_Bad_Pswd>[0-9\\\\].+)"
| rex field=Message "Client IP:\s+(?<Client_IP>[\.0-9,\s]+?)\s+nBad"
| table employee Msg Activity_ID Bad_Pswd_Count Last_Bad_Pswd Client_IP

DalJeanis
SplunkTrust

Your <employee> lines both presume that there will only ever be an @ or a \ in that field, never anywhere else. Is that a valid assumption? Also, no, those won't work. It looks like the @ version will end up reading back to the beginning, since a period will match all the characters, and the \ version will read to the end for the same reason.

Try these:

 | rex field=Message "User:\s+(?<employee>[^@]+)@" 
 | rex field=Message "User:[^\\]*\\\\(?<employee>\S+)"

Your <msg> line will eat up the entire message until a "carriage return" or end of file. Okay?
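
If you want to check the greedy behavior yourself, here is a minimal sketch. It assumes the first sample event is a single line, exactly as posted, and the field names employee_greedy and employee_anchored are only for illustration.

```spl
| makeresults
| eval Message="The following user account has been locked out due to too many bad password attempts. Additional Data Activity ID: 00000000-0000-0000-0000-000000000000 User: someone@ibm.com Client IP: 129.42.38.7,192.168.2.13 nBad Password Count: 6 nLast Bad Password Attempt: 1/8/2017"
| rex field=Message "(?<employee_greedy>.+)@"
| rex field=Message "User:\s+(?<employee_anchored>[^@]+)@"
| table employee_greedy employee_anchored
```

On this one-line value, employee_greedy should contain everything from "The following user account" up to "someone", while employee_anchored should contain only "someone". If the real Message field has line breaks where the sample shows spaces, .+ would stop at each line break, which might make the shorter pattern look fine on live data.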


packet_hunter
Contributor

The employee identifier will only be either User: someone@company.com or User: company-9\1234 and I am only concerned with "someone" or "1234" respectively.

I am not sure about the format of your rex expressions; perhaps you wrote them in the free regex101 editor. But mine do work in the Search App.

Thank you
