Splunk Search

Rex question to extract a field

lehrfeld
Path Finder

I have been playing with the rex command for awhile now and I am stuck. I have a csv source that I need to extract a 1-3 digit number from. The trouble is that depending upon the line in the csv, it is either in the 8th or 9th column in the csv. Sample csv line:

2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,ABC,"CENTRAL, MAINE",x123456,204,PC,Windows,more,text

OR

8/28/2013 15:04,100.200.250.27,,username@email.com,ABC,DEF,x987654,23,,,IE 8,Mozilla,more

I would like to extract the 204 from the first line and 23 from the second. The trouble is the extra comma in the location in some of the lines (,"CENTRAL, MAINE",).

My base rex is | rex field=_raw ",{8,9}.{1,3} (?<AAA>.*)"

What I am thinking I am doing is searching for 8 or 9 commas ,{8,9} then when it finds it, select the next 1 to 3 characters .{1,3} then put all that in an extraction called AAA (?<AAA>.*)

I think I am just off, but any tips would be great!

Thanks, Mike

Tags (2)
0 Karma
1 Solution

curtisb1024
Path Finder

To make this work correctly in all situations, you need to have the regex handle when CSV fields are enclosed in quotes:

| rex "^(?:(?:\"(?:\"\"|.)*?\",)|(?:[^,]*),){7}(?P<AAA>[^,]+)"

If a field starts with a double quote, everything up to the next ", will be captured as part of that field.

This regex works for all the following examples:

2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,ABC,"CENTRAL, M"AINE" ",x123456,204,PC,"Windows",more,text
2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,ABC,"CENTRAL, MAINE",x123456,204,PC,"Windows",more,text
2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,"a"",",asdf,x123456,204,PC,"Windows",more,text
8/28/2013 15:04,100.200.250.27,,username@email.com,"",DEF,x987654,23,PC,,IE 8,Mozilla,more
8/28/2013 15:04,100.200.250.27,,username@email.com,ABC,DEF,x987654,23,PC,,IE 8,Mozilla,more
8/28/2013 15:04,100.200.250.27,,username@email.com,ABC,DEF,x987654,23,,,IE 8,Mozilla,more

Edit: Updated to handle double quotes inside quotes followed by a comma.

View solution in original post

curtisb1024
Path Finder

To make this work correctly in all situations, you need to have the regex handle when CSV fields are enclosed in quotes:

| rex "^(?:(?:\"(?:\"\"|.)*?\",)|(?:[^,]*),){7}(?P<AAA>[^,]+)"

If a field starts with a double quote, everything up to the next ", will be captured as part of that field.

This regex works for all the following examples:

2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,ABC,"CENTRAL, M"AINE" ",x123456,204,PC,"Windows",more,text
2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,ABC,"CENTRAL, MAINE",x123456,204,PC,"Windows",more,text
2014/04/28 07:23:41 AM EDT,100.200.250.27,email link click,username@email.com,"a"",",asdf,x123456,204,PC,"Windows",more,text
8/28/2013 15:04,100.200.250.27,,username@email.com,"",DEF,x987654,23,PC,,IE 8,Mozilla,more
8/28/2013 15:04,100.200.250.27,,username@email.com,ABC,DEF,x987654,23,PC,,IE 8,Mozilla,more
8/28/2013 15:04,100.200.250.27,,username@email.com,ABC,DEF,x987654,23,,,IE 8,Mozilla,more

Edit: Updated to handle double quotes inside quotes followed by a comma.

lehrfeld
Path Finder

Great call with the double quote - thanks!

0 Karma

somesoni2
SplunkTrust
SplunkTrust

This works fine with your example logs.

... | rex "(?i)^(?:[^,]*,){7,8}(?P<AAA>[^,]+)"

curtisb1024
Path Finder

This won't work for OPs second example if the field after the 23 contains a value.

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...