Splunk Search

How to do regex based on commas

JoshuaJohn
Contributor

I have data like this:

21,enrollmentgroup,19936,40:G6:7Q:G6:89:FG,,nitro - Circle.one10,Phone,11.1.11313,C,10/25/18 4:58,Enroll,enroll@ms.com,Movies & TV,microsoft,Public,YES,,2019.11111.10112.0,0,0,0,Installed,App installed by user,10/24/18 14:50,10/25/18 4:00

21,enrollmentgroup,19935,20:11:32:11:61:71,,nitro - Circle.one10,Phone,10.0.14393,C,10/25/18 4:58,Enroll,enroll@ms.com,PC Manager,c10f24324242432-43242,Internal,YES,11.5.1.11119,18.10.3.1009,0,0,0,Pending Install,Installed,10/24/18 14:49,10/24/18 22:09

443,S-0001,11222,01:11:7F:E1:D1:71,,nitro - loop.one12,Phone,10.0.14393,C,9/3/18 12:23,Enroll,enroll@ms.com, Mapping Better,631d-2e321-4312-b511f-83f2,Internal,YES,10.0.0.623,20.0.0.5921,0,0,0,Pending Install,Unkown,8/31/18 9:33,10/2/18 0:20

I wrote this but it appears to be too much for Splunk

rex field=_raw "(?<ig1>.+?)([,])(?<loc>.+?)([,])(?<ig2>.+?)([,])(?<MacAddress>.+?)([,])(?<modelType>.+?)([,])(?<ig3>.+?)([,])(?<OS>.+?)([,])(?<ig4>.+?)([,])(?<lastSeen>.+?)([,])(?<user>.+?)([,])(?<useremail>.+?)([,])(?<Application>.+?)([,])(?<ig5>.+?)([,])(?<deployed>.+?)([,])(?<public>.+?)([,])(?<assignedVersion>.+?)([,])(?<installedVersion>.+?)([,])(?<ig6>.+?)([,])(?<ig7>.+?)([,])(?<ig8>.+?)([,])(?<status>.+?)([,])(?<installedby>.+?)([,])(?<appFirstSeen>.+?)([,])(?<appUpdatedTime>.+?)([,]|$)"

Any suggestions? I cannot edit profs etc. I must do this via regex within the query.
Thanks,

0 Karma
1 Solution

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

View solution in original post

jimmoriarty
Path Finder
| rex field=_raw "^(?<f1>[^,]+)?,(?<loc>[^,]+)?,(?<f3>[^,]+)?,(?<MacAddress>[^,]+)?,(?<modelType>[^,]+)?,(?<OS>[^,]+)?,(?<f7>[^,]+)?,(?<f8>[^,]+)?,(?<f9>[^,]+)?,(?<lastSeen>[^,]+)?,(?<user>[^,]+)?,(?<useremail>[^,]+)?,(?<Application>[^,]+)?,(?<f14>[^,]+)?,(?<deployed>[^,]+)?,(?<public>[^,]+)?,(?<assignedVersion>[^,]+)?,(?<installedVersion>[^,]+)?,(?<f19>[^,]+)?,(?<f20>[^,]+)?,(?<f21>[^,]+)?,(?<status>[^,]+)?,(?<installedby>[^,]+)?,(?<appFirstSeen>[^,]+)?,(?<appUpdatedTime>[^,]+)?"

This time with everything included...

JoshuaJohn
Contributor

Awesome thank you!

0 Karma

ddrillic
Ultra Champion

It seems like a perfect csv format, except the missing header...

0 Karma

jimmoriarty
Path Finder

Try this: | rex field=_raw "^(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?,(?[^,]+)?"

0 Karma

JoshuaJohn
Contributor

I think this improved it slightly again, still getting the error though.

,(?<loc>.+?),.+?,(?<MacAddress>.+?),,(?<ModelType>.+?),.+?,(?<OS>.+?),.+?,(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),.+?,(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),.+?,.+?,.+?,(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$
0 Karma

FrankVl
Ultra Champion

All those .+? are very unnecessarily complex. It matches everything and the regex processor constantly needs to check if perhaps it should stop capturing and continue reading the comma and proceed with the next capturing group.

As @jimmoriarty has already mentioned in his answer: replace the .+? by [^,]+ (so matching all characters except comma), which makes it much, much more efficient.

JoshuaJohn
Contributor

Made some changes to try and lower the greediness of my search

^(.+?),(?<loc>.+?),(.+?),(?<MacAddress>.+?),,(?<ModelType>.+?),(.+?),(?<OS>.+?),(.+?),(?<LastSeen>.+?),(?<user>.+?),(?<useremail>.+?),(?<Application>.+?),(.+?),(?<deployed>.+?),(?<public>.+?),(?<assignedVersion>.+?),(?<installedVersion>.+?),(.+?),(.+?),(.+?),(?<status>.+?),(?<installedby>.+?),(?<appFirstSeen>.+?),(?<appUpdatedTime>.+?)$

Still getting the error: has exceeded configured match_limit

0 Karma
Get Updates on the Splunk Community!

Detecting Remote Code Executions With the Splunk Threat Research Team

REGISTER NOWRemote code execution (RCE) vulnerabilities pose a significant risk to organizations. If ...

Observability | Use Synthetic Monitoring for Website Metadata Verification

If you are on Splunk Observability Cloud, you may already have Synthetic Monitoringin your observability ...

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk

Tuesday, May 14, 2024  |  11AM PT / 2PM ET Register to Attend Join us for this Tech Talk and learn how to ...