I'm an absolute Regex idiot. I'm sure this is easy if you know what you're doing.
I have an IIS log file, which is whitespace-delimited. The 4th column contains the cs_uri_stem, i.e. the asset that the user requested. I'd like to filter out a few file extensions. Here's an example log:
Fields: date time cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status time-taken
2017-07-25 16:01:09 GET /uploads/images/A3_2016_thumb.jpg - 80 - 172.17.73.1 Mozilla/4.0+(compatible;+MSIE+7.0;+Windows+NT+6.1;+WOW64;+Trident/7.0;+SLCC2;+.NET+CLR+2.0.50727;+.NET+CLR+3.5.30729;+.NET+CLR+3.0.30729;+.NET4.0C;+.NET4.0E;+InfoPath.3) http://www.google.com 200 0 0 31
I've followed this example (https://scriptsahoy.wordpress.com/2012/03/08/splunk-fo-sharepoint/) and think I've got it all working up to the regex bit (I can null content using an "easier" regex that I understand 🙂):
[iis-level-null]
REGEX = ^(?:[^\s+]+\s+){4}(?=.jpg|.gif|.jpeg)
FORMAT = nullQueue
DEST_KEY = queue
So the regex should find that 4th column and check whether it has a file extension of .jpg, .gif, .jpeg, etc.
Thanks in advance!
Hi O2Anthony,
If you don't want to index events where cs-uri-stem has the extension jpg, gif, or jpeg, you could use this regex in your transforms.conf:
^\d+-\d+-\d+\s+\d+:\d+:\d+\s\w+\s[^\.]*\.(jpg|gif|jpeg)
You can test it at https://regex101.com/r/LwECQW/1
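If you'd rather check it locally, here is a minimal Python sketch that runs the regex from the question and the one above against the sample event. Some assumptions: the User-Agent in the sample is shortened for readability, `page_line` is a made-up non-image request, and `column_re` is a hypothetical stricter variant (not from this thread) that pins the match to the 4th column. Python's `re` is not the same engine as Splunk's PCRE, but these patterns only use syntax the two share.

```python
import re

# Sample event from the question, with the User-Agent shortened (assumption).
image_line = (
    "2017-07-25 16:01:09 GET /uploads/images/A3_2016_thumb.jpg - 80 - "
    "172.17.73.1 Mozilla/4.0+(compatible;+MSIE+7.0) http://www.google.com 200 0 0 31"
)
# Hypothetical non-image request, made up for comparison.
page_line = (
    "2017-07-25 16:01:10 GET /default.aspx - 80 - "
    "172.17.73.1 Mozilla/4.0+(compatible;+MSIE+7.0) http://www.google.com 200 0 0 15"
)

# Regex from the question: {4} consumes the first four columns, so the
# lookahead is tested at the start of the 5th column and never sees the URI.
question_re = re.compile(r"^(?:[^\s+]+\s+){4}(?=.jpg|.gif|.jpeg)")

# Regex from the answer: date, time, method, then everything up to the
# first dot, followed by one of the extensions.
answer_re = re.compile(r"^\d+-\d+-\d+\s+\d+:\d+:\d+\s\w+\s[^\.]*\.(jpg|gif|jpeg)")

# Hypothetical variant: skip exactly three columns, then require the 4th
# column to end in one of the extensions (also copes with dots in the path).
column_re = re.compile(r"^(?:\S+\s+){3}\S*\.(?:jpg|gif|jpeg)(?=\s)")

for name, pattern in [("question", question_re),
                      ("answer", answer_re),
                      ("column", column_re)]:
    print(name,
          "image:", bool(pattern.search(image_line)),
          "page:", bool(pattern.search(page_line)))
```

The regex from the question never matches the image request because its lookahead starts after the 4th column, which is why nothing was being sent to the nullQueue. Note that the answer's `[^\.]*` stops at the first dot, so a path with an earlier dot (say a versioned folder name) would not match; the column-based variant is one way around that if it matters for your logs.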
Bye.
Giuseppe
Excellent - thank you!