Hi All,
I want a single regex that handles the different types of events generated in my access logs. I have written the following regex to extract fields from my access.log:
^(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^ ]+)\s+\[(?P[^\]]+)[^ \n]*\s+url="(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^"]+)"\s+\|status=(?P[^ ]+)\s+\|size=(?P[^ ]+)\s+\|resp_time=(?P[^ ]+)\|\sreferer="(?P[^"]+)"\suser_agent="(?P[^"]+)"
The problem is that most of the events match this regex, but there are 4 events that show up as "non-matches". Examples of both kinds of event:
Matching:
10.0.0.76 - - [20/Apr/2016:16:41:50 +0000] url="GET /dh/en-US/account/login HTTP/1.1" |status=302 |size=323 |resp_time=176| referer="-" user_agent="Python-httplib2/0.7.0 (gzip)"
Non-matching:
37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] url="myversion|3.6 Public" |status=400 |size=312 |resp_time=125| referer="-" user_agent="-"
106.184.4.52 - - [17/Apr/2016:18:19:27 +0000] url="SSH-2.0-LYGhost_1.2.7-20100630" |status=302 |size=299 |resp_time=129| referer="-" user_agent="-"
In the non-matching events above, the http_method and protocol fields are not present. Is there a way to write some conditions into the regex so that it works with both kinds of event? Please help.
Thanks
PG
You can have multiple field extractions for the same sourcetype, no problem. For each event, Splunk will attempt to apply all the regular expressions, and will use all of them that match.
Also, regular expressions in Splunk are unanchored.
Finally, if you really want to use such a complex regular expression, I suggest that you use a regular expression tool to test it thoroughly.
You might also want to read the manual entries for creating and maintaining search-time field extractions.
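To illustrate the "several extractions, use every one that matches" idea outside Splunk, here is a minimal Python sketch. It is not Splunk configuration; it only mimics the behaviour described above. The field names (`clientip`, `timestamp`, `url`, `uri` and so on) are assumptions, since the group names in the original regex were lost in the paste; only `http_method` and `protocol` come from the question.

```python
import re

# Pattern 1: fields present in every event. Field names are hypothetical.
COMMON = re.compile(
    r'(?P<clientip>\S+)\s+\S+\s+\S+\s+\[(?P<timestamp>[^\]]+)\]\s+'
    r'url="(?P<url>[^"]*)"\s+\|status=(?P<status>\d+)\s+'
    r'\|size=(?P<size>\d+)\s+\|resp_time=(?P<resp_time>\d+)\|'
)

# Pattern 2: only matches when url holds "METHOD path HTTP/x.y".
REQUEST = re.compile(
    r'url="(?P<http_method>[A-Z]+)\s+(?P<uri>\S+)\s+(?P<protocol>HTTP/[\d.]+)"'
)

def extract(event):
    """Try every pattern and merge the fields of all patterns that match."""
    fields = {}
    for pattern in (COMMON, REQUEST):
        match = pattern.search(event)  # search() is unanchored
        if match:
            fields.update(match.groupdict())
    return fields

# Sample events copied from the question.
MATCHING = ('10.0.0.76 - - [20/Apr/2016:16:41:50 +0000] '
            'url="GET /dh/en-US/account/login HTTP/1.1" |status=302 |size=323 '
            '|resp_time=176| referer="-" user_agent="Python-httplib2/0.7.0 (gzip)"')
NON_MATCHING = ('37.28.152.58 - - [19/Apr/2016:20:51:58 +0000] '
                'url="myversion|3.6 Public" |status=400 |size=312 '
                '|resp_time=125| referer="-" user_agent="-"')
```

With this split, `extract(MATCHING)` carries `http_method` and `protocol` as well as the common fields, while `extract(NON_MATCHING)` simply lacks those two keys. Each pattern stays short enough to test on its own.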
Why must this be a single regex?
This is hard to understand, and that makes it fragile and hard to maintain - not to mention hard to get right in the first place!
So how can this be achieved? Can you please guide me? Both events are from the same sourcetype, so how can we write different regexes for different events and apply them?
It will also make it hard to troubleshoot.
Somehow the regex I pasted above is not showing the capture group names. Pasting the regex again:
^(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^ ]+)\s+\[(?P[^\]]+)[^ \n]*\s+url="(?P[^ ]+)\s+(?P[^ ]+)\s+(?P[^"]+)"\s+\|status=(?P[^ ]+)\s+\|size=(?P[^ ]+)\s+\|resp_time=(?P[^ ]+)\|\sreferer="(?P[^"]+)"\suser_agent="(?P[^"]+)"\siPlanetDirectoryPro="(?P[^"]+)"
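If a single pattern is still preferred, one possible approach is an alternation inside the `url="..."` part: try the "method, path, protocol" shape first and fall back to capturing the whole quoted value. The sketch below shows only that fragment, in Python for easy testing. The names `uri` and `raw_request` are made up for illustration; `http_method` and `protocol` are the fields named in the question.

```python
import re

# Either three space-separated parts (method, path, protocol) or,
# failing that, the whole quoted value as one field.
URL_FIELD = re.compile(
    r'url="(?:'
    r'(?P<http_method>[A-Z]+)\s+(?P<uri>\S+)\s+(?P<protocol>HTTP/[\d.]+)'
    r'|'
    r'(?P<raw_request>[^"]*)'
    r')"'
)

def parse_url_field(event):
    """Return only the groups that took part in the match."""
    match = URL_FIELD.search(event)
    if match is None:
        return {}
    return {name: value for name, value in match.groupdict().items()
            if value is not None}

# url fragments copied from the sample events in the question.
NORMAL = 'url="GET /dh/en-US/account/login HTTP/1.1" |status=302'
ODD = 'url="SSH-2.0-LYGhost_1.2.7-20100630" |status=302'
```

For `NORMAL` this yields `http_method`, `uri` and `protocol`; for `ODD` it yields only `raw_request`. The rest of the original pattern (status, size, resp_time and so on) would stay unchanged around this fragment.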