How do I configure regex to get only the text after each line's
I have a log file containing events like this:
PID: 3047
CurrentTime: 2012/01/20 16:23:55
Username: username45
Floor: floor7
IPADDRESS: 10.1.1.4
Result: success
CurrentTime: 2012/01/20 16:23:54
Username: username51
Floor: floor3
IPADDRESS: 10.1.1.32
Result: fail
PID: 8020
CurrentTime: 2012/01/20 16:23:53
Username: username67
Floor: floor8
IPADDRESS: 10.1.1.24
Result: success
Additional: Some more information
and props.conf includes the following configuration:
[mytype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^PID:
EXTRACT-result = ^Result: (?P<result>.+)$
EXTRACT-ipaddress = ^IPADDRESS: (?P<ipaddress>.+)$
EXTRACT-floor = ^Floor: (?P<floor>.+)$
I want each field to contain only the value after the colon, such as floor8 for floor and 10.1.1.24 for ipaddress.
I ran a search for each field, but the results still include unwanted text from the following lines.
sourcetype="mytype" | table result
result
--- ----------------------------------------
1 success
2 fail PID: 9360
3 fail PID: 6634
4 fail PID: 3908
5 fail PID: 1183
6 success PID: 8456
7 success PID: 5730
8 fail PID: 3004
9 fail PID: 278
10 fail PID: 7551
sourcetype="mytype" | table ipaddress
ipaddress
--- ----------------------------------------
1 10.1.1.21 Result: success
2 10.1.1.34 Result: fail PID: 9360
3 10.1.1.9 Result: fail PID: 6634
4 10.1.1.21 Result: fail PID: 3908
5 10.1.1.33 Result: fail PID: 1183
6 10.1.1.8 Result: success PID: 8456
7 10.1.1.20 Result: success PID: 5730
8 10.1.1.32 Result: fail PID: 3004
9 10.1.1.8 Result: fail PID: 278
10 10.1.1.20 Result: fail PID: 7551
sourcetype="mytype" | table floor
floor
--- ----------------------------------------
1 floor7 IPADDRESS: 10.1.1.21 Result: success
2 floor1 IPADDRESS: 10.1.1.34 Result: fail PID: 9360
3 floor5 IPADDRESS: 10.1.1.9 Result: fail PID: 6634
4 floor9 IPADDRESS: 10.1.1.21 Result: fail PID: 3908
5 floor3 IPADDRESS: 10.1.1.33 Result: fail PID: 1183
6 floor8 IPADDRESS: 10.1.1.8 Result: success PID: 8456
7 floor2 IPADDRESS: 10.1.1.20 Result: success PID: 5730
8 floor6 IPADDRESS: 10.1.1.32 Result: fail PID: 3004
9 floor0 IPADDRESS: 10.1.1.8 Result: fail PID: 278
10 floor4 IPADDRESS: 10.1.1.20 Result: fail PID: 7551
How do I configure regex in props.conf to get only the text after each line's
Thank you in advance
You need to activate multi-line mode matching for the regex by specifying (?m) at the start. Like this for instance:
EXTRACT-ipaddress = (?m)^IPADDRESS: (?P<ipaddress>.+)$
More information on multi-line mode matching in regular expressions:
http://www.regular-expressions.info/modifiers.html
http://www.regular-expressions.info/anchors.html
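To illustrate what (?m) changes, here is a minimal sketch using Python's re module. This is an assumption for illustration only: Splunk uses PCRE, and its defaults may differ from what Python does here. The event text is copied from the sample log in the question; the variable names are made up.

```python
import re

# One merged event, modelled on the sample log in the question.
event = (
    "PID: 8020\n"
    "CurrentTime: 2012/01/20 16:23:53\n"
    "Username: username67\n"
    "Floor: floor8\n"
    "IPADDRESS: 10.1.1.24\n"
    "Result: success\n"
    "Additional: Some more information"
)

pattern = r"^IPADDRESS: (?P<ipaddress>.+)$"

# Without multi-line mode, ^ only matches at the very start of the text,
# so a field in the middle of the event is not found.
no_multiline = re.search(pattern, event)

# With (?m), ^ and $ match at the start and end of every line.
multiline = re.search("(?m)" + pattern, event)

print(no_multiline)                  # None
print(multiline.group("ipaddress"))  # 10.1.1.24
```

In Python the dot does not match newlines by default, which is why the greedy .+ stops at the end of the line here.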
We have Splunk Enterprise 7.0.0.
I am trying to configure a sourcetype for a multiline event. My regex tests successfully on regex101.com, but I do not get the same results in Splunk when setting up the sourcetype.
This example log has 400+ lines. I know the word that starts the event and the word that ends it. I just need to match from the line starting with PRPM down to the line with the word END. I should also note that I had to add MAX_EVENTS due to the length of the event data.
Example:
PRPM*28 blah blah blah blah blah
blah blah blah
blah ........blah
blah blah
....
..blah blah
END
This regex works on regex101.com but not in Splunk: (?s)^PRPM(.*?END)
I also tried it with (?m). Suggestions?
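One thing worth checking is where the PRPM line sits in the text being matched. The sketch below uses Python's re module (an assumption for illustration; Splunk uses PCRE and its event breaking is configured separately, so this does not reproduce Splunk's behaviour). The surrounding "earlier" and "later" lines are invented. It only shows that (?s) alone is enough when PRPM is at the very start of the text, but that (?m) is also needed when other lines come first.

```python
import re

# Hypothetical text: the PRPM block is preceded and followed by other lines.
log = (
    "some earlier line\n"
    "PRPM*28 blah blah blah blah blah\n"
    "blah blah blah\n"
    "blah ........blah\n"
    "END\n"
    "some later line\n"
)

# (?s) lets '.' cross newlines, but ^ still only matches at the very start
# of the text, and the text does not start with PRPM.
only_s = re.search(r"(?s)^PRPM(.*?END)", log)

# Adding (?m) lets ^ match at the start of any line.
both = re.search(r"(?ms)^PRPM(.*?END)", log)

print(only_s)         # None
print(both.group(0))  # the four lines from PRPM*28 through END
```

Note that .*?END stops at the first occurrence of the letters END anywhere, including inside a longer word, so a stricter end marker may be needed for real data.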
Yes, Ayn is correct. The non-greedy match fixes it, although you should not need it. This config works for me:
[mytype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^CurrentTime:
EXTRACT-result = ^(?m)Result: (?P<result>.+?)$
EXTRACT-ipaddress = ^(?m)IPADDRESS: (?P<ipaddress>.+?)$
EXTRACT-floor = ^(?m)Floor: (?P<floor>.+?)$
BUT, so does this:
[mytype]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE = ^CurrentTime:
EXTRACT-result = ^(?m-s)Result: (?P<result>.+)$
EXTRACT-ipaddress = ^(?m-s)IPADDRESS: (?P<ipaddress>.+)$
EXTRACT-floor = ^(?m-s)Floor: (?P<floor>.+)$
The problem appears to be that the 's' modifier is 'on' by default! It should not be if we're using PCRE.
The 's' modifier says that the '.' character will also match newline characters (i.e. \r or \n).
The first config above works because we are saying do a non-greedy match.
The second config above works because we are saying do not allow the '.' to match a newline char.
For some reason Splunk is behaving as though we had said (?sm). Anyway we have a fix!
Note I changed the BREAK_ONLY_BEFORE because the PID does not appear in every record.
Ayn solved your problem; I'm just clarifying why it didn't work.
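The difference between the two configs can be sketched in Python's re module. This is an illustration under assumptions, not Splunk itself: Python has no dot-all mode unless you ask for it, so re.S is passed explicitly to imitate the behaviour described above. The helper name extract is made up.

```python
import re

# One event broken before CurrentTime, modelled on the sample log.
event = (
    "CurrentTime: 2012/01/20 16:23:53\n"
    "Username: username67\n"
    "Floor: floor8\n"
    "IPADDRESS: 10.1.1.24\n"
    "Result: success\n"
    "Additional: Some more information"
)

def extract(pattern, flags):
    """Return the captured floor value, or None if nothing matched."""
    match = re.search(pattern, event, flags)
    return match.group("floor") if match else None

# Multi-line plus dot-all, greedy: '.' crosses newlines, so the capture
# runs to the end of the event (the symptom from the question).
greedy_dotall = extract(r"^Floor: (?P<floor>.+)$", re.M | re.S)

# Same flags, non-greedy: the capture stops at the first line end.
lazy_dotall = extract(r"^Floor: (?P<floor>.+?)$", re.M | re.S)

# Multi-line only (the effect of -s): '.' cannot cross the newline at all.
greedy_no_dotall = extract(r"^Floor: (?P<floor>.+)$", re.M)

print(repr(greedy_dotall))
print(lazy_dotall)       # floor8
print(greedy_no_dotall)  # floor8
```

Either fix gives the same value here. Turning dot-all off is the more direct of the two, because it removes the cause rather than limiting the match.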
Just another huge +1 for the -s. Very helpful.
Huge +1 for the "s" modifier. It had me stuck 🙂
Thanks for helpful comment!
Thank you for your help!
Now I can get what I wanted. I added "?" as you pointed out.
I'll admit I don't know why (?m) doesn't seem to work in your case - it should! Your second example could possibly work anyway if you changed the regex a bit - right now you're performing a greedy match so the regex will match as much as it possibly can. You need to change it to a non-greedy version by adding a ? at the end. Like this:
EXTRACT-ipaddress = (?m)^IPADDRESS: (?P<ipaddress>.+?)[\r\n]
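A small caveat with ending the pattern on [\r\n]: it requires a line-break character after the value, so it would miss a field on the last line of an event if there is no trailing newline. A minimal sketch in Python's re module (an assumption for illustration; the two sample strings are made up from the log in the question):

```python
import re

pattern = re.compile(r"(?m)^IPADDRESS: (?P<ipaddress>.+?)[\r\n]")

# The field is followed by another line, so a newline is available.
with_newline = "Floor: floor8\nIPADDRESS: 10.1.1.24\nResult: success"

# The field is the last line of the text and has no trailing newline.
last_line = "Floor: floor8\nIPADDRESS: 10.1.1.24"

m1 = pattern.search(with_newline)
m2 = pattern.search(last_line)

print(m1.group("ipaddress"))  # 10.1.1.24
print(m2)                     # None
```

In the sample log IPADDRESS is always followed by a Result line, so this does not bite there, but it is worth keeping in mind for fields that can come last.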
Thanks, I tried to change to multi-line mode, but still no luck.
With EXTRACT-ipaddress = (?m)^IPADDRESS: (?P
"sourcetype="mytype" | head 1 | table ipaddress" still returns:
1 10.1.1.21 Result: success
2 10.1.1.34 Result: fail PID: 9360
3 10.1.1.9 Result: fail PID: 6634
With EXTRACT-ipaddress = (?m)^IPADDRESS: (?P
"sourcetype="mytype" | head 1 | table ipaddress" still returns:
1 10.1.1.21
2 10.1.1.34 Result: fail
3 10.1.1.9 Result: fail
While I am reading the regex website, I would like to know how to get this right.