While testing Splunk, I wanted to see whether I could easily create a custom input using ncat and Splunk's UDP input.
The input works; now I have to tell Splunk how to split the input stream.
The input is a multiline string which contains either XML or pipe (|) delimited data but is always terminated by ~\
So I created a new props.conf in %SPLUNK_HOME%/etc/system/local/ and added the following:
[source::c:\\splunkinput\\my.log]
LINE_BREAKER = ^~\$
Unfortunately nothing happens, and I have not yet figured out how to check what exactly is going on when a new file is imported into Splunk.
The end result should be that every sequence (including carriage returns etc.) between ~\ terminators is treated as a new event.
Any tips?
P.S. Is there a way to activate the props.conf changes without restarting splunkd?
I think you simply want
[mysourcetype]
LINE_BREAKER = (~\\)
# You may need to increase this (default 100)
LINE_BREAKER_LOOKBEHIND = 1000
SHOULD_LINEMERGE = false
There are two things to consider here: 1.) Splunk wants a matching group in the LINE_BREAKER, and 2.) I'm not sure it's valid to end a regex with the backslash (\) character. But I could be wrong.
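Both points can be checked outside Splunk. The sketch below uses Python's re module as a stand-in for Splunk's regex engine, which is an assumption: the two engines differ in places, though not for patterns this simple. The helper name compiles is made up for the illustration.

```python
import re

def compiles(pattern):
    """Return True if the regex text compiles, False if it is rejected."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# The two characters ~\ : the regex ends in a lone backslash.
print(compiles("~\\"))                 # False: nothing follows the escape

# The five characters (~\\) : escaped backslash inside a capturing group.
print(compiles(r"(~\\)"))              # True
print(re.compile(r"(~\\)").groups)     # 1 capturing group
```

So a pattern ending in a single backslash is rejected, while the escaped form inside parentheses compiles and has the one group Splunk is said to want.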
I just re-read the question, and it sounds like you also want newlines to split events. If that's correct, then try the following:
LINE_BREAKER = (~\\|[\r\n]+)
I have tried the following settings without success:
LINE_BREAKER = ~\\
LINE_BREAKER = ~\\^
LINE_BREAKER = ([~\\]+)
LINE_BREAKER = (.*)[~\\](.*)
LINE_BREAKER = .*~\\.*
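Since the answer above says Splunk wants a matching group in LINE_BREAKER, a quick way to compare the five attempts is to count their capturing groups. This is only a sketch using Python's re module rather than Splunk itself, and it says nothing about other reasons a setting might not take effect.

```python
import re

# The five LINE_BREAKER values tried above, as regex text.
attempts = [
    r"~\\",
    r"~\\^",
    r"([~\\]+)",
    r"(.*)[~\\](.*)",
    r".*~\\.*",
]

# Map each pattern to the number of capturing groups it contains.
group_counts = {pattern: re.compile(pattern).groups for pattern in attempts}

for pattern, count in group_counts.items():
    print(f"{pattern:<16} capturing groups: {count}")
```

Three of the five have no capturing group at all. Note also that [~\\] is a character class, so it matches a lone ~ or a lone \ rather than the two-character sequence ~\.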
An example string would be:
SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5
1180178|7112|7110|192.168.0.5
14156304|7111|7110|192.168.0.5
1180174|7117|7110|192.168.0.5
1180170|7119|7110|192.168.0.5
5767676|7113|7110|192.168.0.5
5308816|7114|7110|192.168.0.5
1573452|7115|7110|192.168.0.5
2426006|7118|7110|192.168.0.5
11141326|7110|7110|192.168.0.5~\SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?~\SMSEMO_0000:P:Posting : http://someurlwithparameters~\
The end result should be multiline events split by ~\ like so:
Event 1:
SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5
1180178|7112|7110|192.168.0.5
14156304|7111|7110|192.168.0.5
1180174|7117|7110|192.168.0.5
1180170|7119|7110|192.168.0.5
5767676|7113|7110|192.168.0.5
5308816|7114|7110|192.168.0.5
1573452|7115|7110|192.168.0.5
2426006|7118|7110|192.168.0.5
11141326|7110|7110|192.168.0.5
Event 2:
SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?
Event 3:
SMSEMO_0000:P:Posting : http://someurlwithparameters
I'm no regexp guru, but I thought this would be easier 😉
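To see what the two suggested patterns would do with this sample, here is a rough model in Python. It assumes LINE_BREAKER behaves as the props.conf documentation describes (an event ends where the first capturing group starts, the next begins where it ends, and the group's text is discarded). The function break_events is hypothetical and the sample is shortened.

```python
import re

def break_events(raw, line_breaker):
    """Rough model of LINE_BREAKER: end an event where capture group 1
    starts, begin the next where it ends, and discard the group's text."""
    events, start = [], 0
    for match in re.finditer(line_breaker, raw):
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    events.append(raw[start:])
    return [event for event in events if event]  # drop empty leftovers

# Shortened version of the sample from the question.
raw = (
    "SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5\n"
    "1180178|7112|7110|192.168.0.5\n"
    "11141326|7110|7110|192.168.0.5~\\"
    "SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?~\\"
    "SMSEMO_0000:P:Posting : http://someurlwithparameters~\\"
)

only_terminator = break_events(raw, r"(~\\)")
with_newlines = break_events(raw, r"(~\\|[\r\n]+)")

print(len(only_terminator))  # 3 events, the first one multiline
print(len(with_newlines))    # 5 events, every line on its own
```

In this model (~\\) gives the three events wanted, while adding [\r\n]+ to the group also splits the multiline status block into separate events.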
I've updated my answer based on the sample data. If that doesn't work, try playing around with some other line breaking settings in props.conf: http://www.splunk.com/base/Documentation/latest/Admin/Propsconf
In your regex you need to escape the backslash as such:
LINE_BREAKER = ^~\\$
If ~\ is not on a line by itself, drop the leading caret from your LINE_BREAKER definition:
LINE_BREAKER = ~\\$
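A small Python check of how these anchored patterns behave on data shaped like the sample, where ~\ sits in the middle of a line. The sample string is invented, and re.MULTILINE is used to give ^ and $ their most generous reading; how Splunk itself compiles the pattern is an assumption here.

```python
import re

# Invented data: terminators appear mid-line and at the very end.
sample = "line one\nlast line~\\SECOND EVENT~\\THIRD EVENT~\\"

# Pattern from the question: \$ is an escaped dollar sign, so this
# looks for the literal text "~$" at the start of a line.
print(re.search(r"^~\$", sample, re.MULTILINE))    # None

# Escaped backslash with both anchors: needs ~\ alone on a line.
print(re.search(r"^~\\$", sample, re.MULTILINE))   # None

# Without the caret: matches only where ~\ ends a line.
end_matches = [m.start() for m in re.finditer(r"~\\$", sample, re.MULTILINE)]
print(len(end_matches))                             # 1, the final terminator
```

Only the last ~\ is found, so the two terminators in the middle of the line would not split anything.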
I believe that for event parsing configurations (such as LINE_BREAKER) you need to restart splunkd; however, search-time configurations in props.conf (field extractions, for example) are applied automatically without a restart.
[EDIT Based on more info provided]
Based on the sample data, give the following a try in your props.conf:
[source::c:\\splunkinput\\my.log]
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE_DATE = false
MUST_BREAK_AFTER = ~\\
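One caveat, sketched in Python under a simplifying assumption: with SHOULD_LINEMERGE = true, the stream is first cut into lines (the default LINE_BREAKER is ([\r\n]+)) and MUST_BREAK_AFTER is then tested per line, starting a new event at the next line. The function merge_lines is a hypothetical model of that behaviour and not Splunk code.

```python
import re

def merge_lines(raw, must_break_after):
    """Simplified line-merge model: split on newlines, then keep adding
    lines to the current event until one matches must_break_after."""
    regex = re.compile(must_break_after)
    events, current = [], []
    for line in re.split(r"[\r\n]+", raw):
        if not line:
            continue
        current.append(line)
        if regex.search(line):
            events.append("\n".join(current))
            current = []
    if current:
        events.append("\n".join(current))
    return events

# Shortened sample: all three terminators sit on the last physical line.
raw = (
    "SMSEUCP_7110:STATUS:1049110|7116|7110|192.168.0.5\n"
    "1180178|7112|7110|192.168.0.5\n"
    "11141326|7110|7110|192.168.0.5~\\"
    "SMSEMO_0000:S:(0000) Incoming : 3161234567 oh really? let do that then, ok?~\\"
    "SMSEMO_0000:P:Posting : http://someurlwithparameters~\\"
)

merged = merge_lines(raw, r"~\\")
print(len(merged))  # 1: the whole sample ends up in a single event
```

If that model is right, a break rule that works per line cannot separate events that share a physical line, which is the case for events 2 and 3 in the sample.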
Hmm, can you use ^ in LINE_BREAKER? I would think that you'd always need to use something like [\r\n]+ instead of ^ or $... Just my 2 cents. And after re-reading all this info, I don't think you want to use end-of-string ($), start-of-string (^), or traditional-end-of-line ([\r\n]) stuff at all...