When I'm sending in data over TCP, once in a blue moon Splunk will split one of the events into two parts, so I get the first portion of the text in one event, and the second in another.
Obviously this causes a lot of problems. As a guess, I thought maybe Splunk might be able to tell the difference between EOF and mere linebreaks or something, so I tried setting various explicit LINE_BREAKER keys in the props stanza.
I tried the following three values (separately, of course), but none worked:
LINE_BREAKER=[\n]+
LINE_BREAKER=[\r\n]+
LINE_BREAKER=(\x00)<\d+>
In fact they all make matters worse in that they cause my events to get indexed multiline, with 57 lines per event. And that's even though I have SHOULD_LINEMERGE=False in the stanza.
The third LINE_BREAKER value, by the way, I got from http://answers.splunk.com/questions/603/juniper-netscreen-tcp-syslog-messages-not-breaking-properly, where it seemed to have solved the problem.
So I'm definitely doing at least one thing wrong. 😃
Is it just that the sending process is responsible for aligning its TCP packets with linebreaks? Is my network just way flakier than a normal network should be?
Is it possible to make this problem go away with some key that tells the tcp input to be a little patient and wait a few seconds somehow?
(By the way, this is 64-bit Splunk running on Windows 7.)
What is making the TCP connection? Splunk will break at the close of a TCP stream. Individual packets within a connection/stream are not broken, but it seems likely to me that your TCP connection may not be persistent and may in fact be closing and reopening a new one.
Network flakiness is not likely to be the problem (unless, say, it's forcibly terminating connections), as TCP should be able to handle even pretty severe packet loss at the IP level. What is more likely is an odd way that the client is creating or managing the TCP socket/stream/connection.
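To illustrate the point about streams versus packets, here is a minimal Python sketch of how a line-oriented TCP receiver might assemble events. It is not Splunk's implementation; the class name LineAssembler and its methods are made up for this example. It only shows why a packet boundary in the middle of an event is harmless, while a connection that closes in the middle of an event yields a truncated one.

```python
class LineAssembler:
    """Hypothetical receiver-side buffer: events end at newlines, not at packet edges."""

    def __init__(self):
        self._buffer = b""

    def feed(self, chunk):
        """Add one received chunk and return any events it completed."""
        self._buffer += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *complete, self._buffer = self._buffer.split(b"\n")
        return [line.rstrip(b"\r") for line in complete if line.strip()]

    def close(self):
        """Connection closed: whatever is still buffered becomes its own event."""
        leftover, self._buffer = self._buffer, b""
        return [leftover] if leftover else []


# One connection, event split across two chunks: still a single event.
conn = LineAssembler()
print(conn.feed(b"event one\nevent "))  # [b'event one']
print(conn.feed(b"two\n"))              # [b'event two']
print(conn.close())                     # []

# Connection closed mid-event: the fragment is emitted on its own.
broken = LineAssembler()
print(broken.feed(b"event thr"))        # []
print(broken.close())                   # [b'event thr']
```

If the sender then reconnects and sends the rest of the text, the second half arrives as a separate event, which would match the symptom described in the question.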
I faced a similar problem over TCP, while the same data was indexed fine when sent as a file.
The solution I found was to set the following in props.conf:
SHOULD_LINEMERGE=false
Now all the events are broken properly.
Right now I'm mocking something up with a script that pipes some data to netcat over my home network every few minutes. It'll be fine for a while, for many cycles, and then terrible for a while, with a quarter of the events getting broken. Thanks, Gerald.
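One way to rule out connection churn in a mock-up like this is to send from a single long-lived socket instead of a pipe into netcat. The following is only a sketch under that assumption: the function name send_events is made up here, and the host and port in the usage note are placeholders for whatever the Splunk TCP input listens on.

```python
import socket


def send_events(host, port, events, encoding="utf-8"):
    """Send all events over ONE TCP connection, each terminated by a single newline."""
    with socket.create_connection((host, port)) as conn:
        for event in events:
            # Strip any trailing line ending so every event ends in exactly one "\n".
            payload = event.rstrip("\r\n").encode(encoding) + b"\n"
            # sendall() keeps writing until the whole payload has been handed to the OS.
            conn.sendall(payload)
```

A call would look like send_events("splunk-host", 9999, ["first event", "second event"]), where both the host name and the port are hypothetical. If events still break when sent this way, the sender's connection handling is probably not the cause.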
Hey Nick,
"Is it possible to make this problem go away with some key that tells the tcp input to be a little patient and wait a few seconds somehow?"
See if this helps:
http://www.splunk.com/base/Documentation/4.1.5/Admin/Inputsconf
Specifically:
time_before_close = <integer>
* Modtime delta required before Splunk can close a file on EOF.
* Tells the system not to close files that have been updated in past <integer> seconds.
* Defaults to 3.
Cheers!
Yep, that is most probably the case...
It doesn't seem to have any effect. It seems likely that that key is only valid within monitor:// inputs.
The LINE_BREAKER should have a regex capturing group. You could try one of these variants:
LINE_BREAKER=([\v\x00]+)
LINE_BREAKER=(\v+)
LINE_BREAKER=(\x00+)
(\v matches vertical whitespace.)
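The role of the capturing group can be sketched in Python. This is a rough simulation, not Splunk code: the function split_events is made up for illustration, and it assumes the documented LINE_BREAKER behaviour where the text matched by the first capturing group is discarded, the text before it ends one event, and the text after it starts the next. Python's re module also treats \v as a plain vertical tab rather than any vertical whitespace, so the example uses [\r\n] instead.

```python
import re


def split_events(raw, line_breaker):
    """Split raw text the way a LINE_BREAKER-style regex with one capture group would."""
    events = []
    start = 0
    for match in re.finditer(line_breaker, raw):
        # The event ends where group 1 begins; group 1 itself is thrown away.
        events.append(raw[start:match.start(1)])
        start = match.end(1)
    if raw[start:]:
        events.append(raw[start:])
    return events


# Break on runs of CR/LF characters.
print(split_events("a\nb\r\n\r\nc", r"([\r\n]+)"))
# ['a', 'b', 'c']

# Break on a NUL byte only when a syslog priority such as <14> follows it.
print(split_events("<13>one\x00<14>two", r"(\x00)<\d+>"))
# ['<13>one', '<14>two']
```

In the second call the <14> stays attached to the next event because it sits outside the capturing group. A pattern with no group at all, such as [\n]+, gives this scheme no boundary to work with.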
The timestamp doesn't matter. LINE_BREAKER processing occurs first, then timestamp extraction (then line merging, but you don't have SHOULD_LINEMERGE enabled, so that step is skipped).
Not long at all. The only unusual thing is that the timestamp comes at the end, but since the extra breaks can come anywhere in the event text, I don't think it's related.
Too bad. How long are those events? Can you post examples?
I'm afraid they don't work. I tried each one, with a clean and restart in between. The first two still show the sporadic breaking behavior, and the last one causes the 57-line multiline aggregation.