Getting Data In

Sending from syslog-ng to Splunk via TCP and having event merging issues

mspiegle
New Member

My clients are sending logs to splunk via syslog-ng using the following destination:

destination d_splunk-test-tcp {
  tcp('splunk-host' port(1234) template('$ISODATE $HOST $PROGRAM[$PID]: $MSG'));
};

My logs look something like this, but larger at around 1000-1500 characters:

2013-07-26T13:57:03+00:00 client-host some_process[12345]: [INFO] This is a message

So far, the only way I've been able to get it to work perfectly is to add some unique characters at the end of the message, then use splunk's LINE_BREAKER to separate based on that unique text. I suppose this could work, but it seems like a poor solution.

My research tells me that date/time recognition must not be working, but I can't find any combination of TIME_FORMAT directives that seem to work. The closest I've been able to get is that all of my events will start with the timestamp, but some of the log messages get merged together in one event. If I have really short messages, they usually end up in an event by themselves. I thought this was an issue with MAX_TIMESTAMP_LOOKAHEAD, but that didn't solve the problem either.

Here's what my props.conf looks like:

[testing]
TIME_FORMAT = %Y-%m-%dT%H:%M:%S%:z
MAX_TIMESTAMP_LOOKAHEAD = -1

Performance isn't a big deal at the moment. I've got a bunch of servers I can throw at this if needed. What else can I try to keep my messages from getting merged together?

EDIT, Here's some more info:

I wrote this simple script to send 10 messages rather quickly to syslog-ng, which then forwarded the messages to splunk:

# for NUM in {1..10}; do echo "message ${NUM}" | logger -i -t testing; done

I confirmed with tcpdump that each message was sent in a separate packet. Here's a snippet of a few of the packets (tcpdump -n -nn - ieth0 -s0 -A port 9876):

10:11:13.487966 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 63:126(63) ack 1 win 14600
E..g..@.>..3
...
.       IH.&.....CLc.P.9.....2013-07-26T17:11:13+00:00 las1-app077 testing[17191]: message 1
10:11:13.487977 IP 10.144.9.73.9876 > 10.144.128.24.18446: . ack 126 win 5840
E..(..@.@..|
.       I
...&.H.CLc.....P.......
10:11:13.489953 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 126:189(63) ack 1 win 14600
E..g..@.>..2
...
.       IH.&.....CLc.P.9..y..2013-07-26T17:11:13+00:00 las1-app077 testing[17193]: message 2
10:11:13.489960 IP 10.144.9.73.9876 > 10.144.128.24.18446: . ack 189 win 5840
E..(..@.@..{
.       I
...&.H.CLc....DP.......
10:11:13.491520 IP 10.144.128.24.18446 > 10.144.9.73.9876: P 189:252(63) ack 1 win 14600
E..g..@.>..1
...
.       IH.&....DCLc.P.9..:..2013-07-26T17:11:13+00:00 las1-app077 testing[17195]: message 3

And here's what it looks like in splunk. Please note that this is all a single event in the splunk UI:

2013-07-26T17:11:13+00:00 las1-app077 testing[17191]: message 12013-07-26T17:11:13+00:00 las1-app077 testing[17193]: message 22013-07-26T17:11:13+00:00 las1-app077 testing[17195]: message 32013-07-26T17:11:13+00:00 las1-app077 testing[17197]: message 42013-07-26T17:11:13+00:00 las1-app077 testing[17199]: message 52013-07-26T17:11:13+00:00 las1-app077 testing[17201]: message 62013-07-26T17:11:13+00:00 las1-app077 testing[17203]: message 72013-07-26T17:11:13+00:00 las1-app077 testing[17205]: message 82013-07-26T17:11:13+00:00 las1-app077 testing[17207]: message 92013-07-26T17:11:13+00:00 las1-app077 testing[17209]: message 10
Tags (3)
0 Karma

mloven_splunk
Splunk Employee
Splunk Employee

mspiegle,

I'm not sure how to match the timezone offset with the colon in it. You've used %:z, though the link that you posted mentions only %z, %::z and %:::z. I've used %z (which matches offsets like +0400), but not the others. It may work as you've got it if the events were linebreaking properly.

You might want to try this in your props.conf stanza:

LINE_BREAKER = ([\r\n]+)(?\d{4}-\d{2}-\d{2}T)

Assuming you keep the same date format.

And I'd throw that MAX_TIMESTAMP_LOOKAHEAD back to 25, but that's just the OCD personality in me speaking... 🙂

0 Karma

Ayn
Legend

I'll agree that timestamp recognition is the most likely culprit. Your TIME_FORMAT string looks fairly OK up until the last part. %:z - a typo or on purpose? If on purpose, I'm guessing it could be the time offset with a colon in some strftime "dialect"? I'm pretty sure Splunk doesn't support this - Python doesn't and Ruby doesn't as far as I know at least.

For troubleshooting purposes I'd skip the last part of the time format string and just go with %Y-%m-%dT%H:%M:%S to make sure that works before starting to solve how to deal with the offset.

Oh and if things are still not working, some sample events would help so we can try and figure out why your events are getting merged - I really think it's due to the failing timestamp recognition though.

0 Karma

mspiegle
New Member

I updated the original question with more data

0 Karma

mspiegle
New Member

It's supposed to be the syntax for trailing timezones like +00:00 (which you can see in my log example). I found the syntax for it here:

http://docs.splunk.com/Documentation/Splunk/4.2.3/Data/Configuretimestamprecognition#Enhanced_strpti...

http://splunk-base.splunk.com/answers/12147/time-zone-in-time_format

I tried using %Y-%m-%dT%H:%M:%S as you suggested and even limited my LOOKAHEAD to cut out the timezone identifier, but that didn't seem to work.

I'm working on creating a separate source that I can log to specifically for troubleshooting. I'll report back more soon.

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...