syslog with linebreak

EricPartington
Communicator

I am having difficulty getting line breaking to work for a particular type of syslog message. I have looked at http://www.splunk.com/base/Documentation/latest/Admin/Indexmulti-lineevents to try to figure out the custom line-break settings, but so far, whenever Splunk encounters a 0x0a (\n) in a message, it breaks the event into multiple smaller ones. I would like to keep each message as a single event.

Maybe my props.conf settings are not in the proper order, or I am misunderstanding the documentation above. Please point out the flaw in my thinking if you see it.

props.conf
[source::udp:514]
#set the sourcetype for SOURCEA
TRANSFORMS-SOURCEA = SOURCEA_sourcetype

[SOURCEA]
#to deal with the line breaks in the messages
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE_DATE = True
#break on events like Jun 07 21:22:34
TIME_FORMAT = %b %d %T

The messages come from the device in that format; however, Splunk shows this in the event listing:

Jun 8 16:35:56 x.x.x.x Jun 08 16:34:06 DEVICENAME 

So what should the date parsing be?

#if that doesn't work, try the other date format: Jun 7 21:24:18
#TIME_FORMAT = %b %e %T

Does the BREAK_ONLY_BEFORE pattern have to occur at the beginning of the message, or just somewhere in the first part of the message that is not broken up? Does there have to be a match () in the statement, or just text that matches the syslog message?

Any hints or suggestions are welcome.


gkanapathy
Splunk Employee

You need to set LINE_BREAKER, which by default is set to ([\r\n]+). I am not certain that you should be trying to keep lines with newlines together, though. How exactly are you getting this syslog data to Splunk, and how exactly are you planning to break events if not on a newline character? Do you actually have lines in syslog that contain dates without newlines before them, or newlines without dates after them?

Furthermore, all the line-break/line-merge settings should be in props.conf, not transforms.conf.
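
To illustrate the difference a LINE_BREAKER makes, here is a rough Python approximation (not Splunk's actual implementation). It assumes, as the Splunk documentation describes, that the event boundary falls where the first capture group matches and that the captured text is discarded. The sample payload, the stricter DATE_BREAKER pattern, and the split_events helper are all made up for illustration.

```python
import re

# Made-up sample payload: three syslog messages, the second one
# carrying two continuation lines of debug output.
RAW = (
    "Jun 08 16:34:06 DEVICENAME service started\n"
    "Jun 08 16:34:07 DEVICENAME debug dump follows\n"
    "  key1=value1\n"
    "  key2=value2\n"
    "Jun 08 16:34:09 DEVICENAME service stopped\n"
)

# The default mentioned above: break on every run of CR/LF characters.
DEFAULT_BREAKER = re.compile(r"([\r\n]+)")

# Hypothetical stricter breaker: only break on newlines that are
# immediately followed by a "Mon DD HH:MM:SS " timestamp.
DATE_BREAKER = re.compile(r"([\r\n]+)[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def split_events(raw, breaker):
    """Split raw text at each breaker match, dropping capture group 1."""
    events = []
    start = 0
    for m in breaker.finditer(raw):
        events.append(raw[start:m.start(1)])  # text before the break
        start = m.end(1)                      # next event starts after group 1
    tail = raw[start:]
    if tail:
        events.append(tail)
    return events


print(len(split_events(RAW, DEFAULT_BREAKER)))  # 5 events, debug lines split off
print(len(split_events(RAW, DATE_BREAKER)))     # 3 events, debug lines kept together
```

With the default pattern the two debug lines become their own events; with the date-aware pattern they stay attached to the message that produced them.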


sjloh17
Explorer

Hi Everyone,

I'm also trying to do the same thing here!

I need to be able to filter out specific syslog messages (in this case, by assigning a new sourcetype). But I understand that setting the sourcetype at index time via TRANSFORMS does not select rules in props.conf based on the new sourcetype at index time; it only affects search time.

We are unable to use a different input port as everything syslog related has to go through UDP:514. Any other suggestions that might support this?

MasterSplunker, did you manage to get your config working?


EricPartington
Communicator

Unfortunately, I haven't been able to put enough energy into this problem to solve it. I will have to tackle it early in the new year; when I solve it, I will post back.


Lowell
Super Champion

By match() I assume you're asking about regular expression "search" vs. "match" behavior. In almost all cases where Splunk uses regexes, it uses the "search" function and not the "match" function. This allows for full flexibility, since any "search" regex can be turned into a matching one by simply using ^regex$.
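
For anyone unfamiliar with the distinction, Python's re module makes a handy analogy. This only illustrates the regex semantics; it says nothing about how Splunk is implemented, and the sample line is invented.

```python
import re

# Invented sample event text.
line = "Jun 08 16:34:06 DEVICENAME debug dump follows"
pattern = r"DEVICENAME"

# "search" semantics: the pattern may occur anywhere in the text.
found_by_search = re.search(pattern, line) is not None        # True

# "match" semantics: the pattern must start at the beginning.
found_by_match = re.match(pattern, line) is not None          # False

# Anchoring a "search" pattern with ^...$ forces it to cover the whole text.
anchored = re.search(r"^" + pattern + r"$", line) is not None  # False
whole_line = re.search(r"^Jun .* follows$", line) is not None  # True

print(found_by_search, found_by_match, anchored, whole_line)
```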


Lowell
Super Champion

By default, Splunk will search for the timestamp format specified in TIME_FORMAT anywhere at the start of your event, so it could match either date, as you suspect. (Technically, Splunk looks for the timestamp within the first MAX_TIMESTAMP_LOOKAHEAD characters of each event, which by default is 150 characters.) If you want Splunk to use only the first occurring timestamp, you can do that simply with a TIME_PREFIX entry; see the example below.

Help me out with one thing. What's your TRANSFORMS-SOURCEA entry about? Why not simply set your sourcetype directly? The way you have this set up, you are relying on some (unshown) SOURCEA_sourcetype stanza in transforms.conf, which I assume you are using to set MetaData:Sourcetype to sourcetype::SOURCEA. That is fine, except that doing so will prevent any props settings in the [SOURCEA] stanza from taking effect at index time. (We are only looking at index-time settings in this discussion. Specifically, we are talking about line breaking, line merging, and timestamp extraction, and all of these are done before your sourcetype is reassigned by your transform.)

However, if you simply set your sourcetype directly (sourcetype=SOURCEA), then the props settings in both your [source::udp:514] and your [SOURCEA] stanzas will be used. Here is what I would recommend:

Example props.conf entries:

[source::udp:514]
sourcetype = SOURCEA

[SOURCEA]
SHOULD_LINEMERGE = True
BREAK_ONLY_BEFORE_DATE = True
TIME_FORMAT = %b %d %T
TIME_PREFIX = ^

BTW, if you want to match the second date, you could use a regex that matches the first portion of your event. That portion is then ignored by the timestamp recognizer, so it picks up the second date. For example:

TIME_PREFIX = ^... .. ..:..:.. \S+\s
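
A quick way to sanity-check a TIME_PREFIX regex before putting it in props.conf is to try it in Python. This is only a sketch: it assumes the relay writes single-digit days space-padded ("Jun  8", the classic syslog style), and the sample events and the text_after_prefix helper are made up.

```python
import re

# The TIME_PREFIX regex from the example above.
TIME_PREFIX = re.compile(r"^... .. ..:..:.. \S+\s")

# Made-up events: the relay's timestamp and address, then the device's own timestamp.
padded = "Jun  8 16:35:56 x.x.x.x Jun 08 16:34:06 DEVICENAME message text"
single = "Jun 8 16:35:56 x.x.x.x Jun 08 16:34:06 DEVICENAME message text"


def text_after_prefix(event):
    """Return the text that follows the prefix match, or None if there is no match."""
    m = TIME_PREFIX.search(event)
    return event[m.end():] if m else None


print(text_after_prefix(padded))  # Jun 08 16:34:06 DEVICENAME message text
print(text_after_prefix(single))  # None
```

With a single space before a one-digit day, the fixed-width pattern does not match at all, so it is worth checking what the raw data really looks like before relying on it.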

If you have some kind of complex sourcetype renaming logic because you are receiving different types of events via this single UDP input, then a slightly more complex approach will be necessary. If you provide some additional insight into the types of events you need to handle separately, someone here can provide additional recommendations. One important thing to note is whether all of your inbound events have the same timestamp format. If not, your event-breaking logic based on timestamps gets more complicated: you need to use some other props option, or you could use different input ports, if your system can support that configuration.


gkanapathy
Splunk Employee

Your config settings (if they're moved to be based on the source port in props.conf) should then work for any kind of syslog message where the time format matches.


EricPartington
Communicator

Thanks for all the feedback, everyone. The reason I am using a transform to set the sourcetype is that I have many devices all sending syslog to the same UDP port. I am trying to use existing firewall rules, so extra ports are not possible at the moment.
Thanks for the hint that a sourcetype set via transforms does not trigger the matching props.conf processing. That helps explain other things I was seeing.


gkanapathy
Splunk Employee

Oh right, yes, I missed that. Setting the sourcetype at index time via TRANSFORMS does not select rules in props.conf based on the new sourcetype at index time; it only affects search time. Sorry. Your settings would have to be based on, e.g., [source::udp:514]; i.e., just move the rules under there.


Lowell
Super Champion

Gkanapathy, did you note the TRANSFORMS-SOURCEA = SOURCEA_sourcetype entry? Doesn't renaming the sourcetype via a transform prevent the [SOURCEA] stanza from being used at index time? I wrote an answer noting that point, but I would appreciate it if you would confirm it.


gkanapathy
Splunk Employee

Your original config settings, I mean.


gkanapathy
Splunk Employee

Okay, then your config will work correctly, as long as it's in the right file.


EricPartington
Communicator

For this specific device, 90% of the syslog messages have no line breaks in them, but there are a few instances where a message contains debug information that is formatted with \n. tcpdump on the source side shows the message (including line breaks) in one syslog entry. I would like to break these messages only when a line starts with the date followed by the host; that should ensure that the debug information is kept with the rest of the syslog message.
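
The rule described above (start a new event only when a line begins with a date followed by a host) can be sketched in Python like this. It is only an illustration of the intended merging logic, not Splunk configuration; the EVENT_START pattern, the merge_lines helper, and the sample lines are hypothetical.

```python
import re

# Hypothetical pattern: "Mon DD HH:MM:SS host " at the start of a line.
EVENT_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \S+ ")

# Made-up sample: the second message carries two lines of debug output.
LINES = [
    "Jun 08 16:34:06 DEVICENAME service started",
    "Jun 08 16:34:07 DEVICENAME debug dump follows",
    "  key1=value1",
    "  key2=value2",
    "Jun 08 16:34:09 DEVICENAME service stopped",
]


def merge_lines(lines):
    """Group lines into events, starting a new event only at a date+host line."""
    events = []
    for line in lines:
        if EVENT_START.match(line) or not events:
            events.append(line)          # a new event begins here
        else:
            events[-1] += "\n" + line    # continuation of the previous event
    return events


for event in merge_lines(LINES):
    print(repr(event))
```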
