Getting Data In

How to configure transforms.conf and props.conf to break logs into individual events by timestamp?

willial
Communicator

I have a very large file that I'm trying to subdivide. It has many different subsections; one of them, called "logs", contains any number of log messages, each starting with a timestamp (mm/dd/yyyy hh:mm:ss). My props and transforms stanzas are listed below.

Splunk currently splits the sections separated by "@@@" properly (I've left off the dozen or so other "TRANSFORMS-sourcetype" rules below), but when it gets to the logs section it puts all the logs into one gigantic event rather than pulling each timestamped entry out as its own event. I'd prefer the latter. I've been playing around with it for a while now and haven't made any headway. Does anyone have any tips?

props.conf

[source::sourcename]
BREAK_ONLY_BEFORE = @@@
SHOULD_LINEMERGE = true
TRANSFORMS-sourcetype = set_log

transforms.conf

[set_log]
REGEX = \d{2}/\d{2}/\d{4}\s\d{2}:\d{2}
FORMAT = sourcetype::log
DEST_KEY = MetaData:Sourcetype
1 Solution

willial
Communicator

I tried to answer this a minute ago but it didn't take. I'll try again:

I eventually solved the problem by using the following three lines in my props.conf stanza:

BREAK_ONLY_BEFORE = \d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}.\d{2}\s<
MUST_BREAK_AFTER = @@@
SHOULD_LINEMERGE = true

BREAK_ONLY_BEFORE is set to uniquely match the start of a log line, so each log message becomes its own event. MUST_BREAK_AFTER ends the event at the end of its section (delimited by the triple-@), which correctly separates the multi-line sections. I checked the edge cases and it looks sound.
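
For anyone trying to picture how the two break rules interact, here is a rough Python emulation run against the abridged sample from this thread. It is only a sketch of the documented behaviour of the two settings, not Splunk's actual aggregator, and it ignores other line-merging settings such as BREAK_ONLY_BEFORE_DATE and MAX_EVENTS. The function name merge_lines and the simplified timestamp pattern (no fractional seconds, no trailing "<") are assumptions made so the pattern matches the abridged sample.

```python
import re

# Simplified stand-in for the BREAK_ONLY_BEFORE pattern (assumption): matches
# the abridged sample timestamps (mm/dd/yyyy hh:mm:ss) at the start of a line.
BREAK_BEFORE = re.compile(r"^\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}")
# Stand-in for the MUST_BREAK_AFTER pattern.
BREAK_AFTER = re.compile(r"@@@")


def merge_lines(lines):
    """Group lines into events using the two break rules (rough emulation)."""
    events, current = [], []
    for line in lines:
        # BREAK_ONLY_BEFORE: a matching line starts a new event.
        if current and BREAK_BEFORE.search(line):
            events.append(current)
            current = []
        current.append(line)
        # MUST_BREAK_AFTER: a matching line closes the current event.
        if BREAK_AFTER.search(line):
            events.append(current)
            current = []
    if current:
        events.append(current)
    return events


SAMPLE = """@@@
Some Stuff
@@@
More Stuff
@@@
01/01/2000 06:06:06 Log message here
01/01/2000 05:05:05 Other log message here
@@@
Even More Stuff""".splitlines()

if __name__ == "__main__":
    for event in merge_lines(SAMPLE):
        print(event)
```

In this emulation each "@@@" line ends up as the last line of the event before it, so the final log message of a section carries the trailing delimiter with it.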

willial
Communicator

I should also note that where I said "Some Stuff" and "More Stuff" and so on, it could be one line or dozens. The unmodified input is typically several hundred lines, and I can't change how that comes in. I can only try to deal with it at index time or search time.

willial
Communicator

The input that comes in is very long. A nonspecific abridged example would be:

@@@
Some Stuff
@@@
More Stuff
@@@
01/01/2000 06:06:06 Log message here
01/01/2000 05:05:05 Other log message here
@@@
Even More Stuff

And so on. In the above, "Some Stuff" would go to one sourcetype as one event, "More Stuff" would go to another sourcetype as one event, and then the two log messages would go to the "log" sourcetype as two events. Ideally.

Now, I've seen this work before; I just don't know how to get there. If I set SHOULD_LINEMERGE to false I get the behavior I want for the logs section but not for all the others.

jrodman
Splunk Employee

We don't do two passes through the aggregator, so re-separating sections that you have already made into events isn't something that can be done. You're going to have to get them into the right events the first time around, or preprocess the data.

Transforms operate only on already-established events, modifying them. They have no power to turn one event into multiple events.
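
If preprocessing is an option, a minimal Python sketch of that route might look like the following. It assumes the format of the abridged sample in this thread: sections separated by lines containing only "@@@", and log lines starting with an mm/dd/yyyy hh:mm:ss timestamp. The names split_sections and preprocess and the "log"/"section" labels are hypothetical; how the resulting records are then handed to Splunk is left out.

```python
import re

# Assumed formats, taken from the abridged sample in this thread.
DELIMITER = "@@@"
TIMESTAMP = re.compile(r"^\d{2}/\d{2}/\d{4}\s\d{2}:\d{2}:\d{2}")


def split_sections(text):
    """Yield each section between delimiter lines as a list of lines."""
    section = []
    for line in text.splitlines():
        if line.strip() == DELIMITER:
            if section:
                yield section
            section = []
        else:
            section.append(line)
    if section:
        yield section


def preprocess(text):
    """Return (kind, text) records: one per log entry, one per other section."""
    records = []
    for section in split_sections(text):
        # A section counts as "logs" if its first line starts with a timestamp.
        if not TIMESTAMP.match(section[0]):
            records.append(("section", "\n".join(section)))
            continue
        entry = []
        for line in section:
            # A timestamped line starts a new entry; other lines continue it.
            if entry and TIMESTAMP.match(line):
                records.append(("log", "\n".join(entry)))
                entry = []
            entry.append(line)
        records.append(("log", "\n".join(entry)))
    return records
```

Each record is already a finished event, so the indexer would only need to keep them separate rather than split them again.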

somesoni2
Revered Legend

Can you provide some sample log entries?

aweitzman
Motivator

Does it help if you change SHOULD_LINEMERGE to false? It seems like making it true contributes to bundling lines into multi-line events, based on http://docs.splunk.com/Documentation/Splunk/6.1.3/Admin/Propsconf
