How can I remove part of a single-line event and keep the rest using transforms.conf?

Masa
Splunk Employee

How can I remove part of a single-line event and keep the rest using transforms.conf?
(Note: I originally said, by mistake, that I wanted to keep only 6k bytes. Sorry for the confusion.)

I have syslog-type data. Each event is a single line, and some are more than 64 KB long.
I do not need the leading timestamp and host strings, because a syslog server added them.
I would like to keep the rest.
I created the following props.conf and transforms.conf settings, but they do not work as expected.
I know SEDCMD can do the same job.
But why does transforms.conf not work?

  • props.conf

    [syslog-cef]
    SHOULD_LINEMERGE = false
    TRANSFORMS-keep6k = removeHeader_keeprest

  • transforms.conf

    [removeHeader_keeprest]
    REGEX = ^\w{3}\s+\d{1,2}\s+(?:\d{2}:){2}\d{2}\s+[\w.]+\s(.+)
    DEST_KEY = _raw
    FORMAT = $1

Only 4052 bytes of each event were indexed.
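
For comparison, the SEDCMD approach mentioned above could be sketched as follows. The class name removeHeader is made up for illustration, and the time pattern is written as (?:\d{2}:){2}\d{2} on the assumption that the timestamps look like 12:34:56.

    # props.conf (sketch; the class name "removeHeader" is hypothetical)
    [syslog-cef]
    SHOULD_LINEMERGE = false
    # Replace the leading syslog timestamp and host with nothing
    SEDCMD-removeHeader = s/^\w{3}\s+\d{1,2}\s+(?:\d{2}:){2}\d{2}\s+[\w.]+\s//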

MuS
SplunkTrust

Hi Masa,

Did you check the LOOKAHEAD option in transforms.conf?
From the docs http://docs.splunk.com/Documentation/Splunk/6.2.5/Admin/Transformsconf :

LOOKAHEAD = <integer>
* NOTE: This option is valid for all index time transforms, such as index-time
  field creation, or DEST_KEY modifications.
* Optional. Specifies how many characters to search into an event.
* Defaults to 4096. You may want to increase this value if you have event line lengths that 
  exceed 4096 characters (before linebreaking).
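
Applied to the stanza in the question, this might look like the sketch below. The value 131072 is only an assumption, chosen to exceed the 64k-plus events described; pick a value larger than your longest event. The time pattern is written as (?:\d{2}:){2}\d{2}, assuming timestamps like 12:34:56.

    # transforms.conf (sketch)
    [removeHeader_keeprest]
    REGEX = ^\w{3}\s+\d{1,2}\s+(?:\d{2}:){2}\d{2}\s+[\w.]+\s(.+)
    DEST_KEY = _raw
    FORMAT = $1
    # Assumed value; should be at least as long as the longest event
    LOOKAHEAD = 131072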

cheers, MuS

PS: Thanks for this amazing wiki http://wiki.splunk.com/Community:Test:How_Splunk_behaves_when_receiving_or_forwarding_udp_data !

Masa
Splunk Employee

Thanks, MuS !

Yes, that's what we needed! I should have read the spec file...
Because LOOKAHEAD limits how far into the event the regex searches, (.+) could only capture what was left of the first 4096 characters after the header was removed.

Again, the default LOOKAHEAD is 4096 characters, and my regex removed 44 characters from the beginning of the line. As a result, only 4096 - 44 = 4052 characters of the event were indexed.

This attribute is important when _raw or a field value is longer than 4k characters.

P.S. Thanks for recognizing the wiki document. The Splunk doc team is planning to add a refined, concise version to our official docs. The wiki doc is very verbose and would not fit there as it is 🙂

woodcock
Esteemed Legend

Your capture group should be (.{0,6144}).

Masa
Splunk Employee

woodcock, thank you for pointing that out. It was my mistake; I did not mean to say I wanted to keep only 6k.
Otherwise, yes, you're right about the regex if I want to keep up to 6k characters.
Sorry for the confusion.
By the way, even with the regex you suggested, the transform will not capture 6k from an event larger than 6k, because the default LOOKAHEAD of 4096 still applies.
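
A sketch of that "keep at most 6k" variant, assuming LOOKAHEAD has to cover the header plus the 6144 captured characters. The stanza name and the value 8192 are illustrative and not from the original post.

    # transforms.conf (sketch; the stanza name is hypothetical)
    [removeHeader_keep6k]
    REGEX = ^\w{3}\s+\d{1,2}\s+(?:\d{2}:){2}\d{2}\s+[\w.]+\s(.{0,6144})
    DEST_KEY = _raw
    FORMAT = $1
    # Assumed value; anything comfortably above header length + 6144
    LOOKAHEAD = 8192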
