Intermediate forwarder problems / timestamping / transforms

pl123
Path Finder

Hi, we have a rather complicated setup, part of which uses an intermediate forwarder (a full forwarder) to pass events from a series of lightweight forwarders to a central index.

We would like to conduct all our transformations and data processing on the indexer, with the intermediate forwarder acting as nothing more than a relay point for the lightweight forwarders that point to it.

The problems I've been facing are rather strange. Originally I noticed that Splunk was not doing any transformations, field extractions or timestamping on the data that was coming through the intermediate forwarder. I am assuming that as the data passes through the forwarder it gets "cooked", so when it arrives at the indexer, the indexer assumes it does not need to do anything other than add it to the index.

So I changed the outputs.conf on the intermediate forwarder to send raw data.

[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996
sendCookedData = false

I hoped this would spur the indexer into doing the data processing. However, it just led to splunkd crashing on the intermediate forwarder every time it tried to connect to the indexer, after it spewed the messages below a few hundred times (crash report here).

Taken from the intermediate forwarder:

01-31-2011 01:52:59.846 WARN  TcpOutputProc - TcpSendThread: Connection to server 10.137.7.20:9996, fd:23 lost - retrying: Broken pipe
01-31-2011 01:52:59.846 INFO  TcpOutputProc - attempting to connect to indexer:9996...

Taken from the indexer:

<Monday Januar<90>^Yó^C! from hostname=intermediateforwarder, ip=XXXXXXXXXX, port=34447
01-31-2011 08:47:51.676 INFO  TcpInputProc - Hostname=intermediateforwarder closed connection
01-31-2011 08:47:52.444 ERROR TcpInputProc - Received unrecognized signature 

This led me to revert to sending cooked data, which worked (i.e. Splunk did not crash and the data makes it through, but unanalysed).

Current outputs.conf on the intermediate forwarder:

[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996

Then I went down the route of trying to force the indexer to reanalyze the data, by adding route=... to inputs.conf as below.

[default]

host = indexer

[splunktcp://9996]
route=has_key:_utf8:parsingQueue;has_key:_linebreaker:parsingQueue;absent_key:_utf8:parsingQueue;absent_key:_linebreaker:parsingQueue;

This nearly works: all the data now gets transformed and has its fields extracted, but about 40% of the data is not being timestamped correctly (i.e. it lands exactly X hours in the future). It's strange because the date format is exactly the same, the sourcetypes are the same, and events from the very same sources are indexed correctly or in the future seemingly at random.

I tried disabling all timestamp extraction on the indexer and just logging the events as they came in; however, this still resulted in "future events".

As such I am completely out of ideas on what my next move should be.

I know the obvious solution would be to do all the transforms on the intermediate forwarder, but this is something we would like to avoid doing.

If you have any ideas I'd be glad to hear them.

Thanks

PS: our environment is a mix of Solaris and Linux (Red Hat); all Splunk instances are 4.1.6.

gkanapathy
Splunk Employee

What you want is to just set up your intermediate forwarder as a LWF, instead of a heavy forwarder. The most significant difference between light and heavy forwarders is precisely that a heavy does the parsing of the data.

There are two things you have to modify about the base LWF installation, however:

  • Re-enable Splunk TCP input so you can receive forwarded data. In default-mode.conf:

    [pipeline:tcp]
    disabled = false
    
  • Disable or increase the throughput throttle. In limits.conf:

    [thruput]
    maxKBps = 0
    

This will give you exactly what you're asking for, a forwarder that forwards data without parsing it.
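
For completeness, here is a rough sketch of how the remaining pieces on the intermediate LWF might fit together. It is only an illustration: the receiving port 9997 is a hypothetical choice (the original post does not say which port the lightweight forwarders send to), and the output group name is simply the one from the question.

```ini
# inputs.conf on the intermediate forwarder
# ASSUMPTION: 9997 is a placeholder; use the port your LWFs actually send to.
[splunktcp://9997]

# outputs.conf on the intermediate forwarder
# (the cooked-data configuration from the original post; no sendCookedData override)
[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996
```

With this arrangement the data should reach the indexer unparsed, so the route=... override on the indexer's [splunktcp://9996] stanza is presumably no longer needed and the indexer's own props.conf and transforms.conf should apply as usual.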

pl123
Path Finder

Cheers, looks like this is what we were after!


ftk
Motivator

I know this is likely not what you want to hear, but I suggest just putting your props.conf and transforms.conf entries for your extractions etc onto your intermediate forwarder.
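
As a minimal sketch of what that might look like, assuming a made-up sourcetype called my_app_log whose events start with a timestamp such as 2011-01-31 01:52:59 (the sourcetype name, time format, time zone and regex below are all illustrative assumptions, not taken from the original post):

```ini
# props.conf on the intermediate (heavy) forwarder
# ASSUMPTION: "my_app_log" is a hypothetical sourcetype name.
[my_app_log]
# Timestamp sits at the very start of each event
TIME_PREFIX = ^
# ASSUMPTION: events look like "2011-01-31 01:52:59 ..."
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only read as many characters as the timestamp occupies
MAX_TIMESTAMP_LOOKAHEAD = 19
# ASSUMPTION: the sources log in UTC; an explicit TZ may help if events
# show up a fixed number of hours in the future
TZ = UTC
# Index-time transform; the stanza name must match transforms.conf
TRANSFORMS-drop_debug = drop_debug

# transforms.conf on the intermediate (heavy) forwarder
[drop_debug]
# ASSUMPTION: illustrative filter that discards DEBUG-level events
REGEX = \sDEBUG\s
DEST_KEY = queue
FORMAT = nullQueue
```

Since the heavy forwarder is the first full Splunk instance the data passes through, parsing-time settings like these take effect there rather than on the indexer.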

pl123
Path Finder

That's what we have ended up doing, a bit annoying but it works.
