Solved: Intermediate forwarder problems / timestamping / transforms

pl123 · ‎01-31-2011

Hi, we have a rather complicated setup, part of which uses an intermediate forwarder (a full forwarder) to pass events from a series of lightweight forwarders to a central indexer.

We would like to conduct all our transformations and data processing on the indexer, with the intermediate forwarder acting as nothing more than a relay point for the lightweight forwarders that point to it.

The problems I've been facing are rather strange. Originally I noticed that Splunk was not doing any transformations, field extractions or timestamping on the data coming through the intermediate forwarder. I am assuming that the data gets "cooked" as it passes through the forwarder, so when it arrives at the indexer, the indexer assumes it does not need to do anything other than add it to the index.

So I changed the outputs.conf on the intermediate forwarder to send raw data.

[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996
sendCookedData = false

I hoped this would spur the indexer into doing the data processing. However, it just led to splunkd crashing on the intermediate forwarder every time it tried to connect to the indexer, after it spewed the messages below a few hundred times (crash report here).

Taken from the intermediate forwarder:

01-31-2011 01:52:59.846 WARN  TcpOutputProc - TcpSendThread: Connection to server 10.137.7.20:9996, fd:23 lost - retrying: Broken pipe
01-31-2011 01:52:59.846 INFO  TcpOutputProc - attempting to connect to indexer:9996...

Taken from the indexer:

<Monday Januar<90>^Yó^C! from hostname=intermediateforwarder, ip=XXXXXXXXXX, port=34447
01-31-2011 08:47:51.676 INFO  TcpInputProc - Hostname=intermediateforwarder closed connection
01-31-2011 08:47:52.444 ERROR TcpInputProc - Received unrecognized signature
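
A possible reading of the "unrecognized signature" error, offered as an assumption rather than a confirmed diagnosis: a `splunktcp` input expects Splunk's cooked forwarder protocol, so a stream sent with `sendCookedData = false` would normally be received on a plain `tcp` input instead. A minimal sketch of such a receiving stanza on the indexer follows; port 9997 and the sourcetype name are hypothetical and do not come from this setup.

```ini
# inputs.conf on the indexer (sketch; port 9997 and the sourcetype are hypothetical)

# Cooked data from Splunk forwarders: the listener used in this thread.
[splunktcp://9996]

# Raw, uncooked TCP stream: a separate port with a plain tcp input.
[tcp://9997]
sourcetype = my_raw_feed
```

A raw stream carries only the event text, so the host, source and sourcetype assigned by the original forwarders would not be expected to arrive with it.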

This led me to revert to sending cooked data, which worked (i.e. Splunk did not crash and the data makes it through, but unanalysed).

Current outputs.conf on the intermediate forwarder:

[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996

Then I went down the route of trying to force the indexer to reanalyze the data, by adding route=... to inputs.conf as below.

[default]

host = indexer

[splunktcp://9996]
route=has_key:_utf8:parsingQueue;has_key:_linebreaker:parsingQueue;absent_key:_utf8:parsingQueue;absent_key:_linebreaker:parsingQueue;

This nearly works: all the data now gets transformed and has its fields extracted, but about 40% of the data is not being correctly timestamped (i.e. it lands exactly X hours in the future). It's strange because the date format is exactly the same and the sourcetypes are the same, and events from the very same sources appear either correctly indexed or indexed in the future, seemingly at random.

I tried disabling all timestamp extraction on the indexer and just logging the events as they came in; however, this still resulted in "future events".
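
The thread does not establish why some events land a whole number of hours ahead. One thing that may be worth ruling out is time zone interpretation: when a timestamp carries no zone, the instance that parses it has to assume one. The sketch below shows props.conf settings that pin the timestamp format and zone explicitly on whichever instance does the parsing; the sourcetype name, format string and zone are all hypothetical placeholders, not values from this environment.

```ini
# props.conf (sketch; the stanza name and every value are hypothetical)
[my_sourcetype]
# Timestamp sits at the start of each event
TIME_PREFIX = ^
# strptime-style layout, e.g. 2011-01-31 01:52:59
TIME_FORMAT = %Y-%m-%d %H:%M:%S
# Only look this many characters past TIME_PREFIX for the timestamp
MAX_TIMESTAMP_LOOKAHEAD = 19
# Zone to assume when the timestamp itself names none
TZ = UTC
```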

As such I am completely out of ideas on what my next move should be.

I know the obvious solution would be to do all the transforms on the intermediate forwarder, but this is something we would like to avoid doing.

If you have any ideas I'd be glad to hear them.

Thanks

PS: our environment is a mix of Solaris and Linux (Red Hat); all Splunk instances are 4.1.6.

gkanapathy · ‎02-01-2011

What you want is to just set up your intermediate forwarder as a LWF, instead of a heavy forwarder. The most significant difference between light and heavy forwarders is precisely that a heavy does the parsing of the data.

There are two things you have to modify about the base LWF installation, however:

Re-enable Splunk TCP input so you can receive forwarded data. In default-mode.conf:
```
[pipeline:tcp]
disabled = false
```
Disable or increase the throughput throttle. In limits.conf:
```
[thruput]
maxKBps = 0
```

This will give you exactly what you're asking for, a forwarder that forwards data without parsing it.
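
Put together, the relay's configuration could look roughly like the sketch below. The default-mode.conf and limits.conf stanzas are the two changes described above, and the outputs.conf stanzas are the ones from the original question; the `[splunktcp://9997]` listening port is an assumption, since the thread never states which port the lightweight forwarders send to.

```ini
# default-mode.conf: re-enable the Splunk TCP input pipeline on the LWF
[pipeline:tcp]
disabled = false

# limits.conf: 0 removes the throughput cap
[thruput]
maxKBps = 0

# inputs.conf: listen for the upstream lightweight forwarders
# (port 9997 is hypothetical)
[splunktcp://9997]

# outputs.conf: forward to the indexer, as in the question
[tcpout]
defaultGroup = blahblah
disabled = false

[tcpout:blahblah]
server = indexer:9996
```

With this arrangement props.conf and transforms.conf stay on the indexer, since the relay forwards the data without parsing it.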

pl123 · ‎02-01-2011

Cheers, looks like this is what we were after!

ftk · ‎01-31-2011

I know this is likely not what you want to hear, but I suggest just putting your props.conf and transforms.conf entries for your extractions etc onto your intermediate forwarder.
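
As a rough illustration of what that involves, index-time settings live in props.conf and transforms.conf on the instance that parses the data, which here would be the heavy intermediate forwarder. The sourcetype, transform name and regex below are hypothetical placeholders; the example drops events containing DEBUG by routing them to the null queue.

```ini
# props.conf on the intermediate (heavy) forwarder
# (sketch; stanza and transform names are hypothetical)
[my_sourcetype]
TRANSFORMS-drop_debug = drop_debug_events

# transforms.conf on the same instance
[drop_debug_events]
# Match any event containing the word DEBUG
REGEX = \bDEBUG\b
# Send matching events to the null queue so they are discarded
DEST_KEY = queue
FORMAT = nullQueue
```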

pl123 · ‎02-01-2011

That's what we have ended up doing; a bit annoying, but it works.
