Applying Quarantine .... Removing quarantine

abhayneilam
Contributor

Hi,
I have a UF with the Splunk_TA_nix app installed. It was working fine, but it suddenly started logging these errors in splunkd.log, and data is now being sent to the indexers only intermittently.

02-03-2015 12:05:39.119 +0100 INFO  TailingProcessor - Could not send data to output queue (parsingQueue), retrying...
02-03-2015 12:05:50.632 +0100 WARN  TcpOutputProc - Cooked connection to ip=XXXXXXXXXXX:9997 timed out
02-03-2015 12:05:56.872 +0100 INFO  ExecProcessor - Ran script: /opt/SP/apps/splunkforwarder/Splunkforwarder-5.0/etc/apps/Splunk_TA_nix/bin/ps.sh, took 74.28 milliseconds to run, 11510 bytes read
02-03-2015 12:05:57.594 +0100 INFO  ExecProcessor - Ran script: /opt/SP/apps/splunkforwarder/Splunkforwarder-5.0/etc/apps/Splunk_TA_nix/bin/cpu.sh, took 1043.8 milliseconds to run, 1003 bytes read
02-03-2015 12:06:00.636 +0100 WARN  TcpOutputFd - Connect to XXXXXX:9997 failed. Connection refused
02-03-2015 12:06:00.636 +0100 ERROR TcpOutputFd - Connection to host=XXXXXXXXXXX:9997 failed
02-03-2015 12:06:00.636 +0100 WARN  TcpOutputProc - Applying quarantine to ip=XXXXXXXX port=9997 _numberOfFailures=2

My outputs.conf is as follows:

[tcpout]
defaultGroup = default-autolb-group

[tcpout:default-autolb-group]
server = AABBCCDD:9997
useACK=true
sendCookedData = true


[tcpout-server://AABBCCDD:9997]

Note: AABBCCDD is the IP of the load balancer server.

Data is appearing in the dashboard, but not continuously: sometimes the last 3 hours are missing, sometimes the last 30 minutes. Please help!

Cheers,

MuS
SplunkTrust

Hi abhayneilam,

If something suddenly stops working, there was usually a change somewhere.
Check the load balancer and the server OS: "Connection refused" means that the target machine actively rejected the connection.
Check any firewall in between, and also consider the routing settings.
Don't forget to check whether splunkd is running on the indexers.
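
The checks above can be started from the forwarder host with a quick TCP probe. The following is only a minimal sketch, not a Splunk tool: the `probe` function is a hypothetical helper, and it assumes that port 9997 is the receiving port, as in the outputs.conf posted above. It distinguishes an actively refused connection from one that times out, which matches the two kinds of warnings in the splunkd.log excerpt.

```python
import socket


def probe(host: str, port: int = 9997, timeout: float = 5.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The target (or a device in front of it) answered with a reset:
        # nothing is listening on that port, or it is actively rejected.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: often a firewall or routing
        # problem, or an overloaded receiver.
        return "timeout"
    except OSError as exc:
        # Anything else, for example DNS failure or no route to host.
        return f"error: {exc}"
```

Calling `probe("AABBCCDD")` with the load balancer address, and then with each indexer address directly, may help narrow down whether the load balancer or the indexers behind it are refusing the connection.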

May I ask why you are not using the universal forwarder's internal load-balancing method?

cheers, MuS

abhayneilam
Contributor

We have had this configured for the past year with no issues until yesterday. Suddenly, and I don't know why, the quarantine issue appeared:

02-03-2015 09:47:19.246 +0100 INFO TcpOutputProc - Removing quarantine from idx=XXXXX:9997
02-03-2015 09:47:39.247 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out
02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Applying quarantine to ip=XXXXX port=9997 _numberOfFailures=2
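
To see how often the forwarder cycles between applying and removing quarantine, the relevant splunkd.log lines can be tallied. This is a rough sketch that assumes the line layout shown in the excerpts in this thread (timestamp, level, component, message); `summarize` and `LINE_RE` are hypothetical names, and the sample lines are copied from the log excerpt above.

```python
import re
from collections import Counter

# Matches lines such as:
# 02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Applying quarantine to ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})\s+"
    r"(?P<level>\w+)\s+(?P<component>\w+) - (?P<msg>.*)$"
)


def summarize(lines):
    """Count quarantine and connection-failure messages in splunkd.log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith("Applying quarantine"):
            counts["quarantine_applied"] += 1
        elif msg.startswith("Removing quarantine"):
            counts["quarantine_removed"] += 1
        elif "timed out" in msg:
            counts["timed_out"] += 1
        elif "Connection refused" in msg:
            counts["refused"] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "02-03-2015 09:47:19.246 +0100 INFO TcpOutputProc - Removing quarantine from idx=XXXXX:9997",
        "02-03-2015 09:47:39.247 +0100 WARN TcpOutputProc - Cooked connection to ip=XXXXX:9997 timed out",
        "02-03-2015 09:47:39.248 +0100 WARN TcpOutputProc - Applying quarantine to ip=XXXXX port=9997 _numberOfFailures=2",
    ]
    print(dict(summarize(sample)))
```

Passing an open file handle for splunkd.log instead of the sample list would give totals for the whole log, which could then be compared against the times when data is missing from the dashboard.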

MuS
SplunkTrust

So what changed yesterday? You're obviously no longer able to connect to port 9997 on IP XXXXX.

abhayneilam
Contributor

Nothing was changed! That's why it is strange; it happened suddenly.

mmensch
Path Finder

Did you ever resolve this issue? If so, how?
