IPFIX v5.0.2 generates a ton of WARNING logs. Anyone else see this?

dfronck
Communicator

We had an issue with IPFIX crashing about 10 times a day. Splunk fixed that in IPFIX v5.0.2, but now we're getting 300,000 WARNING events a minute, which is over 75MB of logs a minute in /opt/splunk/var/log/ipfix.log.

Is anyone else seeing this issue?

Basically, we get an error for every number that is in the AppFlow data. Here are a couple of log lines.

2015-01-22 15:01:59,886 WARNING pid=15901 tid=MainThread file=IPFIXData.py:__init__:129 | Parsed egressInterface of type unsigned32 (4) [Id 0:14] for template 262. Data(!L): Encode '2147483651' failed because of the non-unicode data. Use 2147483651 instead.

2015-01-22 15:05:28,027 WARNING pid=15901 tid=MainThread file=IPFIXData.py:__init__:129 | Parsed netscalerFlowFlags of type unsigned64 (8) [Id 5951:132] for template 280. Data(!Q): Encode '8673961984' failed because of the non-unicode data. Use 8673961984 instead.
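
To see which fields and templates account for the bulk of the warnings, the log can be tallied with a short script. This is a minimal sketch: the regular expression is modeled only on the two sample lines above, so other message shapes in ipfix.log may not match, and `tally_warnings` is a hypothetical helper name, not part of the add-on.

```python
import re
from collections import Counter

# Pattern modeled on the two sample warnings above (an assumption);
# other ipfix.log messages may use a different layout.
WARNING_RE = re.compile(
    r"WARNING .*? \| Parsed (?P<field>\S+) of type (?P<type>\S+) \(\d+\) "
    r"\[Id (?P<id>[\d:]+)\] for template (?P<template>\d+)\."
)


def tally_warnings(lines):
    """Count warnings per (field name, declared type, template id)."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            key = (match["field"], match["type"], int(match["template"]))
            counts[key] += 1
    return counts


sample = [
    "2015-01-22 15:01:59,886 WARNING pid=15901 tid=MainThread "
    "file=IPFIXData.py:__init__:129 | Parsed egressInterface of type "
    "unsigned32 (4) [Id 0:14] for template 262. Data(!L): Encode "
    "'2147483651' failed because of the non-unicode data. Use 2147483651 instead.",
    "2015-01-22 15:05:28,027 WARNING pid=15901 tid=MainThread "
    "file=IPFIXData.py:__init__:129 | Parsed netscalerFlowFlags of type "
    "unsigned64 (8) [Id 5951:132] for template 280. Data(!Q): Encode "
    "'8673961984' failed because of the non-unicode data. Use 8673961984 instead.",
]

for key, count in tally_warnings(sample).most_common():
    print(key, count)
```

For a real log, the same function can be fed an open file object, for example `tally_warnings(open("/opt/splunk/var/log/ipfix.log"))`.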
1 Solution

dfronck
Communicator

I've been running 5.0.3 for about a week now with no issues.

Thanks for the fix!

Hopefully you can now spend some time on adding template caching between reloads! 😉

millern4
Communicator

We're also experiencing the same problem. With the previous version of IPFIX, we were only able to run for about 10 minutes or so before having to restart splunkd on the heavy forwarder.

This version as stated above does correct the crash, but we have over 2 million events for the last 15 minutes. Watching this thread awaiting an updated version of the TA.

For the time being I've stopped the TA from running and deleted the logs.

jcoates_splunk
Splunk Employee

I've just re-enabled visibility of the 5.0.1 release -- we're pretty sure we've found the right fix and will post a new maintenance version shortly. I'm sorry for the delay.

millern4
Communicator

Thank you! The AppFlow data isn't mission critical to us. We are still gathering syslog (ns_log) from the NetScalers, so we are okay with waiting for a proper fix from Splunk.

Thank you for the updates.

jcoates_splunk
Splunk Employee

We've just released version 5.0.3, which should resolve the issues found. Sorry for the delay.

tela79
New Member

Has there been any development on this?
We were thinking of going from free to a paid version of Splunk but since NS is our main source of data at the moment we've put those plans on hold until the integration actually works.

jcoates_splunk
Splunk Employee

We've just released version 5.0.3, which should resolve the issues found. Sorry for the delay.

dfronck
Communicator

Just a note that this is dropping about 80% of the logs so I'm rolling back to the one that crashes 10 times a day!

dfronck
Communicator

I'm testing a 5.0.3 beta of this and it hasn't crashed since I installed it about a day ago. No strange errors, and it doesn't seem to be dropping any logs. It does still log to ipfix.log AND splunkd.log, but there is a logging.conf.sample in defaults that might be able to correct that.

I've provided support with this same info. Hopefully it'll be released soon.

jcoates_splunk
Splunk Employee

Thanks dfronck! Got your report from support, we're going to get this out ASAP.

jcoates_splunk
Splunk Employee

5.0.3 is released, please let us know if that fixes it. Sorry for the delay.

jcoates_splunk
Splunk Employee

Instead of crashing it logs that the source tried to crash it. Your gear is still sending non-compliant data. It should stop doing that.

We use standard Python logging, so you can decrease the logging level. That will mean we are silently deleting messages instead of logging deletion, which is why we don't do it by default.
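
As a rough illustration of what decreasing the logging level means with the standard `logging` module, the sketch below raises a logger's threshold so WARNING records are discarded. The logger name `"ipfix"` and the helper `quiet_logger` are assumptions made for the example; the add-on's actual logger name and configuration mechanism may differ.

```python
import logging


def quiet_logger(name, level=logging.ERROR):
    """Raise a logger's threshold so records below `level` are discarded."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


# "ipfix" is a hypothetical logger name, not confirmed by the add-on.
ipfix_logger = quiet_logger("ipfix")

print(ipfix_logger.isEnabledFor(logging.WARNING))  # WARNING is now filtered out
print(ipfix_logger.isEnabledFor(logging.ERROR))    # ERROR still passes
```

The trade-off is the one described above: with the threshold at ERROR, dropped records are no longer reported anywhere.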

YotaLab
Engager

Hi there!

I have the same problem. Where can I download the IPFIX add-on 5.0.1, if it works?

dfronck
Communicator

So basically you're saying that ALL NetScalers running v10 are broken?

Only 1 of the templates is data from our apps. The rest is just NetScaler Built-In performance metric crap like Round Trip Times and Response Status. If that's broken for us, I'd assume that it's broken for everyone. I'll try to compare what we're getting in Splunk to what the NetScalers are sending once the NetScaler guy gets back but it appears that what's making it into Splunk is valid.

Also, we were only crashing about 10 times a day, not 300,000 times a minute!

jbennett_splunk
Splunk Employee

Is data being thrown out when you get those warnings, or is it still being logged?

dfronck
Communicator

It's being thrown out.

jcoates_splunk
Splunk Employee

@dfronck, support should have a build for you to try.

jcoates_splunk
Splunk Employee

The RFC says non-Unicode string characters are not ok. Python says the data isn't Unicode, and crashes when we parse it as Unicode. I'm open to understanding more about how we could process that scenario in a better way.
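
One possible way to handle string fields without crashing, offered only as a sketch: decode the bytes as UTF-8 and substitute the replacement character (U+FFFD) for anything invalid, rather than raising. `decode_string_field` is a hypothetical helper, not the add-on's code, and whether silently substituting characters is acceptable is a judgment call for the maintainers.

```python
def decode_string_field(raw: bytes) -> str:
    """Decode a string field as UTF-8, replacing invalid bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


# Valid UTF-8 passes through unchanged.
print(decode_string_field(b"eth0"))

# A strict decode of this input would raise UnicodeDecodeError,
# because 0x80 cannot start a UTF-8 sequence.
print(repr(decode_string_field(b"\x80abc")))
```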

dfronck
Communicator

OK, I think I'm missing something. These are supposed to be ints and floats not strings.

They're defined in the XML as "unsigned64" and "unsigned32" and "unsigned8" and "dateTimeMilliseconds" and "dateTimeMicroseconds" which I assumed were integers or floats.

The warning message says that netscalerFlowFlags should be an unsigned64.
2015-01-22 15:05:28,027 WARNING pid=15901 tid=MainThread file=IPFIXData.py:__init__:129 | Parsed netscalerFlowFlags of type unsigned64 (8) [Id 5951:132] for template 280. Data(!Q): Encode '8673961984' failed because of the non-unicode data. Use 8673961984 instead.

Are you saying that the NetScaler should be passing these numbers as Unicode strings?
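
For reference, the `!L` and `!Q` codes in the warnings appear to be Python `struct` format strings, where `!` means network (big-endian) byte order, `L` an unsigned 32-bit integer, and `Q` an unsigned 64-bit integer. The sketch below round-trips the two values from the log lines to show that they decode as plain integers with no string handling involved. How the add-on actually parses these fields is not shown in this thread, so this only illustrates the format codes.

```python
import struct

# Build the wire bytes for the two values seen in the warnings
# (network byte order, as the "!" prefix specifies).
egress_raw = struct.pack("!L", 2147483651)  # unsigned32, 4 bytes
flags_raw = struct.pack("!Q", 8673961984)   # unsigned64, 8 bytes

# Unpacking with the same format codes yields ordinary Python ints.
(egress_interface,) = struct.unpack("!L", egress_raw)
(flow_flags,) = struct.unpack("!Q", flags_raw)

print(len(egress_raw), egress_interface)
print(len(flags_raw), flow_flags)
```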

jcoates_splunk
Splunk Employee

dfronck, thanks for pointing that out. Do you have a support case open?
