Getting Data In

Splunk fails to monitor zip file

sdwilkerson
Contributor

Hello,

I'm trying to have Splunk monitor standard scan reports from Foundstone (a vulnerability assessment scanner), but I repeatedly see this error in splunkd.log:

11-22-2011 17:13:26.759 -0500 ERROR ArchiveFile - In archive '/data/splunk/splunk-4.2.4/var/spool/splunk/Monthly-Full-2010-102811.csv.zip': Bad ZIP file

This zip file opens fine on a Windows system with the built-in zip support, and on Linux with "unzip".

  • Any ideas what is causing the problem?
  • Is it possible that Foundstone uses a compression algorithm that Splunk doesn't understand and if so, how can we test for this?
  • Any idea on how to get around it besides a scripted input?
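
One rough way to test the compression-algorithm theory is to list the method recorded for each member of the archive. The sketch below uses Python's standard zipfile module; `list_members` and `METHOD_NAMES` are names made up for this example. It only shows what Python reads from the archive's directory, which is not necessarily what Splunk's own archive reader does with it.

```python
import zipfile

# Compression methods that Python's zipfile module knows by name.
METHOD_NAMES = {
    zipfile.ZIP_STORED: "stored",
    zipfile.ZIP_DEFLATED: "deflate",
    zipfile.ZIP_BZIP2: "bzip2",
    zipfile.ZIP_LZMA: "lzma",
}


def list_members(zip_path):
    """Return (member name, compression method) pairs for a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (
                info.filename,
                METHOD_NAMES.get(
                    info.compress_type, "unknown (%d)" % info.compress_type
                ),
            )
            for info in archive.infolist()
        ]
```

Calling `list_members("Monthly-Full-2010-102811.csv.zip")` and printing the result would show whether any member uses something other than "stored" or "deflate". The "Method" column of `unzip -v` gives similar information.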

Thanks,
Sean

0 Karma
1 Solution

sdwilkerson
Contributor

Answering my own question.

The problem we found with Foundstone is that it saves the CSV report in a hierarchical directory structure, using Windows-style backslash characters to separate directories. This is normally OK, but I believe the Foundstone zipping function writes the first directory separator in such a way that Linux/Python interprets it as a literal backslash character in the name rather than as a directory separator.

You can see with the Linux unzip command that the file is not corrupt, but the member names look odd:

sean@ubuntu:/tmp/temp$ unzip -lvt Monthly-Full-2010-102811.csv.zip 
Archive:  Monthly-Full-2010-102811.csv.zip
    testing: 18\CSV/en/authenticated_hosts.csv   OK
    testing: 18\CSV/en/csvmanifest.xml   OK
    testing: 18\CSV/en/network_assets.csv   OK
    testing: 18\CSV/en/vulnerabilities.csv   OK
No errors detected in compressed data of Monthly-Full-2010-102811.csv.zip.

I believe that Splunk's monitoring process is doing some input validation and getting stuck on this backslash character.

The workaround I found is to write a small wrapper that unzips the file in advance, then have Splunk eat the files inside.
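
A minimal sketch of such a wrapper in Python, assuming the only problem is the backslash in the member names. It is an illustration, not the script used here; `safe_member_path` and `extract_normalized` are hypothetical names. It treats every backslash as a directory separator, skips directory entries, and skips any member whose name contains `..`.

```python
import os
import shutil
import zipfile


def safe_member_path(name, dest_dir):
    """Map a member name to a path under dest_dir, treating '\\' as '/'.

    Returns None for directory entries and for names containing '..'.
    """
    normalized = name.replace("\\", "/")
    if normalized.endswith("/"):
        return None  # directory entry, nothing to write
    parts = [p for p in normalized.split("/") if p not in ("", ".")]
    if not parts or ".." in parts:
        return None  # empty name or attempt to escape dest_dir
    return os.path.join(dest_dir, *parts)


def extract_normalized(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the list of files written."""
    written = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            target = safe_member_path(info.filename, dest_dir)
            if target is None:
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Stream the member to disk instead of loading it into memory.
            with archive.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Run from cron or a similar scheduler, something like `extract_normalized("/path/to/report.zip", "/path/to/extracted")` would leave plain CSV files in a directory that a regular Splunk monitor input can watch; both paths here are placeholders.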

I found no output options in the Foundstone management UI that could control this behavior.

Best,

Sean

sdwilkerson
Contributor

With Foundstone or some other application?

0 Karma

hartfoml
Motivator

Great, this is exactly what I needed. If it's not too much trouble, can you post the unzip code you used? Thanks ever so much. I am using Foundstone too and want to get the scan data directly without the operator having to uncompress the reports.

0 Karma

hartfoml
Motivator

I am having the same issue. Did you ever find a solution?

0 Karma