I have a script that queries a database and writes the results to a CSV file. When the file is finished being written, it is moved into a monitored directory and picked up by a universal forwarder. This happens twice a day, once at 1200 and again at 2400. The first file (at 1200) is received by our indexers fine. The second file, however, always looks similar to this:
\x002\x000\x001\x001\x00-\x000\x004\x00-\x001\x009\x00 \x001\x002\x00:\x000\x000\x00:\x000\x001\x00.\x009\x001\x007\x000\x000\x000\x000\x000\x000\x00,\x00F\x00i\x00l\x00e\x00 \x00C\x00o\x00p\x00y\x00,\x005\x000\x001\x003\x000\x004\x006\x006\x003\x00,\x00c\x00f\x00s\x00l\x00o\x00u\x0
The process is otherwise identical: same script, database, directories, query, etc. The only thing that differs is the time the script is executed. Thoughts?
That's what Splunk does with bytes that fall outside the configured character set. The default charset is UTF-8, and the \x00 after every character in your sample is the classic signature of a UTF-16 file. I'm guessing the second file isn't UTF-8, and sometimes it's okay, and sometimes it isn't?
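A quick sketch of why that pattern appears (the sample string below is hypothetical, mirroring the mangled event above): when text saved as UTF-16LE is read as if it were UTF-8 or ASCII, every ASCII character shows up followed by a NUL byte.

```python
# Sketch: UTF-16LE text read as a single-byte stream shows \x00
# interleaved between characters, exactly like the garbled events.
sample = "2011-04-19 12:00:01.917000000,File Copy"

raw = sample.encode("utf-16-le")  # each ASCII char becomes char + NUL byte
print(raw[:8])                    # b'2\x000\x001\x001\x00'

# Decoding with the correct charset recovers the line intact:
assert raw.decode("utf-16-le") == sample
```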
You can try other charsets in props.conf. The most common one I've seen outside UTF-8 is UTF-16LE.
http://www.splunk.com/base/Documentation/4.2.1/Data/Configurecharactersetencoding
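A minimal props.conf sketch, placed on whichever Splunk instance first parses this data; the sourcetype name `csv_export` is hypothetical, so substitute whatever your monitored input actually uses:

```ini
# props.conf -- hypothetical sourcetype; match it to your CSV input
[csv_export]
CHARSET = UTF-16LE
```

If the encoding genuinely varies between the noon and midnight runs, `CHARSET = AUTO` can also be worth trying so Splunk attempts detection per source.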