I have a text file that I cannot index. I KNOW it's text: I can open the file in vi with :set list and there are no hidden characters or formatting. In fact, I deleted the file and recreated it from scratch with the name "wtmp", and STILL I cannot index it because Splunk claims it is binary! What is going on?
Splunk will not index binary files, and certain file names and file types are treated as binary by default.
See these stanzas in $SPLUNK_HOME/etc/system/default/props.conf:
[source::....(0t|a|ali|asa|au|bmp|cg|cgi|class|d|dat|deb|del|dot|dvi|dylib|elc|eps|exe|ftn|gif|hlp|hqx|hs|icns|ico|inc|iso|jame|jin|jpeg|jpg|kml|la|lhs|lib|lo|lock|mcp|mid|mp3|mpg|msf|nib|o|obj|odt|ogg|ook|opt|os|pal|pbm|pdf|pem|pgm|plo|png|po|pod|pp|ppd|ppm|ppt|prc|ps|psd|psym|pyc|pyd|rast|rb|rde|rdf|rdr|rgb|ro|rpm|rsrc|so|ss|stg|strings|tdt|tif|tiff|tk|uue|vhd|xbm|xlb|xls|xlw)]
sourcetype = known_binary
[lastlog]
invalid_cause = binary
LEARN_MODEL = false
[wtmp]
invalid_cause = binary
LEARN_MODEL = false
[known_binary]
is_valid = False
invalid_cause = binary
LEARN_MODEL = false
Any file with one of the extensions listed in the source stanza, or any file named wtmp or lastlog, will not be indexed because Splunk considers it binary. If your log has one of these reserved names, rename it and it should be indexed.
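If you want to check a batch of file names before pointing Splunk at them, here is a rough Python sketch of the rule described above. It is not Splunk code and does not read props.conf: the extension list is copied from the stanza quoted in this answer, the function name looks_reserved is made up for illustration, and the case-sensitive comparison is an assumption. Your own props.conf may differ, so treat the result as a hint only.

```python
import os

# Extension list copied from the [source::....(...)] stanza quoted above.
KNOWN_BINARY_EXTENSIONS = frozenset(
    "0t|a|ali|asa|au|bmp|cg|cgi|class|d|dat|deb|del|dot|dvi|dylib|elc|eps|exe|"
    "ftn|gif|hlp|hqx|hs|icns|ico|inc|iso|jame|jin|jpeg|jpg|kml|la|lhs|lib|lo|"
    "lock|mcp|mid|mp3|mpg|msf|nib|o|obj|odt|ogg|ook|opt|os|pal|pbm|pdf|pem|pgm|"
    "plo|png|po|pod|pp|ppd|ppm|ppt|prc|ps|psd|psym|pyc|pyd|rast|rb|rde|rdf|rdr|"
    "rgb|ro|rpm|rsrc|so|ss|stg|strings|tdt|tif|tiff|tk|uue|vhd|xbm|xlb|xls|"
    "xlw".split("|")
)

# File names this answer describes as reserved.
RESERVED_NAMES = frozenset({"wtmp", "lastlog"})


def looks_reserved(path):
    """Return True if the file name would likely be treated as binary.

    Hypothetical helper: it only approximates the defaults quoted above
    and compares names case-sensitively (an assumption).
    """
    name = os.path.basename(path)
    if name in RESERVED_NAMES:
        return True
    # The source pattern requires a literal dot followed by a listed extension.
    _, dot, ext = name.rpartition(".")
    return bool(dot) and ext in KNOWN_BINARY_EXTENSIONS


if __name__ == "__main__":
    for candidate in ("/var/log/wtmp", "/var/log/app.dat", "/var/log/app.log"):
        print(candidate, looks_reserved(candidate))
```

Note that a renamed file such as wtmp.log passes this check, since neither the full name nor the extension is on the list.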
If you are trying to index wtmp itself, this post explains how to do it:
http://splunk-base.splunk.com/answers/5844/can-i-splunk-my-wtmp-files