When I use Splunk's Search & Reporting screen, it does not list any of the Interesting fields from the CSV files it indexed.
I added a Hadoop Connect input, configured as follows:
Resource name: 192.168.56.102:9000/WeatherStationInfo/
White list regex: *.txt
Set the source type: Manual
Source type: weatherInfo
Host field value: splunk
Index: weather
/opt/splunk/etc/system/local/props.conf contains:
[weatherInfo]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %H:%M:%S
pulldown_type = 1
One of the files Splunk indexed was:
Time,TemperatureF,DewpointF,PressureIn,WindDirection,WindDirectionDegrees,WindSpeedMPH,WindSpeedGustMPH,Humidity,HourlyPrecipIn,Conditions,Clouds,dailyrainin,SolarRadiationWatts/m^2,SoftwareType,DateUTC
2014-01-01 00:00:00,45.2,24.4,30.16,ENE,71,1.0,4.0,44,0.00,,,0.00,0.0,WUHU216DAVISVP2,2014-01-01 08:00:00,
2014-01-01 00:05:00,45.3,24.0,30.16,ENE,65,2.0,4.0,43,0.00,,,0.00,0.0,WUHU216DAVISVP2,2014-01-01 08:05:00,
2014-01-01 00:10:00,45.6,24.2,30.16,ENE,65,2.0,4.0,43,0.00,,,0.00,0.0,WUHU216DAVISVP2,2014-01-01 08:10:00,
...
The fields listed on the first line, such as TemperatureF, do not appear among the Interesting fields, and I cannot use them in searches. What am I doing wrong?
Thanks, Bill.
As you pointed out, the CSV header extraction succeeds when indexing the file directly with the weatherInfo sourcetype.
Did you check that the inputs.conf stanza for the Hadoop Connect input is correct?
I am still working on this problem. I have found that Splunk does process the headers when it imports the files locally, but when the files come from HDFS, the headers are not processed. Is this the expected behavior?
I just checked the Hadoop Connect inputs.conf file and its contents all look accurate, including the sourcetype. The file contains only the following:
[hdfs://192.168.56.102:9000/WeatherStationInfo/]
index = weather
sourcetype = weatherInfo
whitelist = *.txt
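One thing worth double-checking in this stanza: Splunk generally interprets whitelist as a regular expression rather than a shell wildcard, so *.txt may not match the intended files (a leading * is not even a valid regex). Assuming Hadoop Connect follows the same convention as Splunk's monitor inputs, a regex form would look like this:

[hdfs://192.168.56.102:9000/WeatherStationInfo/]
index = weather
sourcetype = weatherInfo
whitelist = \.txt$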
Try adding CHECK_FOR_HEADER = true to your props.conf stanza.
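If CHECK_FOR_HEADER does not take effect, another possible workaround (a sketch, assuming the structured-data settings FIELD_NAMES, HEADER_FIELD_LINE_NUMBER, and TIMESTAMP_FIELDS are supported in your Splunk version) is to declare the columns explicitly, so extraction no longer depends on Splunk detecting the header line on its own:

[weatherInfo]
INDEXED_EXTRACTIONS = csv
FIELD_NAMES = Time,TemperatureF,DewpointF,PressureIn,WindDirection,WindDirectionDegrees,WindSpeedMPH,WindSpeedGustMPH,Humidity,HourlyPrecipIn,Conditions,Clouds,dailyrainin,SolarRadiationWatts/m^2,SoftwareType,DateUTC
HEADER_FIELD_LINE_NUMBER = 1
TIMESTAMP_FIELDS = Time
TIME_FORMAT = %Y-%m-%d %H:%M:%S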
I tried adding the header setting, but it is still not working. Does anybody have an answer for this?
I just had Splunk index one of the files, copied locally to the Splunk machine, using the same settings I used for the other source, and Splunk successfully listed the fields in the CSV file. Could this be a problem specific to Hadoop Connect?
My next test will be to have Splunk index the files from a local directory.
I did as you suggested, but it still fails to list any of the fields in the CSV files. /opt/splunk/etc/system/local/props.conf now contains:
[weatherInfo]
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%d %H:%M:%S
pulldown_type = 1
CHECK_FOR_HEADER = true
To force Splunk to re-index the files, I removed all indexed events by executing the "./splunk clean eventdata" command. I then renamed all the files in the 192.168.56.102:9000/WeatherStationInfo/ directory so that Splunk would reprocess them.
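As a side note, ./splunk clean eventdata with no arguments wipes every index. Since only the weather index is involved here, a narrower form (run while Splunk is stopped) would be:

./splunk stop
./splunk clean eventdata -index weather -f
./splunk start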