Hi,
I am new to Hunk. I have integrated Hunk 6.2 with HDP 2.1, and I am trying to search CSV files that were exported using Sqoop.
My problem is that the custom fields I defined for a CSV file do not show up when I search in Hunk.
I have already added the headers in props.conf:
[csv-emp]
FIELD_NAMES = versionno,id,empid,createdby,updatedby,createddate,updateddate
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Custom
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled = false
pulldown_type = true
Search query:
index="etms" source="/apps/sqoop/db/employee_address/part-m-00000" sourcetype="csv-emp"
What do I need to do to filter by the custom fields at search time?
INDEXED_EXTRACTIONS is not supported in Hunk. However, Hunk can automatically recognize structured data files, especially CSV. In this case it fails because the files have no file extension. If the files contain headers (i.e. the first line is the header), you can do the following:
[vix]
...
vix.input.1.recordreader = com.splunk.mr.input.SimpleCSVRecordReader
vix.input.1.recordreader.csv.regex = /part-m-\d+$
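To sanity-check which files that pattern would pick up, here is a minimal Python sketch. It assumes the regex is applied as an unanchored search against the full file path (I have not confirmed how Hunk applies it internally), and the second and third paths are hypothetical examples, not files from the question.

```python
import re

# Same pattern as vix.input.1.recordreader.csv.regex above.
CSV_PATH_REGEX = re.compile(r"/part-m-\d+$")

def matches(path):
    """Return True if the path would be selected by the pattern.

    Assumption: the pattern is searched for anywhere in the path,
    not required to match the whole path.
    """
    return CSV_PATH_REGEX.search(path) is not None

paths = [
    "/apps/sqoop/db/employee_address/part-m-00000",     # path from the search query
    "/apps/sqoop/db/employee_address/_SUCCESS",         # hypothetical marker file
    "/apps/sqoop/db/employee_address/part-m-00000.gz",  # hypothetical compressed file
]

for p in paths:
    print(p, matches(p))
```

Because of the trailing `$`, only extension-less part files match; a compressed or renamed file would need a different pattern.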
If the files do not contain headers (they appear to be the output of an MR job), you should use delimiter-based KV extraction:
.../local/props.conf
[csv-emp]
REPORT-emp-fields = emp-fields
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Custom
description = Comma-separated value format. Set header and other settings in "Delimited Settings"
pulldown_type = true
.../local/transforms.conf
[emp-fields]
DELIMS = ","
FIELDS = versionno,id,empid,createdby,updatedby,createddate,updateddate
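Conceptually, the DELIMS/FIELDS pair assigns names to values by position. The Python sketch below only illustrates that idea and is not how Splunk implements it; the sample row is made up, and the naive split ignores quoted values that contain commas.

```python
# Field names in the same order as FIELDS in transforms.conf above.
FIELDS = ["versionno", "id", "empid", "createdby",
          "updatedby", "createddate", "updateddate"]
DELIM = ","

def extract_fields(raw_event):
    """Pair each delimited value with the field name at the same position.

    Illustration only: does not handle quoting or escaped delimiters.
    """
    values = raw_event.split(DELIM)
    return dict(zip(FIELDS, values))

# Hypothetical row; the real Sqoop output is not shown in this thread.
sample = "1,100,E042,admin,admin,2014-06-01,2014-06-02"
print(extract_fields(sample))
```

The point is that the order of names in FIELDS must match the column order in the file, because nothing in a header-less file ties a value to its name.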
Thanks, it works. However, I had to make one additional change.
I specified the source for the file and mapped it to the sourcetype:
[source::/apps/sqoop/db/employee_address/part-m-00000]
sourcetype = csv-emp
The other problem is that the data types of the fields are not recognized in the search results. How can I define data types in the sourcetype?
In Hunk/Splunk you don't need to define data types; they are automatically detected and converted at operation time. Are you running into any specific issues?