Getting Data In

Splunk insisting on Auto-finding CSV fields?

travispowell
Path Finder

Hello,

Splunk is insisting on trying to auto-find headers in a tab-delimited CSV file for which I have manually defined headers in a CONF file. I thought that putting this information in /etc/system/local would override the settings in /etc/apps/learned/, but it doesn't look like that's the case...

Here are my CONF files in /etc/system/local:

inputs.conf

[monitor:///logs/strauss_splunk/bulksession]
sourcetype=csv
source=strauss_sessions
index=strauss_sessions
host=WHITNEY

[monitor:///logs/strauss_splunk/bulkurl]
sourcetype=csv
source=strauss_url
index=strauss_url
host=WHITNEY

[monitor:///inetpub/strauss_splunk/bulkhit]
sourcetype=csv
source=strauss_hits
index=strauss_hits
host=WHITNEY

props.conf

[source::strauss_url]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-1

[source::strauss_sessions]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-2

[source::strauss_hits]
SHOULD_LINEMERGE=false
CHECK_FOR_HEADER=false
TRANSFORMS-STRAUSSTSV=STRAUSSTSV-3

transforms.conf

[STRAUSSTSV-3]
DELIMS = "  "
FIELDS = "SESSION_KEY", "HIT_KEY", "ID", "SECURE"

[STRAUSSTSV-2]
DELIMS = "  "
FIELDS = "SESSION_KEY", "ADDRESS", "CANISTER"

[STRAUSSTSV-1]
DELIMS = "  "
FIELDS = "SESSION_KEY", "HIT_KEY", "NAME", "VALUE", "TIMESTAMP"

* * * * * * * * * * * * * * * * * * * * * * * *

...and this all looks good, right? But this is what the /etc/apps/learned/ CONF files end up containing afterwards:

* * * * * * * * * * * * * * * * * * * * * * * *

props.conf

[csv-2]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-1
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true

[csv-3]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-2
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true

[csv-4]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-3
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true

[csv-5]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-4
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true

[csv-6]
KV_MODE = none
REPORT-AutoHeader = AutoHeader-5
SHOULD_LINEMERGE = False
given_type = csv
pulldown_type = true

transforms.conf

[AutoHeader-1]
DELIMS = "  "
FIELDS = "58b0c3f3c517dd9ee90cf256800dae98", "27eec7a8d949b8afe03a11b47604633b", "B99004EA29E4FA783416FAC3F7AB87A5", "N"

[AutoHeader-2]
DELIMS = "  "
FIELDS = "0f4c4f0c76bb2898ccbcfa816cfbe49b", "cbb2438acf31acf8acefacb3ff2b59a9", "EA9AE62469C9E2DCE926B32A545675EA", "N"

[AutoHeader-3]
DELIMS = "  "
FIELDS = "8ed97b717ce4b20e561a9c5a033f925c", "44bdc568a0b0052fd2232239257ebc6c", "897FFA5C1C1F078517B1FF8DB392AC54", "Y"

[AutoHeader-4]
DELIMS = "  "
FIELDS = "58b0c3f3c517dd9ee90cf256800dae98", "63.194.158.158", "LSSN_20110419_WHITNEY.dat"

[AutoHeader-5]
DELIMS = "  "
FIELDS = "becf1cd6433bd8ddf2a3f4e9da3fe133", "20c184e3ce2396fbda9d5071c8b3344d", "login_username", "XXXXXXXXXXXXXXXX", "2011-04-19 07:00:20.000"

Any ideas??

Thank you so much

1 Solution

travispowell
Path Finder

Solved: I capitulated. Don't fight the beast. I ended up saying screw it, Splunk, you can auto-extract field names for me. But I wrote a regex rule that routes the header line to nullQueue, so the first line is removed.

See here:

http://www.splunk.com/support/forum:SplunkAdministration/4081
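For reference, this is roughly what that routing looks like. It's only a sketch: the transform name is made up, and the REGEX simply assumes the header line starts with SESSION_KEY (the first field name above), so adjust it to your actual header text.

props.conf

[csv]
# send anything matching the header pattern to nullQueue (transform name is assumed)
TRANSFORMS-drop-header = strauss-drop-header

transforms.conf

[strauss-drop-header]
# assumed: the header line begins with the literal field name SESSION_KEY followed by a tab
REGEX = ^SESSION_KEY\t
DEST_KEY = queue
FORMAT = nullQueue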


JSapienza
Contributor

You might want to add this to your props.conf (a minimal sketch follows the excerpt below):

http://www.splunk.com/base/Documentation/latest/admin/Propsconf

LEARN_SOURCETYPE = [true|false]

  • Determines whether learning of known or unknown sourcetypes is enabled.
      • For known sourcetypes, refer to LEARN_MODEL.
      • For unknown sourcetypes, refer to the rule:: and delayedrule:: configuration (see below).
  • Setting this field to false disables CHECK_FOR_HEADER as well (see above).
  • Defaults to true.
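For example, a minimal sketch of how that might look in this case (the [csv] stanza name simply matches the sourcetype set in the inputs.conf above; treat it as an assumption, not a definitive config):

props.conf

[csv]
# stop Splunk from learning this sourcetype / auto-detecting headers (sketch only)
LEARN_SOURCETYPE = false
CHECK_FOR_HEADER = false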

gkanapathy
Splunk Employee

You would need to clean out the etc/apps/learned/props.conf file and reindex the data; see the rough sketch below.
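A rough sketch of what that could look like on a test box (index names are taken from the inputs.conf above; the learned app's files may sit under etc/apps/learned/local/ depending on version, so check the actual path first):

$SPLUNK_HOME/bin/splunk stop
# remove the auto-generated csv-* / AutoHeader-* stanzas from $SPLUNK_HOME/etc/apps/learned/ (props.conf and transforms.conf)
$SPLUNK_HOME/bin/splunk clean eventdata -index strauss_sessions
$SPLUNK_HOME/bin/splunk clean eventdata -index strauss_url
$SPLUNK_HOME/bin/splunk clean eventdata -index strauss_hits
$SPLUNK_HOME/bin/splunk start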

travispowell
Path Finder

This is a test box, because I have about 10+ GB/day of this stuff to index (short half-life), and I'm cleaning the index every time (> splunk clean eventdata). I got it to work with auto field extraction by inserting my own header line, but the issue there is that the header line gets included in the event count: if I have 27,804 events I don't want it to say 27,805.

JSapienza
Contributor

I think changes at this point would only apply to new events, not to events already indexed.

travispowell
Path Finder

Thanks for the suggestions, but neither of those actually fixed it. The LEARN_SOURCETYPE setting did stop Splunk from trying to auto-define fields, but it didn't let my CONF files take over...

JSapienza
Contributor

A light bulb went off when I re-read your question. You will need to use DELIMS = "\t" for tab, not a literal space or tab character inside the quotes.
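For example, the first stanza from the question rewritten with an explicit tab delimiter (a sketch, reusing the field names already defined above):

[STRAUSSTSV-1]
# escaped tab rather than a literal whitespace character inside the quotes
DELIMS = "\t"
FIELDS = "SESSION_KEY", "HIT_KEY", "NAME", "VALUE", "TIMESTAMP"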

travispowell
Path Finder

Thanks, I haven't got it to work yet, but I'll keep investigating. I think I might have to change some other things around relating to the source types.
