Splunk Add-on for Amazon Web Services: How to get a CSV file stored in Amazon S3 to properly split at index-time?

jpvlsmv · 02-05-2015

I'm having trouble getting a CSV file that I've stored in Amazon S3 to properly split at index-time.

I'm using the Splunk Add-on for AWS, which allows me to define an S3 bucket to monitor. It pulls the data down just fine when a new CSV is uploaded:

[aws_s3://s3_autoruns]
disabled = false
aws_account = Splunk Reader
bucket_name = mybucket
index = jm
initial_scan_datetime = default
interval = 30
max_items = 100000
max_retries = 10
recursion_depth = 3
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$
blacklist = .*
character_set = UTF-16LE

In my props.conf I have a working transform (it sets the host field to part of the S3 URL), so I know this stanza is matching this data.

[source::.../autoruns.txt]
TRANSFORMS-s3host = transform-s3-integhost
DATETIME_CONFIG=CURRENT

With this, I get an event per line of the file.

I think I should be able to add to my props.conf:

INDEXED_EXTRACTIONS=CSV
FIELD_NAMES=Time,EntryLocation,Entry,Enabled,Category,Description,Publisher,ImagePath,LaunchString,MD5,SHA-1,SHA-256
FIELD_DELIMITER=,

But when I do that, it does not change anything. I still get one event per line, and no EntryLocation field to search on.

Any thoughts?

Thanks,
--Joe

dmaislin_splunk · 02-06-2015

I have run into a similar issue when streaming data into Splunk via a scripted input. In the interim, please use the DELIMS option in transforms.conf for search-time field extractions:

http://docs.splunk.com/Documentation/Splunk/6.2.1/Admin/transformsconf
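
A minimal sketch of that search-time approach, assuming the events keep the s3_autoruns sourcetype from the input stanza above. The transform name autoruns_csv_fields and the REPORT class name autoruns are hypothetical and chosen only for illustration; the field list is copied from the FIELD_NAMES line in the question.

```
# transforms.conf
# Hypothetical stanza name; DELIMS splits each event on commas,
# FIELDS assigns names to the resulting values in order.
[autoruns_csv_fields]
DELIMS = ","
FIELDS = "Time","EntryLocation","Entry","Enabled","Category","Description","Publisher","ImagePath","LaunchString","MD5","SHA-1","SHA-256"

# props.conf
# Hypothetical REPORT class name; applies the transform at search time
# to events with the sourcetype set in the aws_s3 input.
[s3_autoruns]
REPORT-autoruns = autoruns_csv_fields
```

Because this is a search-time extraction, it should apply to data that is already indexed without re-ingesting the files. If key cleaning is left at its default, field names containing hyphens, such as SHA-1, will probably show up with underscores instead (for example SHA_1), so it is worth checking the field list after the first search.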

jpvlsmv · 02-05-2015

If I mirror the S3 bucket to a local directory and monitor it, it splits nicely:

[monitor:///data]
disabled = 0
crcSalt = <SOURCE>
index = jm
sourcetype = s3_autoruns
whitelist = .*/autoruns.txt$

--Joe

Splunk Add-on for Amazon Web Services: How to get a CSV file stored in Amazon S3 to properly split at index-time?