I'm using Splunk to analyze Bro network transaction data in JSON format. I noticed that the stats command and the field summary stats show a count of 2 for unique session IDs, although the search results show only one event. After a lot of verification, I'm certain my event source does not contain duplicate events.
Thanks to this post: https://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html, I started messing with my JSON settings in props.conf. I thought this would be my fix, but I found the opposite scenario to be true for me...
In short, I'm seeing that index-time JSON field extractions result in duplicate field values, whereas search-time JSON field extractions do not.
In props.conf, this produces duplicate values, visible in stats command and field summaries:
INDEXED_EXTRACTIONS=JSON
KV_MODE=none
AUTO_KV_JSON=false
If I disable indexed extractions and use search-time extractions instead, no more duplicate field values:
#INDEXED_EXTRACTIONS=JSON
KV_MODE=json
AUTO_KV_JSON=true
From what I can tell, this behavior is different from what others reported in earlier posts. I'm running Splunk Enterprise 6.6.2 on a Debian VM and a 6.6.2 Universal Forwarder on another VM. Maybe a deployment client configuration that I have wrong somewhere is causing the odd behavior for index-time extractions, but I've had no luck finding it so far.
Using search-time extractions seems to work fine, but I'm wondering whether anyone else is seeing this or has any ideas about the root cause.
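One thing that may be worth checking, offered as an assumption rather than a confirmed cause: INDEXED_EXTRACTIONS is applied where the structured data is first parsed (with a Universal Forwarder, that is the forwarder itself), while KV_MODE and AUTO_KV_JSON are search-time settings that are read on the search head (in a single-instance deployment, the Splunk Enterprise instance). If the search-time settings only exist on the forwarder, the search head could still run its default JSON extraction on top of the indexed fields. A rough sketch of that split, using a hypothetical sourcetype name bro_json that is not my actual stanza:

```
# props.conf on the Universal Forwarder
# (stanza name bro_json is hypothetical; use the real sourcetype)
[bro_json]
INDEXED_EXTRACTIONS = JSON

# props.conf on the search head
# (same hypothetical stanza name; turns off search-time JSON extraction)
[bro_json]
KV_MODE = none
AUTO_KV_JSON = false
```

The two stanzas live in separate props.conf files on separate instances; they are shown together here only for comparison.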
Thanks.