
Why would INDEXED_EXTRACTIONS=JSON in props.conf be resulting in duplicate values?

pumphreyaw
Explorer

I'm using Splunk to analyze Bro network transaction data in JSON format. I noticed that the stats command and the field summary show a count of 2 for unique session IDs, although the search results only show one event. After a lot of verification, I'm certain my event source does not contain duplicate events.

Thanks to this post: https://answers.splunk.com/answers/223095/why-is-my-sourcetype-configuration-for-json-events.html, I started messing with my JSON settings in props.conf. I thought this would be my fix, but I found the opposite scenario to be true for me...

In short, I'm seeing that index-time JSON field extractions result in duplicate field values, whereas search-time JSON field extractions do not.

In props.conf, this produces duplicate values, visible in stats command and field summaries:

INDEXED_EXTRACTIONS=JSON
KV_MODE=none
AUTO_KV_JSON=false

If I disable indexed extractions and use search-time extractions instead, no more duplicate field values:

#INDEXED_EXTRACTIONS=JSON
KV_MODE=json
AUTO_KV_JSON=true  

From what I can tell, this behavior is different from what others reported in earlier posts. I'm running Splunk Enterprise 6.6.2 on a Debian VM and a 6.6.2 Universal Forwarder on another VM. Maybe I have a deployment client configuration wrong somewhere that is causing weird behavior for index-time extractions, but I've had no luck finding it so far.

Using search-time extractions seems to work fine, but I'm wondering whether anyone else is seeing this or has any ideas about the root cause.

Thanks.

1 Solution

mattymo
Splunk Employee

Hey pumphreyaw!

It comes down to WHERE you make these changes. If you use INDEXED_EXTRACTIONS, that props.conf setting needs to be on the UF (your Universal Forwarder VM), and KV_MODE=none needs to be on the Search Head (aka your Splunk Enterprise VM).

From what I read above, setting INDEXED_EXTRACTIONS and disabling KV_MODE=json should work, as long as each setting lives on the right instance.

Where did you disable the KV_MODE configs?
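
A minimal sketch of that split, assuming a sourcetype named bro_json and an app folder named my_bro_app (both are placeholders, not names from this thread, so swap in your own):

```ini
# --- On the Universal Forwarder ---
# $SPLUNK_HOME/etc/apps/my_bro_app/local/props.conf  (app name is hypothetical)
[bro_json]
# Structured-data parsing happens on the forwarder, so the index-time
# JSON extraction setting belongs here.
INDEXED_EXTRACTIONS = json

# --- On the Search Head ---
# $SPLUNK_HOME/etc/apps/my_bro_app/local/props.conf  (app name is hypothetical)
[bro_json]
# Turn off search-time JSON extraction so the same fields are not
# extracted a second time on top of the indexed ones.
KV_MODE = none
AUTO_KV_JSON = false
```

Keep in mind that index-time settings only apply to data indexed after the change, so events that are already indexed keep whatever fields they were indexed with.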

- MattyMo


joesrepsolc
Communicator

Does an easy-to-read list exist of WHERE to use each of these options in props.conf? I run into this from time to time and it's not 100% clear to me WHERE they need to go.

Sometimes it clearly says "input time" in this reference (https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Propsconf), but other times it doesn't, and I'm not sure what that means then.

 

Any help would be GREAT!!!

0 Karma

mallempati
New Member

Hi @mmodestino,

Removing INDEXED_EXTRACTIONS = json from props.conf on the UF fixed the duplicates, but it introduced another issue: sometimes a few JSON event lines are missing.

KV_MODE = none
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = requests.Time
category = Structured
disabled = false
pulldown_type = true

Any idea how to fix this issue?

0 Karma

jperry_intact
New Member

I cannot get this to work for the life of me. I can get the JSON events to index only once if I upload the file and select the sourcetype. If I set it up as a monitor input for the same sourcetype and the same files, I get duplicate events. Initially I was getting duplicate events (the same event listed twice) and duplicate field extractions (1 field, 2 identical values). Adding INDEXED_EXTRACTIONS = JSON seemed to fix the duplicate field extractions.

It's a single-server install on my local machine, and I have tried creating the props.conf entry below in both C:\Program Files\Splunk\etc\system\local and C:\Program Files\Splunk\etc\apps\INSERTAPPNAMEHERE\local, and no dice.

[FishNPickles]
INDEXED_EXTRACTIONS = JSON
TIMESTAMP_FIELDS = properties.LastUpdateTime
TZ = UTC
AUTO_KV_JSON = false
DATETIME_CONFIG =
KV_MODE = none
SHOULD_LINEMERGE = false
category = Custom
description = PicklesNFish
disabled = false
pulldown_type = true

Is there some secret sauce to this I'm missing? It just straight up ignores the KV_MODE settings and is still indexing my entities twice.

Any direction you could provide would be ultra awesome and greatly appreciated!

0 Karma

jperry_intact
New Member

I have apparently done something horrible to my local install. I brought up a new host and your solution works great.

Who knows...

0 Karma

pumphreyaw
Explorer

I think you nailed it. The props.conf file I'm modifying in this case belongs to a deployment app that's getting pushed to the UF, none of which is going to the Search Head. I see I need to split these props settings up accordingly. I'll give that a try. Thanks for the help and quick reply.

0 Karma

mattymo
Splunk Employee

Awesome, I have converted the comment to an answer. Let me know if it works!

- MattyMo
0 Karma

pumphreyaw
Explorer

Yep, that worked perfectly. Oversight on my part, just needed to put things in the right place.

Thanks mmodestino!

0 Karma