Getting Data In

Index large JSON files

moneybox
Explorer

Hello,

While using Splunk we have run into several problems with JSON indexing, which led us to upgrade our Splunk servers.
We are trying to index large JSON files.
The problem we've encountered is that the Splunk indexer isn't extracting all the fields from the JSON file.
The symptoms are:

    1. If we run "|table field_name", we don't get the value of the field.
    2. If we run "|spath field_name |table field_name", we sometimes get the value twice.
    3. We don't see the field in the fields list to the left of the events in the Search app.

In the new version of Splunk (6.2.4) we've noticed that there are two options for extracting the data fields from JSON files.
The first is INDEXED_EXTRACTIONS and the second is KV_MODE.
I would appreciate your help in understanding which one is better for improving search time when we look for the value of a field in the JSON,
for example: index=index_name field_name="value"
and which one solves the problems described above.

Thanks


rsennett_splunk
Splunk Employee

You have said that the file is large, but you haven't said whether the individual events are large. Knowing that would help you determine which solution (search-time or index-time field extraction) performs better.
KV_MODE = JSON will extract the fields at search time automagically.

INDEXED_EXTRACTIONS=JSON does it at index time. This might return data a bit faster, but keep in mind that indexed fields take up additional space on disk.
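
For illustration only, here is roughly how the two options might look in props.conf. This is a sketch under assumptions, not a tested configuration: the sourcetype name my_json is hypothetical, and where each stanza belongs (search head versus the instance that parses the data) depends on your deployment. Setting KV_MODE = none alongside index-time extraction is a commonly suggested way to keep the same fields from being extracted a second time at search time, which may or may not be what is causing the doubled values here.

```
# props.conf -- "my_json" is a hypothetical sourcetype name

# Option A: search-time extraction (typically configured on the search head)
[my_json]
KV_MODE = json

# Option B: index-time extraction (configured where the data is parsed).
# Use instead of Option A, not together with it.
# [my_json]
# INDEXED_EXTRACTIONS = json
#
# ...and, on the search head, turn off search-time JSON extraction
# for the same sourcetype so fields are not extracted twice:
# [my_json]
# KV_MODE = none
```

Whichever option you pick, re-run the "|table field_name" search on newly indexed data afterwards, since index-time settings only affect events indexed after the change.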

If you get the value twice with spath, you might want to be a bit more specific than spath field_name, as your structure is perhaps causing unintended iteration (single value? multivalue? repeated names?).

It would help if you showed a sample event. (Edit the question; please do not add it as a new question or a comment.)

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!