Ingesting JSON-formatted data in Splunk

Shashank_87
Explorer

Hi, I am trying to upload a file containing JSON-formatted data like the sample below, but it is not being ingested properly. I tried two approaches:

  1. With the sourcetype set to automatic, Splunk creates a separate event for the timestamp line.
  2. With the sourcetype set to _json, the timestamp does not appear in the event at all.

Tue 21 Apr 14:16:26 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4466.046 MB","used.jvm.memory": "1573.752 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.03 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}

Tue 21 Apr 14:16:36 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4456.382 MB","used.jvm.memory": "1583.415 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.439 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}

Is there a way to ingest/upload this data properly?

Tue 21 Apr 14:16:26 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4466.046 MB","used.jvm.memory": "1573.752 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.03 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}
Tue 21 Apr 14:16:36 BST 2020
{"items":[{"cpu.load": "0.97","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4456.382 MB","used.jvm.memory": "1583.415 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3874.439 MB","total.used.physical.system.memory": "12.782 GB","number.of.cpus": "8"}]}
Tue 21 Apr 14:16:46 BST 2020
{"items":[{"cpu.load": "0.84","total.jvm.memory": "6039.798 MB","free.jvm.memory": "4449.94 MB","used.jvm.memory": "1589.858 MB","total.physical.system.memory": "16.656 GB","total.free.physical.system.memory": "3867.042 MB","total.used.physical.system.memory": "12.789 GB","number.of.cpus": "8"}]}

harsmarvania57
Ultra Champion

Hi,

Your raw data contains a timestamp (Tue 21 Apr 14:16:26 BST 2020) followed by valid JSON. The event as a whole is therefore not valid JSON, so you can't use the _json sourcetype or INDEXED_EXTRACTIONS=json.

At search time, you can use a regex to isolate the JSON blob and then spath to extract fields from it.

For example:

your_base_query | rex field=_raw "(?<ext_json>{[^}]+}]})" | spath input=ext_json
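
For anyone who wants to check the idea outside Splunk, here is a minimal Python sketch of the same two steps: isolate the JSON blob with a regex, then parse it. It is only an approximation of what rex and spath do, not Splunk's implementation. The sample event is a shortened, hypothetical version of the data in the question, and the names `event`, `pattern`, and `cpu_load` are illustrative only. Python spells named groups as `(?P<name>...)`, and the braces are escaped here for safety.

```python
import json
import re

# Hypothetical single event: a timestamp line followed by the JSON payload.
event = (
    'Tue 21 Apr 14:16:26 BST 2020\n'
    '{"items":[{"cpu.load": "0.97","used.jvm.memory": "1573.752 MB"}]}'
)

# The event as a whole is not valid JSON because of the leading timestamp.
try:
    json.loads(event)
    whole_event_is_json = True
except json.JSONDecodeError:
    whole_event_is_json = False

# Same pattern as the rex command above, written in Python's regex dialect.
pattern = re.compile(r"(?P<ext_json>\{[^}]+\}\]\})")

match = pattern.search(event)
payload = json.loads(match.group("ext_json"))

# Values arrive as strings, so convert before doing arithmetic on them.
first_item = payload["items"][0]
cpu_load = float(first_item["cpu.load"])
```

Note that the pattern relies on the payload ending in `}]}` with no nested objects inside each item, which holds for the sample data in this thread.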

Shashank_87
Explorer

@harsmarvania57 Thanks for the response, but how would I upload the data in the first place? Which sourcetype should I use?
If I use automatic, the timestamp line comes in as a separate event.

harsmarvania57
Ultra Champion

Create your own sourcetype, for example app_json.

Shashank_87
Explorer

@harsmarvania57 I have already tried that and, as I said, it creates a separate event containing just the timestamp. I don't want that. I want the whole thing in a single event, because I need the timestamp value in my report. I have attached a screenshot where you can see two separate events, although they are a single event in the log file.

harsmarvania57
Ultra Champion

I can't see any screenshot. Also, please provide your raw data in code format (use the 101010 button).

Shashank_87
Explorer

@harsmarvania57 added

harsmarvania57
Ultra Champion

Based on the data you provided, I created the sourcetype below on the indexer. If you are ingesting data via a heavy forwarder, you need to create this props.conf on the heavy forwarder.

props.conf

[test_st]
LINE_BREAKER = }([\r\n]+)
MAX_TIMESTAMP_LOOKAHEAD = 28
SHOULD_LINEMERGE = false
TIME_FORMAT = %a %d %b %H:%M:%S %Z %Y

I then ran the search query I provided earlier, and it extracts the data.
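
If it helps to sanity-check these settings before deploying them, the following Python sketch approximates what the stanza asks Splunk to do. It is a rough model under stated assumptions, not Splunk's actual parser: `split_events` imitates LINE_BREAKER by dropping the captured newlines and leaving the closing brace on the earlier event, and `parse_time` reads the first 28 characters as the timestamp. The sample data, function names, and variable names are hypothetical. The zone token is dropped because Python's `%Z` does not reliably accept "BST", and the month parsing assumes an English locale.

```python
import re
from datetime import datetime

# Hypothetical sample mirroring the layout posted in this thread.
RAW = (
    'Tue 21 Apr 14:16:26 BST 2020\n'
    '{"items":[{"cpu.load": "0.97","number.of.cpus": "8"}]}\n'
    'Tue 21 Apr 14:16:36 BST 2020\n'
    '{"items":[{"cpu.load": "0.84","number.of.cpus": "8"}]}\n'
)

# Same regex and lookahead as the props.conf stanza, with the brace escaped.
LINE_BREAKER = re.compile(r"\}([\r\n]+)")
MAX_TIMESTAMP_LOOKAHEAD = 28


def split_events(raw):
    """Split at each match, discarding only the captured newlines."""
    events, start = [], 0
    for m in LINE_BREAKER.finditer(raw):
        events.append(raw[start:m.start(1)])  # text up to and including "}"
        start = m.end(1)                      # next event begins after the newlines
    if raw[start:].strip():
        events.append(raw[start:])
    return events


def parse_time(event):
    """Parse the leading timestamp, ignoring the weekday and zone tokens."""
    head = event[:MAX_TIMESTAMP_LOOKAHEAD]
    _weekday, day, month, clock, _zone, year = head.split()
    return datetime.strptime(f"{day} {month} {clock} {year}", "%d %b %H:%M:%S %Y")


events = split_events(RAW)
times = [parse_time(e) for e in events]
```

Each resulting event begins with its timestamp line and ends with the closing `}]}` of its JSON payload, which is the single-event layout asked for earlier in the thread.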

Shashank_87
Explorer

@harsmarvania57 That actually worked, thank you. I am getting the timestamp and the JSON in the same event, but the _time field has not been extracted from it. How do I extract the time? I need to plot a graph based on time.

harsmarvania57
Ultra Champion

I can see the time from the raw data in _time; see this screenshot from my lab instance: https://imgur.com/a/bW5T8ok

How are you ingesting the data?
