I am trying to extract key-value pairs from JSON events using the rex command:
mysearch | rex field=_raw max_match=0 "\"(?<Key>\b\w+[^\":]*)\":(?!\s*{\[)\"*(?<Value>(?!\[{|{|\[)[^(,|}|\")]*)"
I have a single-column CSV lookup with all the key names I am interested in:
| inputlookup my_fields_json.csv | fields FieldName
Is there a way to use the lookup to make the regular expression in my rex command dynamic, so that I only extract the fields I am interested in?
I finally came to a workable solution using map:
| inputlookup my_fields_xml.csv
| stats list(FieldName) as FieldName delim="|"
| nomv FieldName
| eval KeyRegex = "\"(?<FieldName>(" + FieldName + "))\":(?!\s*{\[)\"*(?<Value>(?!\[+{|{+|null)[^(,|}|\")]*)"
| fields KeyRegex
| map search="search index=index1
| rex field=_raw max_match=0 $KeyRegex$....."
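For anyone who wants to check the idea outside Splunk, here is a rough Python sketch of what the eval above does: join the wanted key names with "|" and drop that alternation into the key part of the regex. The function names, the field list, and the sample event are made up for illustration, and the pattern is a simplified variant of the one above rather than an exact copy.

```python
import re

def build_key_regex(field_names):
    # Escape each name so characters such as "." are matched literally,
    # then join the names with "|" (the same idea as the stats/nomv step).
    alternation = "|".join(re.escape(name) for name in field_names)
    # After the colon, refuse values that start with { or [ (nested
    # objects/arrays) or that are null; otherwise capture a scalar value,
    # with or without surrounding quotes.
    return re.compile(
        r'"(?P<Key>' + alternation + r')"\s*:'
        r'(?!\s*(?:[\[{]|null\b))\s*"?(?P<Value>[^,}"\]]*)'
    )

def extract_pairs(raw_event, field_names):
    # Return (key, value) tuples for every wanted key found in the raw text.
    pattern = build_key_regex(field_names)
    return [(m.group("Key"), m.group("Value").strip())
            for m in pattern.finditer(raw_event)]

# Hypothetical sample event, loosely based on the structure discussed in this thread.
raw = ('{"Type": "Info", "Host": "SERVER01", "Parameters": '
       '[{"Customer": {"CustomerId": "888000000587", "Name": "sales@abc.com"}}, false]}')

pairs = extract_pairs(raw, ["Host", "CustomerId"])
print(pairs)
```

Because the match runs on the raw text, it finds a key at any nesting depth, but it will also cut off values that contain a comma or an escaped quote, so treat it as a starting point only.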
If the fields are not being extracted from the JSON events, you can use the spath command to extract the field-value pairs and then keep only the fields you need:
index=index | spath | fields foo, bar
If you want to keep only the fields listed in the CSV lookup:
index=index | spath | fields [| inputlookup my_fields_json.csv | fields FieldName | mvcombine delim="," FieldName | nomv FieldName | return $FieldName]
I am working with events that contain nested JSON. Splunk extracts the top-level JSON fields, but there is an array with nested objects whose structure is not consistent, and Splunk does not extract the fields inside it very well. It does extract them, but they appear with names like Parameters{}.Customer.CustomerId, and not all events have the same structure around CustomerId. So I am trying to extract it using regex.
{
"TimeStamp": "2020-03-09 12:01:39.451",
"Type": "Info",
"Message": "Some message",
"Host": "SERVER01", ,
"Parameters": [{
"Customer": {
"CusmerId": "888000000587",
"Name": "sales@abc.com",
}
}, false]
}
Why do you need to rex a JSON? Splunk should be parsing those for you. Maybe this will help:
https://answers.splunk.com/answers/556279/why-would-indexed-extractionsjson-in-propsconf-be.html
The reason I am trying to parse JSON using regex is that I have nested JSON objects with a dynamic structure. I would like to be able to find all key-value pairs in the events regardless of their depth in the raw JSON. Splunk is parsing those objects, but as I said, they have a dynamic structure and do not have a consistent object hierarchy.
For example:
Parent{}.Customer.RelationshipId
Parent{}.RelationshipId
Parent{}.Order.Customer.RelationshipId and so on.
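To make the "same key at different depths" problem concrete, here is a small Python sketch (not Splunk code) that walks a parsed event and reports every occurrence of one key, together with a path written in the same Parent{}.x.y style shown above. The function name and the sample values A1, B2, and C3 are hypothetical, and it assumes the event is valid JSON, which the regex approach does not require.

```python
import json

def find_key(node, key, path=""):
    """Yield (path, value) for every occurrence of key at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            child_path = f"{path}.{name}" if path else name
            if name == key:
                yield child_path, value
            # Keep descending: the same key may also appear further down.
            yield from find_key(value, key, child_path)
    elif isinstance(node, list):
        # Mark array levels with {} to mirror the field names listed above.
        for item in node:
            yield from find_key(item, key, path + "{}")

# Hypothetical event with the three shapes listed above.
event = json.loads(
    '{"Parent": ['
    '{"Customer": {"RelationshipId": "A1"}}, '
    '{"RelationshipId": "B2"}, '
    '{"Order": {"Customer": {"RelationshipId": "C3"}}}'
    ']}'
)

matches = list(find_key(event, "RelationshipId"))
for found_path, found_value in matches:
    print(found_path, "=", found_value)
```

All three values come back under one key name no matter where they sit in the hierarchy, which is the behaviour I am after with the regex.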