Extract access_combined from JSON msg field

johnansett · ‎06-03-2019

Hello! I have JSON events coming from Pivotal Cloud Foundry. Included in the JSON is the 'msg' field, which contains what looks like an access_combined event:

{
     cf_app_id:  caf9c86b-8672-48b8-90eb-d04b96e36cf3   
     cf_app_name:    app-name-1
     cf_ignored_app:     false  
     cf_org_id:  82e3a4a8-a40c-48bc-82e8-488acd0976ce   
     cf_org_name:    orgname    
     cf_origin:  firehose   
     cf_space_id:    df0e696d-93ca-4c69-ba91-e69ae8d2ab15   
     cf_space_name:  qa-web 
     deployment:     p-isolation-segment-f2a8ba4dfa4dca195b26   
     event_type:     LogMessage 
     ip:     10.1.1.1   
     job:    isolated_router    
     job_index:  fb34ddbe-0f9d-4645-847b-17d833fee1b1   
     message_type:   OUT    
     msg:    app1.company.com - [2019-06-03T18:42:33.399+0000] "PUT /api/updateQueue HTTP/1.1" 200 3394 3394 "https://qaapps2.company.com/app/app2/queue/venmie" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/74.0.3729.169 Safari/537.36" "10.10.10.254:43698" "10.10.10.24:61044" x_forwarded_for:"10.10.10.254, 10.10.10.254" x_forwarded_proto:"https" vcap_request_id:"4cedcf4b-8651-4a55-692d-77a25b790381" response_time:1.801544121 app_id:"caf9c86b-8672-48b8-90eb-d04b96e36cf3" app_index:"0" x_b3_traceid:"a7aadd84ddbc3b8c" x_b3_spanid:"a7aadd84ddbc3b8c" x_b3_parentspanid:"-"

     origin:     gorouter   
     source_instance:    1  
     source_type:    RTR    
     timestamp:  1559587355201956600    
}

I would like to expand out the 'msg' field and then extract the fields inside it, e.g. status (200), URI (https://qaapps2.company.com/app/app2/queue/venmie). Ideally I'd like Splunk to do this automatically, as it already has these fields defined for sourcetype=access_combined.

What options do I have to do this?

Thanks!

martynoconnor · ‎06-03-2019

Hi there,

You are right that there is an out-of-the-box sourcetype definition for access_combined, but this data will likely be coming into Splunk under a different sourcetype (and even if it were access_combined, the field extractions wouldn't work, because the event as a whole is not actually in access_combined format). You can still extract the fields, either with custom EXTRACT-name definitions in props.conf (the better way to do things if you have, or will have, more than one indexer), or in the search itself, using the rex command to extract the fields at search time.

An example of the rex approach might look something like this:

index=<your_index> sourcetype=<your_sourcetype>
| rex field=<the_field_to_extract_from> "<the_regular_expression_with_named_capture_group>"
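
To make that template more concrete, here is a rough sketch against the sample event from the question. It assumes the JSON is already being parsed at search time so that a field called msg exists, and that every msg value follows the same layout as the one posted. The capture group names (request_host, uri_path, bytes_received and so on) are just names I picked for illustration, and I am guessing that the two numbers after the status code are bytes received and bytes sent, so rename things to suit your data:

```
index=<your_index> sourcetype=<your_sourcetype>
| rex field=msg "^\s*(?<request_host>\S+) - \[(?<request_time>[^\]]+)\] \"(?<method>\S+) (?<uri_path>\S+) (?<protocol>[^\"]+)\" (?<status>\d{3}) (?<bytes_received>\d+) (?<bytes_sent>\d+) \"(?<referer>[^\"]*)\" \"(?<useragent>[^\"]*)\""
| rex field=msg "response_time:(?<response_time>[\d.]+)"
| table request_host method uri_path status referer useragent response_time
```

For the sample event this should give status=200, method=PUT, uri_path=/api/updateQueue and the https://qaapps2.company.com/... address as referer. Events whose msg does not match the pattern will simply not get the fields, so it is worth checking a few different events before relying on it.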
