Splunk Search

Extract access_combined from JSON msg field

johnansett
Communicator

Hello! I have JSON events coming from Pivotal Cloud Foundry. Included in the JSON is the 'msg' field, which contains what looks like an access_combined event:

{
     cf_app_id:  caf9c86b-8672-48b8-90eb-d04b96e36cf3   
     cf_app_name:    app-name-1
     cf_ignored_app:     false  
     cf_org_id:  82e3a4a8-a40c-48bc-82e8-488acd0976ce   
     cf_org_name:    orgname    
     cf_origin:  firehose   
     cf_space_id:    df0e696d-93ca-4c69-ba91-e69ae8d2ab15   
     cf_space_name:  qa-web 
     deployment:     p-isolation-segment-f2a8ba4dfa4dca195b26   
     event_type:     LogMessage 
     ip:     10.1.1.1   
     job:    isolated_router    
     job_index:  fb34ddbe-0f9d-4645-847b-17d833fee1b1   
     message_type:   OUT    
     msg:    app1.company.com - [2019-06-03T18:42:33.399+0000] "PUT /api/updateQueue HTTP/1.1" 200 3394 3394 "https://qaapps2.company.com/app/app2/queue/venmie" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/74.0.3729.169 Safari/537.36" "10.10.10.254:43698" "10.10.10.24:61044" x_forwarded_for:"10.10.10.254, 10.10.10.254" x_forwarded_proto:"https" vcap_request_id:"4cedcf4b-8651-4a55-692d-77a25b790381" response_time:1.801544121 app_id:"caf9c86b-8672-48b8-90eb-d04b96e36cf3" app_index:"0" x_b3_traceid:"a7aadd84ddbc3b8c" x_b3_spanid:"a7aadd84ddbc3b8c" x_b3_parentspanid:"-"

     origin:     gorouter   
     source_instance:    1  
     source_type:    RTR    
     timestamp:  1559587355201956600    
}

I would like to expand out the 'msg' field and then extract the fields from it, e.g. status (200) and URI (https://qaapps2.company.com/app/app2/queue/venmie). Ideally I'd like Splunk to do this automatically, as it already has these fields defined for sourcetype=access_combined.

What options do I have to do this?

Thanks!

martynoconnor
Communicator

Hi there,

While you are right that there is an out-of-the-box sourcetype definition for access_combined, this data will most likely not be arriving in Splunk under that sourcetype (and even if it were, the field extractions wouldn't work, because the msg content is not actually in access_combined format). You can still extract the fields, however, either with custom EXTRACT-name definitions in props.conf (the better way to do things if you have, or will have, more than one indexer) or in the search itself, using the rex command to extract the fields at search time.

An example of the search-time approach might be something like:

index=<your_index> sourcetype=<your_sourcetype>
| rex field=<the_field_to_extract_from> "<the_regular_expression_with_named_capture_group>"
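
To make the template above concrete, here is a rough sketch written against the sample msg value from the question. It assumes the JSON fields (msg, event_type, source_type) are already extracted at search time; if they are not, adding | spath path=msg before the rex should produce the msg field. The capture group names (request_host, req_time, method, uri, protocol, status, bytes_in, bytes_out, referer, useragent) are my own choices, loosely mirroring access_combined, and the meaning of the two numbers after the status code is an assumption. Note that in the sample, the quoted URL after the byte counts lands in referer, while the request path /api/updateQueue lands in uri.

```
index=<your_index> sourcetype=<your_sourcetype> event_type=LogMessage source_type=RTR
| rex field=msg "^(?<request_host>\S+) - \[(?<req_time>[^\]]+)\] \"(?<method>\S+) (?<uri>\S+) (?<protocol>[^\"]+)\" (?<status>\d+) (?<bytes_in>\d+) (?<bytes_out>\d+) \"(?<referer>[^\"]*)\" \"(?<useragent>[^\"]*)\""
| table _time request_host method uri status referer useragent
```

If this works for your data, the same regular expression (without the backslashes in front of the double quotes, which are only needed inside the quoted rex string) could be moved into props.conf under your sourcetype's stanza as EXTRACT-<name> = <regex> in msg, so the fields appear without the rex in every search. Treat the pattern as a starting point and test it against a range of real events first.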