Splunk Search

How to extract values from my sample log?

rajgowd1
Communicator

Hi,
Can you help us extract values such as ACTION, URI, and response_time from the log below?

I used extract kvdelim=":" pairdelim="," but it does not extract response_time.

ACTION = DELETE, POST, GET, etc.
URI = endpoints

<6>2017-01-23T19:17:45Z v204vtn756h doppler[19]: {"cf_app_id":"012b7380-c96c-46e6-a57e-b96fd1f7266c","cf_app_name":"nam-ccp-psg-sit","cf_ignored_app":false,"cf_org_id":"fd12558e-ddaf-4dd2-91b3-85f28ccd27f3","cf_org_name":"NAM-US-CCP","cf_origin":"firehose","cf_space_id":"f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b","cf_space_name":"lab","deployment":"cf","event_type":"LogMessage","ip":"168.72.186.232","job":"router-partition-ee9c6bad3843f162447f","job_index":"1","level":"info","message_type":"OUT","msg":"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - [23/01/2017:19:17:45 +0000] \"POST /public/sso/keepalive HTTP/1.1\" 200 0 0 \"-\" \"Apache-HttpClient/4.1.1 (java 1.5)\" 153.40.245.130:15583 x_forwarded_for:\"169.193.222.122\" x_forwarded_proto:\"http\" vcap_request_id:896fa122-a994-4ec1-6ac0-1af149ef9580 response_time:0.041984457 app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-23T19:17:45Z","timestamp":1485199065878351999}

<6>2017-01-23T19:17:45Z 2ejr1t83au3 doppler[19]: {"cf_app_id":"3e0f31ee-f09c-46bf-a072-baef9e0c7763","cf_app_name":"nam-ccp-eureka-lab","cf_ignored_app":false,"cf_org_id":"dfeebb94-7a1c-4889-aa76-bb77852e434d","cf_org_name":"NAM-US-CCP","cf_origin":"firehose","cf_space_id":"b2abf80f-0543-4578-88d2-e7222f3d7b70","cf_space_name":"LAB","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.254","job":"router-partition-a2833c853cfafee70104","job_index":"1","level":"info","message_type":"OUT","msg":"nam-ccp-eureka-lab.cfapps-gcg-gtdc1.citipaas-dev.dyn.nsroot.net - [23/01/2017:19:17:45 +0000] \"GET /eureka/apps/delta HTTP/1.1\" 200 0 89 \"-\" \"Java-EurekaClient/v1.4.6\" 153.40.245.130:46769 x_forwarded_for:\"168.72.205.134\" x_forwarded_proto:\"http\" vcap_request_id:4a46950c-7d18-4bed-7c98-833891c3358c response_time:0.001204662 app_id:3e0f31ee-f09c-46bf-a072-baef9e0c7763\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-23T19:17:45Z","timestamp":1485199065824270851}

DalJeanis
Legend

Take your extract and put this after it:

| head 5 
| rex field=_raw "(?<source>{[^}]*})"
| spath input=source

Look at the output fields and tell me what you see.


DalJeanis
Legend

On my system, it successfully extracted these values:

cf_app_id   012b7380-c96c-46e6-a57e-b96fd1f7266c
cf_app_name nam-ccp-psg-sit
cf_ignored_app  FALSE
cf_org_id   fd12558e-ddaf-4dd2-91b3-85f28ccd27f3
cf_org_name NAM-US-CCP
cf_origin   firehose
cf_space_id f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b
cf_space_name   lab
deployment  cf
event_type  LogMessage

That's not all the fields you need, but I need to know whether your system operates as mine does, or if there's another issue as well.


rajgowd1
Communicator

The fields you mentioned above are already extracted in Splunk.
Specifically, I am looking for these key-value pairs:

ACTION=POST
URI=/public/sso/keepalive
response_time=0.041984457


rajgowd1
Communicator

I see these after running the search below:

myindex | cf_org_name="" cf_space_name="" cf_app_name="" | head 5 | rex field=_raw "(?<test>{[^}]*})" | spath input=test | top limit=20 test

{"cf_app_id":"ffbf3337-4e42-4cba-8fc7-b803c780e245","cf_app_name":"nam-ccp-fintech-idssink","cf_ignored_app":false,"cf_org_id":"67caccf2-a9f9-4a75-ae14-29f853f34c66","cf_org_name":"NAM-US-FINTECH","cf_origin":"firehose","cf_space_id":"ca745890-35a1-4e50-9043-688635d00f81","cf_space_name":"CCP-SIT4","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.52","job":"diego_cell-partition-3d73afa5a8e5acc6f4c1","job_index":"5","level":"info","message_type":"OUT","msg":"Exit status 0","origin":"rep","source_instance":"0","source_type":"HEALTH","time":"2017-01-25T18:43:28Z","timestamp":1485369808983251121}

{"cf_app_id":"b77b7b3b-5bad-44f9-8cfd-14b28cd6f6ba","cf_app_name":"CCP-EUREKA-DEV2","cf_ignored_app":false,"cf_org_id":"67caccf2-a9f9-4a75-ae14-29f853f34c66","cf_org_name":"NAM-US-FINTECH","cf_origin":"firehose","cf_space_id":"100f814a-2e29-43f8-8f05-3b52ce7a8a94","cf_space_name":"CARDS-MS-DEV2","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.254","job":"router-partition-a2833c853cfafee70104","job_index":"1","level":"info","message_type":"OUT","msg":"ccp-eureka-dev2.cfapps-gcg-gtdc1.citipaas-dev.dyn.nsroot.net - [25/01/2017:18:43:28 +0000] \"POST /eureka/peerreplication/batch/ HTTP/1.1\" 200 224 37 \"-\" \"Java-EurekaClient-Replication/v1.4.6\" 153.40.245.130:46346 x_forwarded_for:\"168.72.205.77\" x_forwarded_proto:\"http\" vcap_request_id:60cfd3ed-b13c-4abe-6eb3-4d2902656ead response_time:0.001880901 app_id:b77b7b3b-5bad-44f9-8cfd-14b28cd6f6ba\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-25T18:43:28Z","timestamp":1485369808995466909}

{"cf_app_id":"46486cba-6d5a-4fe3-9d0d-01b7d5f24d53","cf_app_name":"crs-fcom-contserv-plat","cf_ignored_app":false,"cf_org_id":"0f0d0b56-fff2-48e8-9cf4-8d0c1259e910","cf_org_name":"NAM-US-CRS","cf_origin":"firehose","cf_space_id":"37a1eda4-58ca-4a43-852a-ad77752a3227","cf_space_name":"SIT3","deployment":"cf","event_type":"LogMessage","ip":"168.72.186.89","job":"diego_cell-partition-ee9c6bad3843f162447f","job_index":"22","level":"info","message_type":"OUT","msg":"DEBUG [l-4626-thread-7] c.c.ccp.localcache.impl.CacheMap c.c.c.l.i.CacheMap.put(CacheMap.java:52) - |||||||||111#C#459","origin":"rep","source_instance":"0","source_type":"APP","time":"2017-01-25T18:43:28Z","timestamp":1485369808996339080}


rajgowd1
Communicator

I got these fields:

cf_app_id
cf_app_name
cf_org_name
cf_ignored_app
cf_org_id
cf_origin
cf_session_id
cf_space_id
cf_space_name
deployment
event_type
ip
job
job_index
level
message_type
msg
origin
source_instance
source_type


harishhari390
New Member

Hi, I am also looking for a solution similar to what you discussed. Did you find out how to achieve this?


DalJeanis
Legend

Please post what you DO see, NOT what you don't.

I can't figure out where your code is breaking if I don't know what your code is doing right.


somesoni2
Revered Legend

Try something like this. The data you need is in the msg field of the embedded JSON data.

your base search | rex "msg\":\"([^\"]+)\"(?<Action>\w+)\s+(?<URI>\S+).+response_time:(?<response_time>\S+)" 

rajgowd1
Communicator

I just tried it, but it is not showing the extracted fields on the left side.


somesoni2
Revered Legend

How about this?

your base search | rex "msg\":\"[^\"]+\"(?<Action>\w+)\s+(?<URI>\S+)" | rex "response_time:(?<response_time>\S+)"

DalJeanis
Legend

You are looking for the spath command, which pulls data out of JSON format. Here's a sample. The first two lines were how I put your first sample event into the system. The third line pulls the JSON data out of the _raw event into a field named source, and the last line decodes the JSON data.

| makeresults
| eval _raw=
"\<6\>2017-01-23T19:17:45Z v204vtn756h doppler\[19\]: {\"cf_app_id\":\"012b7380-c96c-46e6-a57e-b96fd1f7266c\",\"cf_app_name\":\"nam-ccp-psg-sit\",\"cf_ignored_app\":false,\"cf_org_id\":\"fd12558e-ddaf-4dd2-91b3-85f28ccd27f3\",\"cf_org_name\":\"NAM-US-CCP\",\"cf_origin\":\"firehose\",\"cf_space_id\":\"f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b\",\"cf_space_name\":\"lab\",\"deployment\":\"cf\",\"event_type\":\"LogMessage\",\"ip\":\"168.72.186.232\",\"job\":\"router-partition-ee9c6bad3843f162447f\",\"job_index\":\"1\",\"level\":\"info\",\"message_type\":\"OUT\",\"msg\":\"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - \[23/01/2017:19:17:45 \+0000\] \\\"POST /public/sso/keepalive HTTP/1.1\\\" 200 0 0 \\\"-\\\" \\\"Apache-HttpClient/4.1.1 (java 1.5)\\\" 153.40.245.130:15583 x_forwarded_for:\\\"169.193.222.122\\\" x_forwarded_proto:\\\"http\\\" vcap_request_id:896fa122-a994-4ec1-6ac0-1af149ef9580 response_time:0.041984457 app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\\n\",\"origin\":\"router__1\",\"source_instance\":\"1\",\"source_type\":\"RTR\",\"time\":\"2017-01-23T19:17:45Z\",\"timestamp\":1485199065878351999}"

| rex field=_raw "(?<source>{[^}]*})"
| spath input=source

Judging from the output, there may be some issue either with your JSON data or with my manual escaping of the special characters, after message_type and before msg.

Try those last two lines against your input, and see if they work. If not, then I'll have to debug your JSON data.
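
Once msg is available as a field (it appears in the field list posted earlier in this thread, or it can be produced by the spath step above), the three values can be pulled out of it with rex. A minimal sketch, assuming msg holds the router access-log line exactly as in the sample events; the field names Action and URI are only illustrative:

```
your base search
| rex field=msg "\"(?<Action>[A-Z]+)\s+(?<URI>\S+)\s+HTTP/[\d.]+\""
| rex field=msg "response_time:(?<response_time>\S+)"
| table Action URI response_time
```

Events whose msg is not a router request line (for example the HEALTH and APP events in the samples) simply will not match, so the three fields stay empty for them.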


rajgowd1
Communicator

Hi,
The output I pasted is from the Splunk log.
From that output, I would like to extract:

ACTION=POST
URI=/public/sso/keepalive
response_time=0.041984457


DalJeanis
Legend

So try those last two lines and see which values get extracted.

 | rex field=_raw "(?<source>{[^}]*})"
 | spath input=source

rajgowd1
Communicator

Thanks DalJeanis, I tried it but it's not working.

I was able to get it from field extractions. Here it is:
rex field=_raw "response_time:(?P<response_time>[^ ]+)"
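
The same approach can probably be extended to the other two values. A minimal sketch, assuming the raw event contains the request line in the form "POST /path HTTP/1.1" as in the samples above; the field names ACTION and URI are only illustrative:

```
your base search
| rex field=_raw "\"(?<ACTION>[A-Z]+) (?<URI>\S+) HTTP/"
| rex field=_raw "response_time:(?<response_time>[^ ]+)"
| table ACTION URI response_time
```

The first rex looks for a double quote followed by an upper-case word, a space, a path, and " HTTP/", which in the sample events only occurs at the request line inside msg.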


DalJeanis
Legend

As long as you don't need any of the rest of the JSON data, that's a better way to go.


No, after re-reading your question, it isn't. If you want all the fields to be available, then you need to unpack that JSON data.

I'll post a new answer and we'll work from there.


rajgowd1
Communicator

I'm not sure whether this question was posted properly in the forum.
