Splunk Search

How to extract values from my sample log?

rajgowd1
Communicator

Hi
Can you help us extract values such as ACTION, URI, and response_time from the log below?

I used extract kvdelim=":" pairdelim="," but it does not extract response_time.

ACTION=DELETE,POST,GET etc
URI's = endpoints

<6>2017-01-23T19:17:45Z v204vtn756h doppler[19]: {"cf_app_id":"012b7380-c96c-46e6-a57e-b96fd1f7266c","cf_app_name":"nam-ccp-psg-sit","cf_ignored_app":false,"cf_org_id":"fd12558e-ddaf-4dd2-91b3-85f28ccd27f3","cf_org_name":"NAM-US-CCP","cf_origin":"firehose","cf_space_id":"f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b","cf_space_name":"lab","deployment":"cf","event_type":"LogMessage","ip":"168.72.186.232","job":"router-partition-ee9c6bad3843f162447f","job_index":"1","level":"info","message_type":"OUT","msg":"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - [23/01/2017:19:17:45 +0000] \"POST /public/sso/keepalive HTTP/1.1\" 200 0 0 \"-\" \"Apache-HttpClient/4.1.1 (java 1.5)\" 153.40.245.130:15583 x_forwarded_for:\"169.193.222.122\" x_forwarded_proto:\"http\" vcap_request_id:896fa122-a994-4ec1-6ac0-1af149ef9580 response_time:0.041984457 app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-23T19:17:45Z","timestamp":1485199065878351999}

<6>2017-01-23T19:17:45Z 2ejr1t83au3 doppler[19]: {"cf_app_id":"3e0f31ee-f09c-46bf-a072-baef9e0c7763","cf_app_name":"nam-ccp-eureka-lab","cf_ignored_app":false,"cf_org_id":"dfeebb94-7a1c-4889-aa76-bb77852e434d","cf_org_name":"NAM-US-CCP","cf_origin":"firehose","cf_space_id":"b2abf80f-0543-4578-88d2-e7222f3d7b70","cf_space_name":"LAB","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.254","job":"router-partition-a2833c853cfafee70104","job_index":"1","level":"info","message_type":"OUT","msg":"nam-ccp-eureka-lab.cfapps-gcg-gtdc1.citipaas-dev.dyn.nsroot.net - [23/01/2017:19:17:45 +0000] \"GET /eureka/apps/delta HTTP/1.1\" 200 0 89 \"-\" \"Java-EurekaClient/v1.4.6\" 153.40.245.130:46769 x_forwarded_for:\"168.72.205.134\" x_forwarded_proto:\"http\" vcap_request_id:4a46950c-7d18-4bed-7c98-833891c3358c response_time:0.001204662 app_id:3e0f31ee-f09c-46bf-a072-baef9e0c7763\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-23T19:17:45Z","timestamp":1485199065824270851}

DalJeanis
Legend

Take your extract and put this after it

| head 5 
| rex field=_raw "(?<source>{[^}]*})"
| spath input=source

Look at the output fields and tell me what you see.

DalJeanis
Legend

On my system, it successfully extracted these values:

cf_app_id   012b7380-c96c-46e6-a57e-b96fd1f7266c
cf_app_name nam-ccp-psg-sit
cf_ignored_app  FALSE
cf_org_id   fd12558e-ddaf-4dd2-91b3-85f28ccd27f3
cf_org_name NAM-US-CCP
cf_origin   firehose
cf_space_id f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b
cf_space_name   lab
deployment  cf
event_type  LogMessage

That's not all the fields you need, but I need to know whether your system operates as mine does, or if there's another issue as well.

rajgowd1
Communicator

The fields you mentioned above are already extracted in our Splunk instance.
In particular, I was looking for these key-value pairs:

ACTION=POST
URI=/public/sso/keepalive
response_time=0.041984457

rajgowd1
Communicator

I see these after running the search below:

myindex| cf_org_name="" cf_space_name="" cf_app_name="" | head 5| rex field=_raw "(?<test>{[^}]*})" | spath input=test| top limit=20 test

{"cf_app_id":"ffbf3337-4e42-4cba-8fc7-b803c780e245","cf_app_name":"nam-ccp-fintech-idssink","cf_ignored_app":false,"cf_org_id":"67caccf2-a9f9-4a75-ae14-29f853f34c66","cf_org_name":"NAM-US-FINTECH","cf_origin":"firehose","cf_space_id":"ca745890-35a1-4e50-9043-688635d00f81","cf_space_name":"CCP-SIT4","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.52","job":"diego_cell-partition-3d73afa5a8e5acc6f4c1","job_index":"5","level":"info","message_type":"OUT","msg":"Exit status 0","origin":"rep","source_instance":"0","source_type":"HEALTH","time":"2017-01-25T18:43:28Z","timestamp":1485369808983251121}

{"cf_app_id":"b77b7b3b-5bad-44f9-8cfd-14b28cd6f6ba","cf_app_name":"CCP-EUREKA-DEV2","cf_ignored_app":false,"cf_org_id":"67caccf2-a9f9-4a75-ae14-29f853f34c66","cf_org_name":"NAM-US-FINTECH","cf_origin":"firehose","cf_space_id":"100f814a-2e29-43f8-8f05-3b52ce7a8a94","cf_space_name":"CARDS-MS-DEV2","deployment":"cf","event_type":"LogMessage","ip":"168.72.205.254","job":"router-partition-a2833c853cfafee70104","job_index":"1","level":"info","message_type":"OUT","msg":"ccp-eureka-dev2.cfapps-gcg-gtdc1.citipaas-dev.dyn.nsroot.net - [25/01/2017:18:43:28 +0000] \"POST /eureka/peerreplication/batch/ HTTP/1.1\" 200 224 37 \"-\" \"Java-EurekaClient-Replication/v1.4.6\" 153.40.245.130:46346 x_forwarded_for:\"168.72.205.77\" x_forwarded_proto:\"http\" vcap_request_id:60cfd3ed-b13c-4abe-6eb3-4d2902656ead response_time:0.001880901 app_id:b77b7b3b-5bad-44f9-8cfd-14b28cd6f6ba\n","origin":"router__1","source_instance":"1","source_type":"RTR","time":"2017-01-25T18:43:28Z","timestamp":1485369808995466909}

{"cf_app_id":"46486cba-6d5a-4fe3-9d0d-01b7d5f24d53","cf_app_name":"crs-fcom-contserv-plat","cf_ignored_app":false,"cf_org_id":"0f0d0b56-fff2-48e8-9cf4-8d0c1259e910","cf_org_name":"NAM-US-CRS","cf_origin":"firehose","cf_space_id":"37a1eda4-58ca-4a43-852a-ad77752a3227","cf_space_name":"SIT3","deployment":"cf","event_type":"LogMessage","ip":"168.72.186.89","job":"diego_cell-partition-ee9c6bad3843f162447f","job_index":"22","level":"info","message_type":"OUT","msg":"DEBUG [l-4626-thread-7] c.c.ccp.localcache.impl.CacheMap c.c.c.l.i.CacheMap.put(CacheMap.java:52) - |||||||||111#C#459","origin":"rep","source_instance":"0","source_type":"APP","time":"2017-01-25T18:43:28Z","timestamp":1485369808996339080}

rajgowd1
Communicator

I got these fields:

cf_app_id
cf_app_name
cf_org_name
cf_ignored_app
cf_org_id
cf_origin
cf_session_id
cf_space_id
cf_space_name
deployment
event_type
ip
job
job_index
level
message_type
msg
origin
source_instance
source_type

harishhari390
New Member

Hi, I am also looking for a solution similar to what you discussed. Did you find out how to achieve this?

DalJeanis
Legend

Please post what you DO see, NOT what you don't.

I can't figure out where your code is breaking if I don't know what your code is doing right.

somesoni2
Revered Legend

Try something like this. The data that you need is in the msg field of the embedded JSON data.

your base search | rex "msg\":\"([^\"]+)\"(?<Action>\w+)\s+(?<URI>\S+).+response_time:(?<response_time>\S+)" 
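
For anyone who wants to check the pattern outside Splunk, here is a minimal Python sketch that applies the same regex to a shortened copy of the first sample event. It is only an approximation of what rex does: Python spells named groups `(?P<name>...)`, the event is trimmed for readability, and the variable names (`raw`, `pattern`, `fields`) are my own.

```python
import re

# Shortened copy of the first sample event from the question. In the raw
# event the quotes inside "msg" are backslash-escaped, so "\\" is used here
# to keep a literal backslash in the Python string.
raw = (
    '<6>2017-01-23T19:17:45Z v204vtn756h doppler[19]: '
    '{"level":"info","message_type":"OUT",'
    '"msg":"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - '
    '[23/01/2017:19:17:45 +0000] \\"POST /public/sso/keepalive HTTP/1.1\\" '
    '200 0 0 \\"-\\" \\"Apache-HttpClient/4.1.1 (java 1.5)\\" '
    '153.40.245.130:15583 response_time:0.041984457 '
    'app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\\n",'
    '"origin":"router__1"}'
)

# Same structure as the rex above: skip from msg":" to the first escaped
# quote, then capture the HTTP method, the URI, and the response time.
pattern = re.compile(
    r'msg":"([^"]+)"(?P<Action>\w+)\s+(?P<URI>\S+)'
    r'.+response_time:(?P<response_time>\S+)'
)

match = pattern.search(raw)
fields = match.groupdict() if match else {}
print(fields)
```

If `fields` comes back empty for one of your own events, that event's msg probably is not a router access-log line (for example the HEALTH and APP events), so the pattern has nothing to match.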

rajgowd1
Communicator

I just tried it, but the extracted fields are not showing in the left sidebar.

somesoni2
Revered Legend

How about this?

your base search | rex "msg\":\"([^\"]+)\"(?<Action>\w+)\s+(?<URI>\S+)" | rex "response_time:(?<response_time>\S+)"

DalJeanis
Legend

You are looking for the spath command, which pulls data out of JSON format. Here's a sample. The first two lines were how I put your first sample event into the system. The third line pulls the JSON data out of the _raw event into a field named source, and the last line decodes the JSON data.

| makeresults
| eval _raw=
"\<6\>2017-01-23T19:17:45Z v204vtn756h doppler\[19\]: {\"cf_app_id\":\"012b7380-c96c-46e6-a57e-b96fd1f7266c\",\"cf_app_name\":\"nam-ccp-psg-sit\",\"cf_ignored_app\":false,\"cf_org_id\":\"fd12558e-ddaf-4dd2-91b3-85f28ccd27f3\",\"cf_org_name\":\"NAM-US-CCP\",\"cf_origin\":\"firehose\",\"cf_space_id\":\"f9e2c3b9-ff7a-46b2-b359-9ec4ec13487b\",\"cf_space_name\":\"lab\",\"deployment\":\"cf\",\"event_type\":\"LogMessage\",\"ip\":\"168.72.186.232\",\"job\":\"router-partition-ee9c6bad3843f162447f\",\"job_index\":\"1\",\"level\":\"info\",\"message_type\":\"OUT\",\"msg\":\"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - \[23/01/2017:19:17:45 \+0000\] \\\"POST /public/sso/keepalive HTTP/1.1\\\" 200 0 0 \\\"-\\\" \\\"Apache-HttpClient/4.1.1 (java 1.5)\\\" 153.40.245.130:15583 x_forwarded_for:\\\"169.193.222.122\\\" x_forwarded_proto:\\\"http\\\" vcap_request_id:896fa122-a994-4ec1-6ac0-1af149ef9580 response_time:0.041984457 app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\\n\",\"origin\":\"router__1\",\"source_instance\":\"1\",\"source_type\":\"RTR\",\"time\":\"2017-01-23T19:17:45Z\",\"timestamp\":1485199065878351999}"

| rex field=_raw "(?<source>{[^}]*})"
| spath input=source

Judging from the output, there may be some issue either with your JSON data or with my manual escaping of the special characters, after message_type and before msg.

Try those last two lines against your input, and see if they work. If not, then I'll have to debug your JSON data.
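
As a rough illustration of what those two lines do, here is a minimal Python sketch of the same two steps: cut the `{...}` payload out of the raw line, then decode it as JSON. The event is shortened, and the last step (a regex over the decoded msg text) is my own assumption about how to reach ACTION, URI, and response_time; it is not something spath does for you.

```python
import json
import re

# Shortened copy of the first sample event (syslog prefix + JSON payload).
raw = (
    '<6>2017-01-23T19:17:45Z v204vtn756h doppler[19]: '
    '{"cf_app_name":"nam-ccp-psg-sit","event_type":"LogMessage",'
    '"msg":"nam-ccp-psg-sit.cfapps-gcg-nonprd.nam.nsroot.net - '
    '[23/01/2017:19:17:45 +0000] \\"POST /public/sso/keepalive HTTP/1.1\\" '
    '200 0 0 \\"-\\" response_time:0.041984457 '
    'app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c\\n",'
    '"source_type":"RTR"}'
)

# Step 1 (like the rex): take everything from the first "{" to the first "}".
# This only works while the payload contains no nested or quoted braces.
payload = re.search(r'\{[^}]*\}', raw).group(0)

# Step 2 (like spath): decode the JSON so every key becomes a field.
event = json.loads(payload)

# Step 3 (assumption): msg is a router access-log line, so a second regex
# pulls the method, path, and response time out of the decoded text.
msg_pattern = re.compile(
    r'"(?P<ACTION>[A-Z]+) (?P<URI>\S+) HTTP/[\d.]+"'
    r'.*?response_time:(?P<response_time>\S+)'
)
m = msg_pattern.search(event["msg"])
extracted = m.groupdict() if m else {}
print(event["cf_app_name"], extracted)
```

The point of decoding first is that the quotes in msg are no longer backslash-escaped, so the second pattern can be written against the plain access-log text.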

rajgowd1
Communicator

Hi,
The output I pasted is from the Splunk log.
From that output, I would like to extract:

ACTION=POST
URI=/public/sso/keepalive
response_time=0.041984457

DalJeanis
Legend

So try those last two lines and see which values get extracted.

 | rex field=_raw "(?<source>{[^}]*})"
 | spath input=source

rajgowd1
Communicator

Thanks DalJeanis, I tried it, but it's not working.

I was able to get it from field extractions. Here it is:
rex field=_raw "response_time:(?P<response_time>[^ ]+)"
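
A quick way to sanity-check that kind of pattern is a minimal Python sketch like the one below, assuming the capture group is named response_time. The two fragments are cut from the sample events in the question; `fragments`, `pattern`, and `times` are illustrative names only.

```python
import re

# Fragments taken from the two sample events in the question.
fragments = [
    'vcap_request_id:896fa122-a994-4ec1-6ac0-1af149ef9580 '
    'response_time:0.041984457 app_id:012b7380-c96c-46e6-a57e-b96fd1f7266c',
    'vcap_request_id:4a46950c-7d18-4bed-7c98-833891c3358c '
    'response_time:0.001204662 app_id:3e0f31ee-f09c-46bf-a072-baef9e0c7763',
]

# The named group is what gives the extracted value a field name.
pattern = re.compile(r'response_time:(?P<response_time>[^ ]+)')

times = []
for text in fragments:
    m = pattern.search(text)
    if m:
        # Convert to float so the values can be averaged or compared.
        times.append(float(m.group('response_time')))

print(times)
```

Because `[^ ]+` stops at the first space, the value ends right before app_id, which is why this works without decoding the JSON at all.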

DalJeanis
Legend

As long as you don't need any of the rest of the JSON data, that's a better way to go.


No, after re-reading your question, it isn't. If you want all the fields to be available, then you need to unpack that JSON data.

I'll post a new answer and we'll work from there.

rajgowd1
Communicator

I'm not sure whether this question was posted properly in the forum.
