How to extract fields with multiple delimiters?

sureshchinta
Explorer

In the following sample log statement:

May  5 13:23:25 172.29.196.32 May 05 13:23:24 Production_EXT_P1 [0x80000001][xsltmsg][notice] mpgw(cfw): trans(7649887)[response][206.201.102.59] gtid(7649887): clientCN:interface2018.meddata.org|version:2|HTTPVerb:GET|inTime:2015-05-05T13:23:24.045|uri:/member/alerts?&memberID=12314289&offset=0&limit=100&numberOfDays=365|reqLatency:3|appLatency:658|resLatency:0|httpCode:200

The log data from clientCN onward has been extracted into a field, say DX_XSLTLog, so the DX_XSLTLog field contains the following:

clientCN:interface2018.meddata.org|version:2|HTTPVerb:GET|inTime:2015-05-05T13:23:24.045|uri:/member/alerts?&memberID=12314289&offset=0&limit=100&numberOfDays=365|reqLatency:3|appLatency:658|resLatency:0|httpCode:200

My questions are:

  • How do I remove the uri params and get just the context itself (/member/alerts in this case)?
  • How do I tokenize this key:value data into columns and group by clientCN, uri and httpCode for a count?

P.S.: I've tried extract with pairdelim="|", kvdelim=":" with no luck (it dropped the inTime and clientCN fields, not sure why).

1 Solution

rsennett_splunk
Splunk Employee

You have essentially three delimiters in this event.
You have a pipe | which separates the sections within your DX_XSLTLog field, beginning with clientCN.
You've got a colon : which is usually the delimiter between a field name and its value, except inside the uri, where you've got key=value.
Splunk will extract the key=value pairs for you...

The rest would be like this (in props.conf, in the stanza for this source or sourcetype):

EXTRACT-DX_XSLTLog = (?<DX_XSLTLog>clientCN:.+)
EXTRACT-ClientCN = clientCN:(?<clientCN>[^\|]+)\| in DX_XSLTLog
# stop at ? or | so a uri without parameters does not run on into the next field
EXTRACT-context = uri:(?<uri>[^?|]+) in DX_XSLTLog
EXTRACT-httpCode = httpCode:(?<httpCode>\d{3}) in DX_XSLTLog

Because you say you've already extracted DX_XSLTLog, I did the same and then used it as the source field for the clientCN, uri and httpCode extractions. However, because each regex is anchored on its own key (clientCN:, uri:, httpCode:), you can just as easily drop the "in <fieldname>" part and let it pull the values out of the raw event.

As for your search... unless I'm missing something it would be this:

   ... |stats count by clientCN uri httpCode
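
If you want to try the regexes before touching props.conf, the same extractions can be sketched inline with rex. This is only a sketch: it assumes DX_XSLTLog already exists as a field, as described in the question, and the leading ... stands for your own base search.

```
... | rex field=DX_XSLTLog "clientCN:(?<clientCN>[^|]+)"
    | rex field=DX_XSLTLog "uri:(?<uri>[^?|]+)"
    | rex field=DX_XSLTLog "httpCode:(?<httpCode>\d{3})"
    | stats count by clientCN uri httpCode
```

The uri pattern stops at the first ? or |, so for the sample event it should yield /member/alerts.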

Since the delimiters are clearly marked (within the individual sections, which would need to be extracted first to make this easier), you could use transforms.conf to pull out all the key/value pairs. You don't really have to, though, unless fields sometimes appear or disappear and you don't want to refer to them statically as I have (you don't mention this at all).
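
For reference, a transforms.conf version might look roughly like the sketch below. The stanza names your_sourcetype and dx_xsltlog_kv are placeholders made up for illustration, and it assumes DX_XSLTLog is still produced by the EXTRACT line above so that it is available as a source field.

```
# props.conf
# [your_sourcetype] is a placeholder stanza name
[your_sourcetype]
REPORT-dx_kv = dx_xsltlog_kv

# transforms.conf
# dx_xsltlog_kv is a made-up transform name
[dx_xsltlog_kv]
SOURCE_KEY = DX_XSLTLog
# key is the run of letters before the colon; value runs up to the next | or ?
REGEX = ([A-Za-z]+):([^|?]+)
FORMAT = $1::$2
```

Because the value pattern allows colons, inTime should keep its full timestamp, and because it stops at ?, uri should come out without the query string. If you go this route, it is probably cleanest to drop the static clientCN, uri and httpCode EXTRACT lines so each field is defined in only one place.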

As I mentioned... you will get the uri params automatically. If you want to pull the fields after the full uri, you'll need to anchor the regex to distinguish between that run of pipe-delimited fields and the previous one...

You'll have to explain the nuances of the data though (are these always the fields, or do some show up only when they have a value?) and we can walk through making this work for you...

With Splunk... the answer is always "YES!". It just might require more regex than you're prepared for!

sureshchinta
Explorer

Thank you for your input. It not only helped me understand the problem better but also improved my thought process.

A follow-up question: right now the search returns the output as a table with columns named, say:

Client Name, Context (URI), Transaction volume

Would it be possible to show the peak volume per minute, the peak volume per hour and the maximum concurrent requests in the same search?

Thanks

lloydmartiz
Engager

Hi, I have a similar issue, please help. My events look like this:
@@XYMONDCHK-V1||hostabc|disk|10.100.40.91|green||green|1582637372|1580141124|1582639172|0|0||0|status qwehwac01.disk green Tue Feb 25 06:29:58 MST 2020 - Filesystems ok\nFilesystem 1024-blocks Used Available Capacity Mounted on\ntmpfs 32894796 76 32894720 1% /dev/shm\ntmpfs 32894796 58624 32836172 1% /run\ntmpfs 32894796 0 32894796 0% /sys/fs/cgroup\n/dev/mapper/rhel-root 116457924 2866116 113591808 3% /\n/dev/sda1 1038336 234404 803932 23% /boot\n|||0|0

I need the overview data extracted as:
||hostabc|disk|10.100.40.91|green||green|1582637372|1580141124|1582639172|0|0||0|status qwehwac01.disk green Tue Feb 25 06:29:58 MST 2020 - Filesystems ok

Drill down data extracted as:
Filesystem 1024-blocks Used Available Capacity Mounted on\ntmpfs 32894796 76 32894720 1% /dev/shm\ntmpfs 32894796 58624 32836172 1% /run\ntmpfs 32894796 0 32894796 0% /sys/fs/cgroup\n/dev/mapper/rhel-root 116457924 2866116 113591808 3% /\n/dev/sda1 1038336 234404 803932 23% /boot\n

In the above event we can skip the header line (Filesystem 1024-blocks Used Available Capacity Mounted on) and hardcode it. The overview data is delimited by | and the detail data by \n. Could someone please help me extract this using props.conf?

rsennett_splunk
Splunk Employee

Yes, of course... this should get you started: http://answers.splunk.com/answers/152322/max-of-peak-hour-volume.html
Our SEO is very good... so I googled: splunk peak volume per hour
That was the first thing that came up.
Check out the sample code... try to plug in your stuff.
If you have trouble, open a new question with the specifics. Be sure to provide event samples and your code if you create a new question.
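
For the peak-per-minute and peak-per-hour part, one possible shape of the search is sketched below. It is not taken from the linked answer. It assumes clientCN and uri are extracted as in the accepted solution, and the column names (volume, peak_per_minute, peak_per_hour) are just illustrative. Maximum concurrent requests is not covered here.

```
... | bin _time span=1m
    | stats count AS per_minute by _time clientCN uri
    | bin _time span=1h
    | stats sum(per_minute) AS per_hour max(per_minute) AS peak_minute by _time clientCN uri
    | stats sum(per_hour) AS volume max(peak_minute) AS peak_per_minute max(per_hour) AS peak_per_hour by clientCN uri
```

The first stats counts events per minute, the second rolls those minutes up into hours, and the last one keeps the total and the two peaks for each clientCN and uri.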

rsennett_splunk
Splunk Employee

And please accept my answer if it worked for you for the first part. Thanks.
