Data filtration at field level using SED option

gvnd
Path Finder

Hi,
I am new to Splunk. I want to filter data at the field level, rather than the event level, before it is indexed. The data is pipe (|) separated.
I need only a few fields from the data below; the remaining fields are not required.
Example:
event1- 123|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
event2- 234|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
event3- 456|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12

Desired output:
event1- 123|field3|field6|field11
event2- 234|field3|field6|field11
event3- 456|field3|field6|field11

Please suggest a regex that works with the SEDCMD option in props.conf to keep only these fields.

Thanks in advance...

woodcock
Esteemed Legend

For demonstration:

| makeresults 
| eval raw="123|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
234|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12
456|field1|field2|field3|field4|field5|field6|field7|field8|field9|field10|field11|field12"
| makemv delim="
" raw
| mvexpand raw
| rename raw AS _raw
| rex mode=sed "s/^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$/\1\2\3\4/"

Therefore use:

SEDCMD-fields_0_3_6_11 = s/^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$/\1\2\3\4/
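
To sanity-check the expression outside Splunk, here is a minimal Python sketch that applies the same pattern and replacement to one of the sample events. It assumes Python's re module handles this particular pattern the same way Splunk's sed mode does, which should hold for the simple constructs used here; trim_fields is a hypothetical helper name, not part of Splunk.

```python
import re

# Same pattern as the SEDCMD above: keep fields 0, 3, 6 and 11
# of a pipe-delimited line and drop everything else.
PATTERN = re.compile(
    r"^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}"
    r"([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$"
)


def trim_fields(event: str) -> str:
    """Replace the whole event with the four captured pieces (hypothetical helper)."""
    return PATTERN.sub(r"\1\2\3\4", event, count=1)


# Rebuild the first sample event: 123|field1|...|field12
sample = "123|" + "|".join(f"field{i}" for i in range(1, 13))
print(trim_fields(sample))  # 123|field3|field6|field11
```

A line with fewer than twelve pipe-separated fields does not match the pattern at all, so it passes through unchanged rather than being truncated.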

gvnd
Path Finder

Thanks for the quick response.
Could you please explain the meaning of this part: .*$/\1\2\3\4/

woodcock
Esteemed Legend

Go to RegEx101.com and enter the first portion, ^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$, and it will show you what each part does. The .* consumes whatever is left on the line and the dollar sign anchors the match to the end of the string. A backslash followed by a number dereferences a capture group, so \1 gives the value of the first capture group, \2 the second, and so on.
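
For readers who prefer to inspect the groups in code, the following Python sketch prints what each numbered group captures for the first sample event. It is only an illustration under the assumption that Python's re module numbers groups the same way as the engine behind SEDCMD (parenthesized groups count from 1, and (?:...) groups are not numbered); the variable names are made up for the example.

```python
import re

pattern = re.compile(
    r"^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}"
    r"([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$"
)

line = "123|" + "|".join(f"field{i}" for i in range(1, 13))
match = pattern.match(line)

# Only the plain (...) groups are numbered; the (?:...) groups just skip fields.
captured = {n: match.group(n) for n in range(1, pattern.groups + 1)}
for n, text in captured.items():
    print(f"\\{n} = {text!r}")

# Concatenating the groups is what the replacement \1\2\3\4 does.
print("".join(captured.values()))
```

The first three groups include their trailing pipe and the fourth does not, which is why writing them back to back yields a correctly delimited line.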

gvnd
Path Finder

Sorry, I still didn't get the point. Also, is it possible to give names to the extracted fields? For example:
"ONE" for the first field, i.e. 123,
"TWO" for the second field, i.e. field3,
"THREE" for the third field, i.e. field6,
"FOUR" for the fourth field, i.e. field11, etc.

Also, don't we need the 'g' option at the end to replace empty strings in events with the SEDCMD syntax (SEDCMD-=s///g)?

Thanks for your patience..

woodcock
Esteemed Legend

A RegEx that begins with ^ and ends with $ matches THE ENTIRE STRING, so we only need 1 match (i.e. no g on the end). We are saying: replace THE ENTIRE STRING with the 4 captured pieces, back to back. Run it on RegEx101.com and it walks you through each piece.
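
The point about g can be checked with a small Python sketch. It treats re.sub with count=1 as a stand-in for a sed substitution without g and count=0 as a stand-in for one with g; that mapping is an assumption made for illustration, not a statement about how Splunk implements SEDCMD.

```python
import re

ANCHORED = re.compile(
    r"^([^\|]*\|)(?:[^\|]*\|){2}([^\|]*\|)(?:[^\|]*\|){2}"
    r"([^\|]*\|)(?:[^\|]*\|){4}([^\|]*).*$"
)

line = "123|" + "|".join(f"field{i}" for i in range(1, 13))

# Anchored pattern: there is exactly one match, so "first only" and "all"
# produce the same result.
first_only = ANCHORED.sub(r"\1\2\3\4", line, count=1)   # like s/.../.../
all_matches = ANCHORED.sub(r"\1\2\3\4", line, count=0)  # like s/.../.../g
match_count = sum(1 for _ in ANCHORED.finditer(line))
print(match_count, first_only == all_matches)

# Unanchored pattern for contrast: here the two modes differ.
short_first = re.sub(r"field", "f", "a|field1|field2", count=1)
short_all = re.sub(r"field", "f", "a|field1|field2", count=0)
print(short_first)  # a|f1|field2
print(short_all)    # a|f1|f2
```

The g flag only matters when a pattern can match more than once in the same event, as in the unanchored contrast case.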
