Regex to select string from raw data

pdash
Path Finder

Hi, I want to extract events that have a specific site name in the raw data. How do I extract these events?

Here are my props.conf and transforms.conf

props.conf

[iis]
TRANSFORMS-set= setnull,setparsing

transforms.conf

[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[setparsing]
REGEX = (?m)ancestry.com
DEST_KEY = queue
FORMAT = indexQueue

But the regex does not work. How should I write the regex?

eashwar
Communicator

Hello bro,
The configs below will work; I tested them in my Splunk instance.

transforms.conf

[setnull]
REGEX = .+ancestry\.co\.uk.+
DEST_KEY = queue
FORMAT = nullQueue

Don't forget to stop, clean, and start Splunk after adding the configs. Make sure props.conf and transforms.conf are in the same local directory.

If this helped you, don't forget to vote!

yours,

eashwar raghunathan

eashwar
Communicator

Hello brother,

You need to correct the way you are asking the question: you mentioned extracting events with a specific word.

It is clear from your comments that what you are trying to do is FILTER data at INDEX TIME.

Your regex looks good; just omit the (?m), it is not necessary. The reason you feel your regex is not working is that you added these configurations after you had already indexed the data. You have to clean the index and reindex the logs.

Actually, I don't know what (?m) is; I have never used it. You can explain in a comment why you used it.
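For reference, (?m) is the inline multiline flag. It only changes where the anchors ^ and $ are allowed to match, so a pattern with no anchors behaves the same with or without it. The sketch below uses Python's re module as a stand-in for the PCRE engine Splunk uses (an assumption that holds for this simple flag), and the sample text is made up for illustration.

```python
import re

# Made-up text standing in for a multi-line event.
text = "first line\nsite=ancestry.com\nlast line"

# No anchors in the pattern: (?m) makes no difference to the match.
plain = re.search(r"ancestry.com", text)
multi = re.search(r"(?m)ancestry.com", text)
same_match = plain.span() == multi.span()

# With an anchor, (?m) lets ^ match at the start of every line,
# not only at the start of the whole string.
anchored_plain = re.search(r"^site=", text)
anchored_multi = re.search(r"(?m)^site=", text)

print(same_match)                  # True
print(anchored_plain is None)      # True
print(anchored_multi is not None)  # True
```

So in the posted setparsing stanza the flag is harmless but has no effect.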

Procedure to clean your index and reindex

./splunk stop
./splunk clean eventdata -index IndexName
./splunk start

Splunk will now clean all the data indexed in the specified index, and when you start Splunk the data will get reindexed and transforms.conf will apply to the newly indexed data.

Extraction is done at index time and at search time. FILTERING is done at INDEX TIME, not at search time.

I am also new to Splunk.

If you call a transforms.conf stanza using REPORT from props.conf, it will do the extraction at search time.

If you call a transforms.conf stanza using TRANSFORMS from props.conf, it will do the extraction, routing, or filtering at index time. You are performing filtering at index time; it is not extraction.

Try to clean the index and reindex again, and don't forget to remove (?m). If you have a specific reason to keep it, you don't have to remove it, but let me know the reason.

yours,

eashwar raghunathan

happy splunking

eashwar
Communicator

If it is still not working, send me a sample log at eashwar@splunkconsultant.com. I will get back to you with the configs.

eashwar
Communicator

Hi bro, try this:
[setnull]
REGEX = (?i)ancestry.co.uk
DEST_KEY = queue
FORMAT = nullQueue
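A note on the dots in this pattern: an unescaped . matches any character, so the pattern is looser than it looks. The sketch below compares it with an escaped version, using Python's re module as a stand-in for Splunk's regex engine (an assumption). The lookalike and uppercase strings are made up for illustration.

```python
import re

loose = re.compile(r"(?i)ancestry.co.uk")     # each dot matches any character
strict = re.compile(r"(?i)ancestry\.co\.uk")  # each dot matches only "."

real = "http://trees.ancestry.co.uk/tree/15243411/family"
lookalike = "http://ancestryXcoYuk.example/"  # made-up string
upper = "HTTP://TREES.ANCESTRY.CO.UK/"        # made-up, tests (?i)

samples = (real, lookalike, upper)
loose_hits = [bool(loose.search(s)) for s in samples]
strict_hits = [bool(strict.search(s)) for s in samples]

print(loose_hits)   # [True, True, True]
print(strict_hits)  # [True, False, True]
```

Both versions find the real site name; the escaped one just avoids accidental matches.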

pdash
Path Finder

I am sending the unwanted data to the null queue and the rest to the index queue. I followed the Splunk documentation, which said to do it this way. The regex is the part where I am not sure exactly what to do. I tried putting just ancestry.com, but it does not do the trick. And I am looking at fresh data, not the already indexed data.
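The null-queue-then-index-queue pattern relies on the transforms running in the order listed in props.conf, with a later match overwriting the queue set by an earlier one. Here is a rough simulation of that behaviour. The route helper and the sample events are hypothetical, and Python's re module stands in for Splunk's regex engine.

```python
import re

# Transforms in the order listed in props.conf: setnull, then setparsing.
TRANSFORMS = [
    ("setnull", re.compile(r"."), "nullQueue"),
    ("setparsing", re.compile(r"ancestry.com"), "indexQueue"),
]

def route(raw, transforms=TRANSFORMS):
    """Return the queue an event ends up in (hypothetical helper)."""
    queue = "indexQueue"  # default when no transform matches
    for _name, regex, dest in transforms:
        if regex.search(raw):
            queue = dest  # a later match overwrites an earlier one
    return queue

# Made-up events for illustration.
events = [
    "GET /a http://trees.ancestry.com/tree/1 200",
    "GET /b http://trees.ancestry.co.uk/tree/2 200",
    "GET /c http://example.org/ 200",
]
queues = [route(e) for e in events]
print(queues)  # ['indexQueue', 'nullQueue', 'nullQueue']
```

Note that under these assumptions the ancestry.co.uk event lands in nullQueue, because the pattern ancestry.com does not match it.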

pdash
Path Finder

The event example I posted contains ancestry.co.uk. Other such events might have ancestry.com. I want to extract only those events.
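Since both ancestry.co.uk and ancestry.com need to be kept, one option is a single pattern with an alternation. The pattern below is a suggestion, not something confirmed in this thread, and it is checked here with Python's re module rather than inside Splunk. The first string is the referer portion of the sample event posted in this thread; the other two are made up.

```python
import re

# Hypothetical pattern covering both domains named in the thread.
site = re.compile(r"ancestry\.(?:com|co\.uk)")
# Pattern from the original setparsing stanza, for comparison.
original = re.compile(r"ancestry.com")

uk_event = ("http://trees.ancestry.co.uk/tree/15243411/family"
            "?cfpid=234793891&selnode=1 200 0 0 111279 2839")
com_event = "http://trees.ancestry.com/tree/1 200 0 0 100 10"  # made-up
other_event = "http://example.org/ 200 0 0 100 10"             # made-up

site_hits = [bool(site.search(e)) for e in (uk_event, com_event, other_event)]
original_hits = [bool(original.search(e)) for e in (uk_event, com_event, other_event)]

print(site_hits)      # [True, True, False]
print(original_hits)  # [False, True, False]
```

If this holds in Splunk as well, the same pattern could go in the REGEX line of the setparsing stanza.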

pdash
Path Finder

4/2/13
10:42:32.000 AM
2013-04-02 16:42:32 10.6.15.159 GET /tree/15243411/person/252269850 - 46.33.71.68 Mozilla/5.0+(Windows+NT+5.1)+AppleWebKit/537.31+(KHTML,+like+Gecko)+Chrome/26.0.1410.43+Safari/537.31 HEADER.HINTS.COUNTEXPIRES=2+Apr+2013+16:47:51+UTC;+mbox=PC(referral)|utmcmd=referral|utmcct=/neo/launch;+s_vi=[CS]v1|26DACB7485160BD4-600001A0A03976E2[CE] http://trees.ancestry.co.uk/tree/15243411/family?cfpid=234793891&selnode=1 200 0 0 111279 2839
host=TREESUI04 | sourcetype=treesiis | source=d:\inetpub\logs\W3SVC1\u_ex13040210.log | date_mday=2

kristian_kolb
Ultra Champion

eashwar is correct on both counts.

On a side note, why use a (?m) regex for single-line events?

/k

eashwar
Communicator

Please give us one sample event so that we can write a regular expression to extract the specific site name.

eashwar
Communicator

Do you want to extract fields at search time or filter data at index time?
The props and transforms example above is not for extraction; it does filtering at index time.
