Hi, I want to extract events that have a specific site name in the raw data. How can I extract these events?
Here are my props.conf and transforms.conf:
[iis]
TRANSFORMS-set= setnull,setparsing
[setnull]
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue
[setparsing]
REGEX = (?m)ancestry.com
DEST_KEY = queue
FORMAT = indexQueue
But the regex does not work. How should I set the regex?
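The configuration above can be read as "run setnull, then setparsing, and let the last matching transform decide the queue". The following is only a rough Python model of that ordering, not Splunk code: the `route` function and the `TRANSFORMS` list are hypothetical names invented for illustration, and the sample strings are made up.

```python
import re

# Hypothetical model of the two transforms from the question.
# Each entry is (stanza name, REGEX, FORMAT), in the order listed in props.conf.
TRANSFORMS = [
    ("setnull", re.compile(r"."), "nullQueue"),
    ("setparsing", re.compile(r"(?m)ancestry.com"), "indexQueue"),
]


def route(event, transforms=TRANSFORMS, default="indexQueue"):
    """Return the queue an event ends up in after every transform has run."""
    queue = default
    for _name, regex, dest in transforms:
        if regex.search(event):
            queue = dest  # a later matching transform overwrites an earlier one
    return queue


if __name__ == "__main__":
    for sample in (
        "GET /tree http://trees.ancestry.com/tree/1 200",
        "GET /tree http://trees.ancestry.co.uk/tree/1 200",
        "GET /other http://example.org/ 200",
    ):
        print(route(sample), "<-", sample)
```

Under this model only the ancestry.com line reaches indexQueue. The ancestry.co.uk line is dropped together with everything else, because the pattern ancestry.com does not match it.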
Hello bro,
The configs below should work; I tested them in my Splunk instance.
transforms.conf
[setnull]
REGEX = .+ancestry\.co\.uk.+
DEST_KEY = queue
FORMAT = nullQueue
Don't forget to stop, clean, and start Splunk after adding the configs. Make sure props.conf and transforms.conf are in the same local directory.
If this helped you, don't forget to vote!
yours,
eashwar raghunathan
Hello brother,
You need to correct the way you are asking the question: you mentioned extracting events with a specific word.
It is clear from your comments that what you are trying to do is FILTERING of data at INDEX TIME.
Your regex looks good; just omit the (?m), it is not necessary. The reason you feel your regex is not working is that you added these configurations after you had already indexed the data. You have to clean the index and reindex the logs.
Actually, I don't know what (?m) is; I have never used it. You can explain in a comment why you used it.
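For reference, (?m) is the inline multiline flag: it only changes where ^ and $ are allowed to match. A small sketch with Python's re module, whose handling of this flag mirrors PCRE; the sample strings are invented for illustration:

```python
import re

line = "GET /tree http://trees.ancestry.com/tree/1 200"

# The pattern has no ^ or $ anchors, so the multiline flag changes nothing.
with_flag = re.search(r"(?m)ancestry.com", line)
without_flag = re.search(r"ancestry.com", line)
same_match = with_flag.span() == without_flag.span()

# The flag only matters once anchors are involved:
# with (?m), ^ also matches right after a newline.
text = "first line\nancestry.com on the second line"
anchored_plain = re.search(r"^ancestry", text)       # no match
anchored_multi = re.search(r"(?m)^ancestry", text)   # matches on line two

if __name__ == "__main__":
    print("same match with and without (?m):", same_match)
    print("plain ^ matched:", anchored_plain is not None)
    print("(?m)^ matched:", anchored_multi is not None)
```

So for a pattern like ancestry.com on single-line events, the flag is harmless but has no effect.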
Procedure to clean your index and reindex:
./splunk stop
./splunk clean eventdata -index IndexName
./splunk start
This removes all the data indexed in the specified index. When you start Splunk and the data is indexed again, the transforms.conf will apply to the newly indexed data.
Extraction is done at index time or at search time. FILTERING is done at INDEX TIME, not at search time.
I am also new to Splunk.
If you call a transforms.conf stanza using REPORT from props.conf, it does the extraction at search time.
If you call a transforms.conf stanza using TRANSFORMS from props.conf, it does the extraction, routing, or filtering at index time. You are performing filtering at index time; it is not extraction.
Try to clean the index and reindex again, and don't forget to remove (?m). If you have some specific reason to keep it, you don't have to remove it; let me know the reason.
yours,
eashwar raghunathan
happy splunking
If it is still not working, send me a sample log at eashwar@splunkconsultant.com. I will get back to you with the configs.
Hi bro, try this:
[setnull]
REGEX = (?i)ancestry.co.uk
DEST_KEY = queue
FORMAT = nullQueue
I am sending the unwanted data to the null queue and the rest to the index queue. I tried to follow the Splunk documentation, which said to do it this way. The regex is where I am not sure exactly what to do. I tried putting just ancestry.com but it does not do the trick. And I am looking at fresh data, not the already indexed data.
Below is an example event that contains ancestry.co.uk. Other such events might contain ancestry.com. I want to extract only those events.
4/2/13
10:42:32.000 AM
2013-04-02 16:42:32 10.6.15.159 GET /tree/15243411/person/252269850 - 46.33.71.68 Mozilla/5.0+(Windows+NT+5.1)+AppleWebKit/537.31+(KHTML,+like+Gecko)+Chrome/26.0.1410.43+Safari/537.31 HEADER.HINTS.COUNTEXPIRES=2+Apr+2013+16:47:51+UTC;+mbox=PC(referral)|utmcmd=referral|utmcct=/neo/launch;+s_vi=[CS]v1|26DACB7485160BD4-600001A0A03976E2[CE] http://trees.ancestry.co.uk/tree/15243411/family?cfpid=234793891&selnode=1 200 0 0 111279 2839
host=TREESUI04 | sourcetype=treesiis | source=d:\inetpub\logs\W3SVC1\u_ex13040210.log | date_mday=2
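The sample event above contains ancestry.co.uk, which the pattern ancestry.com cannot match. A quick way to check candidate patterns is to try them against the event text. The sketch below uses Python's re module on a shortened copy of the event (user agent and cookie fields omitted); the `both_sites` pattern is only one possible suggestion, not something taken from the thread.

```python
import re

# Shortened copy of the sample event: user agent and cookie fields are omitted.
event = (
    "2013-04-02 16:42:32 10.6.15.159 GET /tree/15243411/person/252269850 - "
    "46.33.71.68 http://trees.ancestry.co.uk/tree/15243411/family"
    "?cfpid=234793891&selnode=1 200 0 0 111279 2839"
)

original = re.compile(r"ancestry.com")                 # pattern from the question
both_sites = re.compile(r"ancestry\.(?:com|co\.uk)")   # hypothetical alternative


def matches(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    print("original vs sample event:  ", matches(original, event))
    print("both_sites vs sample event:", matches(both_sites, event))
    # An unescaped dot matches any character, so this also matches:
    print("original vs 'ancestryXcom':", matches(original, "ancestryXcom"))
```

Escaping the dots keeps the pattern from matching unrelated strings, and the alternation covers both domains. Separately, the sample shows sourcetype=treesiis while the props.conf stanza is [iis]; if treesiis is the sourcetype of the incoming data, the stanza name may need to match it for the transforms to be applied at all.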
eashwar is correct on both counts.
On a side note, why use a (?m) regex for single-line events?
/k
Please give us one sample event so that we can write you a regular expression to extract the specific site name.
Do you want to extract fields at search time or filter data at index time?
The props and transforms example above is not for extracting fields; it does the filtering at index time.