Solved: Re: regex in subsearch

flaviadonno · ‎10-04-2013

Hi all,
I am trying to join 2 tables using a subsearch. The searches work as single search but not in the following subsearch format. Can you help me identify what is wrong?

index=* dataowner_id="BLA" stage=IT logtype="EL" 
| rex field=rec.msg ".*?TRANS: (?<trdate>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \w* (?<transaction>\d*) (?<userid>\w*) (?<cluster>\w*) (?<buid>\w*) (?<action>.*?) \]\]>" 
| search transaction=3456 AND action="special action" 
| convert timeformat="%d.%m.%Y %H:%M:%S" mktime(trdate) AS convtime 
| stats min(convtime) AS firsttime by userid 
| table userid firsttime 
| join type=outer userid [ search dataowner_id="BLA" stage=IT logtype="EL" 
    | rex field=rec.msg ".*?TRANS: (?<trdate>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \w* (?<transaction>\d*) (?<userid>\w*) (?<queue>.*?) \]\]>" 
    | search transaction=2345 
    | convert timeformat="%d.%m.%Y %H:%M:%S" mktime(trdate) AS convtime 
    | stats max(convtime) AS secondtime by userid 
    | table userid secondtime]

Thanks!

lguinn2 · ‎10-04-2013

How many events are returned by each search when you run them independently? You may be hitting the subsearch maximums.

I think this will be much faster and do the same thing:

index=* dataowner_id="BLA" stage=IT logtype="EL" 
| rex field=rec.msg ".*?TRANS: (?<trdate>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \w* (?<transaction>\d*) (?<userid>\w*) (?<cluster>\w*) (?<buid>\w*) (?<action>.*?) \]\]>" 
| eval eventStatus = case(transaction=3456 AND action="special action", "Start Time",
                      transaction=2345,"End Time",
                      1=1,"Other")
| where eventStatus != "Other"
| eval convTime = strptime(trdate,"%d.%m.%Y %H:%M:%S")
| stats min(convTime) AS firsttime max(convTime) as lasttime by userid eventStatus
| eval firsttime = if(eventStatus="Start Time",firsttime,null())
| eval lasttime = if(eventStatus="End Time",lasttime,null())
| stats min(firsttime) as firsttime max(lasttime) as lasttime  by userid

First, it performs the same initial search and field extraction. Then it categorizes each event based on the second search. Events that do not math either criteria are categorized as "Other" and then dropped.

In the next part of the processing, it calculates the epoch time for the trdate field and then calculates the minimum and maximum date for each user AND category. But since it only needs the firstttime for the starting category and the lasttime for the ending category, it sets the unneeded values to null.

Finally, it calculates the min and max time for each user. You don't need the table command.

Finally, rec.msg is not a valid Splunk field name. It may be accepting the name in some contexts, but not others.

View solution in original post

lguinn2 · ‎10-04-2013

How many events are returned by each search when you run them independently? You may be hitting the subsearch maximums.

I think this will be much faster and do the same thing:

index=* dataowner_id="BLA" stage=IT logtype="EL" 
| rex field=rec.msg ".*?TRANS: (?<trdate>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) \w* (?<transaction>\d*) (?<userid>\w*) (?<cluster>\w*) (?<buid>\w*) (?<action>.*?) \]\]>" 
| eval eventStatus = case(transaction=3456 AND action="special action", "Start Time",
                      transaction=2345,"End Time",
                      1=1,"Other")
| where eventStatus != "Other"
| eval convTime = strptime(trdate,"%d.%m.%Y %H:%M:%S")
| stats min(convTime) AS firsttime max(convTime) as lasttime by userid eventStatus
| eval firsttime = if(eventStatus="Start Time",firsttime,null())
| eval lasttime = if(eventStatus="End Time",lasttime,null())
| stats min(firsttime) as firsttime max(lasttime) as lasttime  by userid

First, it performs the same initial search and field extraction. Then it categorizes each event based on the second search. Events that do not math either criteria are categorized as "Other" and then dropped.

In the next part of the processing, it calculates the epoch time for the trdate field and then calculates the minimum and maximum date for each user AND category. But since it only needs the firstttime for the starting category and the lasttime for the ending category, it sets the unneeded values to null.

Finally, it calculates the min and max time for each user. You don't need the table command.

Finally, rec.msg is not a valid Splunk field name. It may be accepting the name in some contexts, but not others.

lguinn2 · ‎10-08-2013

Thanks for the heads-up. I edited the answer, so hopefully it will run if you cut-and-paste it!

flaviadonno · ‎10-04-2013

Wow! Cool. There was only one ")" missing at the end of the case command. Otherwise, your search did the job. Thanks a lot for the fast and effective answer.

regex in subsearch

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

They're back! Join the SplunkTrust and MVP at .conf24

Enterprise Security Content Update (ESCU) | New Releases