I am trying to get two files milli seconds from one line and merchant id from another line from the same tomcat tread where milli seconds greater than 9000. below is the sample log
tomcat-http--25 [NTAUtil] parse request
tomcat-http--25 [RequestParser] API Version: urn:schemas-xxxxxxxxxx-com:transaction-data-1.60
tomcat-http--25 [NTAUtil] parse request 19000
tomcat-http--25 [TransformFilter] Transform Request 140
tomcat-http--25 [WSSecurityFilter] merchantID: james_mid; digestAlgorithm: null
From the above log, i need 19000 & james_mid in a table. if milli sec greater than 9000.
I tried this query on below log ( including time stamp as well). when i add the where cluase (Parse request)... "merchant ID" is not getting displayed. include " transaction threadID " ZERO results.
2016-04-21 19:38:00,079 INFO tomcat-http--25 [ABCUtil] parse request
2016-04-21 19:38:00,117 INFO tomcat-http--25 [RequestParser] API Version: urn:schemas-google-com:transaction-data-1.00001
2016-04-21 19:38:00,118 INFO tomcat-http--25 [ABCUtil] parse request 39
2016-04-21 19:38:00,118 INFO tomcat-http--25 [goooooogleFilter] Transform Request 140
2016-04-21 19:38:00,120 INFO tomcat-http--25 [googleeeeeeeeeeee] merchantID: james_mid; digestAlgorithm: null
I have used this query
Query 1 : No merchant ID
host=sl*ntaapp* source="/var/mware/nta/nta_service.log" NTAUtil OR WSSecurityFilter
| rex "tomcat-http--(?\d+)"
| rex "\[NTAUtil\] parse request (?\d+)"
| rex "merchantID: (?.*?);" | where parseRequest >= 1
| table threadID parseRequest merchantID
output
threadID parseRequest merchantID
62 5
102 13
36 85
65 5
111 5
36 85
53 162
Query 2 : zero records
host=sl*ntaapp* source="/var/mware/nta/nta_service.log"
| rex "tomcat-http--(?\d+)"
| rex "\[NTAUtil\] parse request (?\d+)"
| rex "merchantID: (?.*?);"
| transaction threadID
| where parseRequest >= 1
| table threadID parseRequest merchantID
am i missing anything here?
You need to use the formatting 101010 to for queries, otherwise the editor assumes that characters like underscore are formatting characters and it makes a mess of what you are posting.
And the rex you have for merchantID will definitely NOT work.
giving it a try now.
sukundur,
Could you check the additional comment I made yesterday and try some of those examples again? I think we're close and I'd hate to get so close yet not get final resolution on this.
Thanks,
Rich
Hi Rich & IGuinn
Thank you guys for looking at the issue.
Rich. I tried your queries and they provided the results i expected.
similar issues ( with treadID) were troubling me for a long time. this helps me fix thouse issues too.
I have to fix the query a bit so that each treadID (or result) is reported as one line( instead of grouping).
so when i export if does not report all in one cell ( as many of users has same Merchant ID).
i will see and let youknow.
thanks again to both of you.
Assuming the answers to my comment above are favorable.
First, let's extract your threadID from all lines, and extract "everything else" as another field.
... my root search ... | rex field=_raw "^(?<ThreadID>[^\s]+)\s+(?<RemainingMessage>.*)$"
That should extract two fields, ThreadID
and RemainingMessage
. Now that we have those, let's parse the two RemainingMessage
fields into the bits you need.
... everything above plus ...
| rex field=RemainingMessage "^\[NTAUtil\]\s+parse\s+request\s+(?<ParseTime>\d+)"
| rex field=RemainingMessage "merchantID:\s+(?<MerchantID>[^\s;]+)"
That should extract both ParseTime
and MerchantID
.
Now that we have all those pieces in place, we can group them in a variety of ways. You could use transaction, which is often more readable code, but in this case using stats is likely to be just as easy to understand and will be far faster.
... everything above plus ...
| stats values(MerchantID) as MerchantID, values(ParseTime) as ParseTime by ThreadID
| search ParseTime>9000 | table MerchantID, ParseTime, ThreadID
So, we take out the distinct values of MerchantID
and ParseTime
for each group of ThreadID
. I used values
instead of some other function just in case your threads overlap (after restart?). Then we search where our ParseTime is greater than your 9000 threshold, then output a pretty table.
So, all together for easy pasting...
... my root search ...
| rex field=_raw "^(?<ThreadID>[^\s]+)\s+(?<RemainingMessage>.*)$"
| rex field=RemainingMessage "^\[NTAUtil\]\s+parse\s+request\s+(?<ParseTime>\d+)"
| rex field=RemainingMessage "merchantID:\s+(?<MerchantID>[^\s;]+)"
| stats values(MerchantID) as MerchantID, values(ParseTime) as ParseTime by ThreadID
| search ParseTime>9000 | table MerchantID, ParseTime, ThreadID
Give that a try and see how that works.
When I use the snippet of data you provide down below (with the timestamps), I had to make a couple of small changes to get it to completely work:
host=sl*ntaapp* source="/var/mware/nta/nta_service.log" NTAUtil OR WSSecurityFilter
| rex field=_raw "INFO\s+(?<ThreadID>[^\s]+)\s+(?<RemainingMessage>.*)$"
| rex field=RemainingMessage "^\[NTAUtil\]\s+parse\s+request\s+(?<ParseTime>\d+)"
| rex field=RemainingMessage "merchantID:\s+(?<MerchantID>[^\s;]+)"
| stats values(MerchantID) as MerchantID, values(ParseTime) as ParseTime by ThreadID
| search ParseTime>30
You'll see that I changed the first rex because I had originally set it to start at the beginning of the string, but because the actual lines look like this:
2016-04-26 07:31:00,079 INFO tomcat-http--25 [NTAUtil] parse request
I changed it to look for "INFO " at the start to find the ThreadID. I also moved the search ParseTime>9000
(which I used 30
for because that's all the data I had) to AFTER the stats. If you do it before, you remove all the lines that ParseTime isn't in, then try to group the lines that are no longer there which doesn't work so well. My bad on the original way.
That gives results:
ThreadID MerchantID ParseTime
tomcat-http--25 james_mid 39
If you'd like to use the transaction method outlined by lguinn below, feel free to. Just remove the stats and search and replace it with the transaction, like so:
host=sl*ntaapp* source="/var/mware/nta/nta_service.log" NTAUtil OR WSSecurityFilter
| rex field=_raw "tomcat-http--(?<ThreadID>\d+)\s+(?<RemainingMessage>.*)$"
| rex field=RemainingMessage "^\[NTAUtil\]\s+parse\s+request\s+(?<ParseTime>\d+)"
| rex field=RemainingMessage "merchantID:\s+(?<MerchantID>[^\s;]+)"
| transaction ThreadID
| search ParseTime>30
| table ThreadID ParseTime MerchantID
If you'd prefer the "just the number" version of ThreadID, you could do something more akin to what lguinn did with the rexes:
host=sl*ntaapp* source="/var/mware/nta/nta_service.log" NTAUtil OR WSSecurityFilter
| rex field=_raw "tomcat-http--(?<ThreadID>\d+)\s+(?<RemainingMessage>.*)$"
| rex field=RemainingMessage "^\[NTAUtil\]\s+parse\s+request\s+(?<ParseTime>\d+)"
| rex field=RemainingMessage "merchantID:\s+(?<MerchantID>[^\s;]+)"
| stats values(MerchantID) as MerchantID, values(ParseTime) as ParseTime by ThreadID
| search ParseTime>30
Which gives nearly identical output as before, only with "ThreadID" of 25
instead of tomcat-http--25
. Your choice there.
Let us know how it goes!
Oh, yeah, the fantastic and talented lguinn has the optimization of making your search only return the two lines from each thread you need:
... my root search ... NTAUtil OR WSSecurityFilter | ...
That'll trim the number of events needing to be handled by (in your small example) by 60%, and I'm assuming there are a lot more lines per thread so the actual savings and speed increase should be FAR larger than that.
Assume, as @rich7177 suggested, that 25 is the thread number. Also assume that these three fields exist: threadID parseRequest and merchantID. Then the following will give you the table that you want, I believe
sourcetype=tomcat_or_whatever NTAUtil OR WSSecurityFilter
| transaction threadID
| where parseRequest > 9000
| table threadID parseRequest merchantID
If the fields do not exist, use this search, which also extracts the fields
sourcetype=tomcat_or_whatever NTAUtil OR WSSecurityFilter
| rex "tomcat-http--(?<threadID>\d+)"
| rex "\[NTAUtil\] parse request (?<parseRequest>\d+)"
| rex "merchantID: (?<merchantID>.*?);"
| transaction threadID
| where parseRequest > 9000
| table threadID parseRequest merchantID
Is that "thread 25?" If not, how can you tell which thread you are in?
Does the ms you need always come in the line [NTAUtil] parse request?
Is merchantID already being extracted (i.e. does it show up in your field list on the left)?