Correlate STUCK and UNSTUCK weblogic threads

dbcase
Motivator

Hi,

I have the events below. What I need to do is correlate the execute thread (the 2nd one) with a STUCK message. That part is easy enough; where I get stumped is correlating further with an UNSTUCK message. What I'm hoping to have at the end is a table that shows the URL (in this example, GET /rest/icontrol/sites/72178/rules HTTP/1.1), then a column with the word STUCK, then another column with the word UNSTUCK or blank.

My effort so far is:

index=cox STUCK|rex "GET\s(?<URL>\S+)"|rex "\[STUCK] ExecuteThread:\s'(?<threadID>\S+)[']"|dedup threadID host|stats count by URL host threadID|sort host threadID

This just extracts the URL and threadID from STUCK threads and builds a simple table. I'm stuck on matching up the corresponding UNSTUCK message.

Event data:

1/23/17
11:02:50.000 PM 
####<Jan 23, 2017 11:02:50 PM EST> <Info> <WebLogicServer> <ccivirpxa0721> <managedServer12> <[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1485230570155> <BEA-000339> <[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)' has become "unstuck".> 
host =  portal2 index = linecount = 1 source =  /var/nfs/SAT_SplunkLogs/weblogic/portal2/Portal2_managedServer12.log00477.zip:./managedServer12.log00477 sourcetype =   wls_managedserver splunk_server =   idx6.icontrol.splunkcloud.com
1/23/17
11:02:40.000 PM 
####<Jan 23, 2017 11:02:40 PM EST> <Info> <WebLogicServer> <ccivirpxa0721> <managedServer12> <[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1485230560319> <BEA-000339> <[ACTIVE] ExecuteThread: '9' for queue: 'weblogic.kernel.Default (self-tuning)' has become "unstuck".> 
host =  portal2 index = linecount = 1 source =  /var/nfs/SAT_SplunkLogs/weblogic/portal2/Portal2_managedServer12.log00477.zip:./managedServer12.log00477 sourcetype =   wls_managedserver splunk_server =   idx6.icontrol.splunkcloud.com
1/23/17
11:01:58.000 PM 
####<Jan 23, 2017 11:01:58 PM EST> <Error> <WebLogicServer> <ccivirpxa0721> <managedServer12> <[ACTIVE] ExecuteThread: '26' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1485230518746> <BEA-000337> <[STUCK] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "652" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 652073 ms
[
GET /rest/icontrol/sites/72178/rules HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT 6.0; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
DNT: 1
X-format: json
X-ClientInfo: 7.3.8.77
Referer: https://portal.company.com/sp/
Cookie: JSESSIONID=F9XOmTyCrkis7WDdsku5tZO09U0_te9pCpjfHxhsAIqM30KIB53j!-695380872
Via: 1.1 10.210.192.38
X-Forwarded-For: 10.210.192.5
X-Forwarded-Host: portal.company.com
X-Forwarded-Server: 10.210.192.38
Connection: Keep-Alive
]", which is more than the configured time (StuckThreadMaxTime) of "600" seconds. Stack trace:
<snip>

lguinn2
Legend

Try this

index=cox "stuck" OR "unstuck"
| rex "GET\s(?<URL>\S+)"
| rex "\[(?<threadStatus>[STUCK|ACTIVE])\] ExecuteThread:\s'(?<threadID>\S+)[']"
| eval timestamp=strftime(_time,"%x %X")
| eval threadStatus=if(threadStatus=="ACTIVE","Unstuck",threadStatus)
| sort _time
| stats list(timestamp) as Time list(threadStatus) as "Thread Status" by host threadID

It may not be the format that you asked for, but I think it will work!

dbcase
Motivator

Hi lguinn,

Hmm, I think this gets it MUCH closer, but the rex for threadStatus isn't working.

dbcase
Motivator

Found it: this "\[(?<threadStatus>[STUCK|ACTIVE])\]" needed to be changed to this "\[(?<threadStatus>[STUCK|ACTIVE]+)\]"

Now sorting through the results; I'll let you know if any other modifications are needed.

THANKS!!!
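
For anyone reusing this fix, here is a small sketch of why the original rex failed. Square brackets define a character class, so [STUCK|ACTIVE] matches exactly one character out of S, T, U, C, K, |, A, I, V, E, and a single character can never fill the space between the literal brackets in [STUCK]. Adding + lets the class repeat, which works here but would also accept any other run of those characters. Dropping the square brackets gives a true alternation, which is stricter. The field names oneChar, charRun, and alternation are made up for this illustration, and the _raw value is a shortened copy of the sample event.

```spl
| makeresults
| eval _raw="<[STUCK] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'>"
| rex "\[(?<oneChar>[STUCK|ACTIVE])\]"
| rex "\[(?<charRun>[STUCK|ACTIVE]+)\]"
| rex "\[(?<alternation>STUCK|ACTIVE)\]"
| table oneChar charRun alternation
```

On this sample line, oneChar should come back empty, while charRun and alternation should both extract STUCK.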

lguinn2
Legend

good catch on the regular expression!

dbcase
Motivator

Hi lguinn,

Based on your query (thank you!), I modified it to get what I was hoping for. The end result looks like this:

index=cox stuck OR unstuck
| rex "GET\s(?<URL>\S+)"
| rex "(?<threadStatus>(STUCK|unstuck))"
| rex "(?:.*?ExecuteThread:\s'){2}(?<threadID>\S+)[']"
| eval timestamp=strftime(_time,"%x %X")
| sort _time
| dedup threadID host _time
| stats list(timestamp) as Time list(threadStatus) as "Thread Status" by host threadID
| sort host threadID
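
If the layout from the original question is preferred (one row per thread with the URL, a STUCK column, and an UNSTUCK column that may be blank), the same extractions can be pivoted with stats. This is only a sketch and has not been run against the real data. The field names stuckFlag and unstuckFlag are made up for illustration. It also assumes each host/threadID pair gets stuck at most once in the search window; WebLogic reuses thread IDs, so a wide time range would merge several episodes into one row.

```spl
index=cox stuck OR unstuck
| rex "GET\s(?<URL>\S+)"
| rex "(?<threadStatus>(STUCK|unstuck))"
| rex "(?:.*?ExecuteThread:\s'){2}(?<threadID>\S+)[']"
| eval stuckFlag=if(threadStatus=="STUCK", "STUCK", null())
| eval unstuckFlag=if(threadStatus=="unstuck", "UNSTUCK", null())
| stats values(URL) as URL values(stuckFlag) as STUCK values(unstuckFlag) as UNSTUCK by host threadID
| where isnotnull(STUCK)
| sort host threadID
```

The {2} in the threadID rex matters for the STUCK event. There the first ExecuteThread ('26' in the sample) is the thread that logged the message, and the second ('20') is the one that is actually stuck. In the unstuck events both occurrences name the same thread.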