Dashboards & Visualizations

How do I correlate log events in three different files and display them as a dashboard report with drilldown?

bkumarm
Contributor

I have three different log files with entries like the samples below.
Here is a description of what I am trying to achieve:

  1. Search file1 for an occurrence of "Error()" together with the value "414d512050423120202020565d7c2320099b2a". If it exists, the transaction passed in application1; proceed to search the other two files for 414d512050423120202020565d7c2320099b2a.
  2. Search file1 for an occurrence of Error("some message") together with 414d512050423120202020565d7c2320099b2a. If found, the transaction failed in application1; still proceed to search the other two files.
  3. Search file2 for occurrences of 414d512050423120202020565d7c2320099b2a and list them.
  4. Search file3 for occurrences of 414d512050423120202020565d7c2320099b2a; if there is an entry, mark it as failed in application2.
  5. Display a dashboard combining relevant data, such as time and status, for all entries for 414d512050423120202020565d7c2320099b2a from the three files.
  6. When I click on the entry 414d512050423120202020565d7c2320099b2a in the dashboard, it should take me to the log entries and display the entries from all three files.

I have tried combining the three files using an "eventtype" and then searching; however, I am not getting the intended output.
Any suggestions?

-Bharath

File1:

Mon Nov 16 2015 08:26:51 [0x00350016][mgmt][notice] source-mq(CommonBinaryPassthrough): tid(111): Service installed on port,414d512050423120202020565d7c2320099b2a, Error(Invalid Data)
Mon Nov 16 2015 08:26:51 [0x00350014][mgmt][notice] source-mq(CommonMsgPassthrough): tid(111): Operational state up,414d5120505344322020202006f84f53cd9bb002, Error()
Mon Nov 16 2015 08:26:51 [0x00350014][mgmt][notice] mpgw(CommonBinaryPassthrough): tid(111): Operational state up,414d5120505344322020202006f84f53cd9bb002, Error()
Mon Nov 16 2015 08:26:51 [0x80e00344][mq][notice] mq-qm(Common_EAIT): tid(9171729): Connection succeeded,414d512050423120202020565d7c2320099b2a, Error(URL ….)
Mon Nov 16 2015 08:26:51 [0x80e00344][mq][notice] mq-qm(Common_EAIT): tid(9171633): Connection succeeded,414d51205042312020202055a1b9d422a6e502, Error()
Mon Nov 16 2015 08:26:51 [0x00350016][mgmt][notice] source-mq(CommonDataPassthrough): tid(111): Service installed on port, Error(" Could not get response")
Mon Nov 16 2015 08:26:51 [0x00350014][mgmt][notice] source-mq(CommonBinaryPassthrough): tid(111): Operational state up, Error(" Invalid Data found")

File2:

07:27:52.820',X'414d5120505344322020202006f84f53cd9bb002',X'414d51205042513157452055c28d792f442e8f',6,'TEST.MSG.TEST1'
2015-11-16 07:28:00.176457,'TEST.MSG.TEST2',NULL,X'414d512050423120202020565d7c2320099b2a',X'414d51205042513157452055c28d792f442e8f'
2015-11-16 07:28:00.178487,'TEST.MSG.TEST3',NULL,X'414d5120505344322020202006f84f53cd9bb002',X'414d512050425131574544312020202055c28d792f442e8f'
2015-11-16 07:28:02.709618,'TEST.MSG.TEST1',DATE '2015-11-16' GMTTIME '07:28:00.950',X'414d5120505344322020202006f84f53cd9bb002',X'414d51205042513157452055c28d792f442e8f',6
2015-11-16 07:28:04.066394,'TEST.REPLY',NULL,NULL,NULL,NULL,X'414d5120505344322020202006f84f53cd9bb002',X'414d51205042513157452055c28d792f442e8f'
2015-11-16 07:40:31.533186,'TEST.MSG.TEST1 '2015-11-16' GMTTIME '07:40:31.510',X'414d51205042312020202055a1b9d422a6e502',X'000000000000000000000000000000000000000000000000',4,''

File3:

******* MessageID: X'414d5120505344322020202006f84f53cd9bb002' *******
******* 2015-11-03 11:26:29.663561 TEST.MSG.TEST1 *******
( ['APP1' : 0x11d1f2b0]
  (0x01000000:Name):RecoverableException = (
    (0x03000000:NameValue):File                 = '/mypath/Comptest.cpp' (CHARACTER)
    (0x03000000:NameValue):Line                 = 497 (INTEGER)
    (0x03000000:NameValue):Function             = 'test' (CHARACTER)
    (0x03000000:NameValue):Type                 = 'TestNode' (CHARACTER)
    (0x03000000:NameValue):Name                 = 'TEST_MSG' (CHARACTER)
    (0x03000000:NameValue):Label                = 'TEST_MSG' (CHARACTER)
    (0x03000000:NameValue):Catalog              = 'msgs' (CHARACTER)
    (0x03000000:NameValue):Severity             = 3 (INTEGER)
    (0x03000000:NameValue):Number               = 2230 (INTEGER)
    (0x03000000:NameValue):Text                 = 'Caught exception and rethrowing' (CHARACTER)

******* MessageID: X'414d51205042312020202055a1b9d422a6e502' *******
******* 2015-11-03 11:26:45.663461 TEST.MSG.TEST2 *******
( ['APP2' : 0x11d1f2b1]
  (0x01000000:Name):RecoverableException = (
    (0x03000000:NameValue):File                 = '/mypath/Comptest.cpp' (CHARACTER)
    (0x03000000:NameValue):Line                 = 497 (INTEGER)
    (0x03000000:NameValue):Function             = 'test' (CHARACTER)
    (0x03000000:NameValue):Type                 = 'TestNode' (CHARACTER)
    (0x03000000:NameValue):Name                 = 'TEST_MSG' (CHARACTER)
    (0x03000000:NameValue):Label                = 'TEST_MSG' (CHARACTER)
    (0x03000000:NameValue):Catalog              = 'msgs' (CHARACTER)
    (0x03000000:NameValue):Severity             = 3 (INTEGER)
    (0x03000000:NameValue):Number               = 2230 (INTEGER)
    (0x03000000:NameValue):Text                 = 'Caught another exception and rethrowing' (CHARACTER)

sundareshr
Legend

Sounds like the end result is a dashboard of sorts. If that is true, does all of this have to be in one search query? Why not break it up into multiple queries and display the results in dashboard panels? You could display the pass/fail in a single-value panel and the events below it. If this is acceptable, you could do something like this:

Add an input field for entering the ID (e.g. 414d512050423120202020565d7c2320099b2a), with token=id.

Search file1 for an occurrence of "Error()" together with the value "414d512050423120202020565d7c2320099b2a"; if it exists, the status is pass, else fail:

source=log1 $id$ | eval status=if(match(_raw, "Error\(\)"), "Pass", "Fail") | table status
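The `match(_raw, "Error\(\)")` test can be sanity-checked outside Splunk. Here is a minimal Python sketch of the same pass/fail rule, using lines shaped like the File1 samples above (not a Splunk API, just the regex logic):

```python
import re

def app1_status(raw: str) -> str:
    # Empty Error() means the transaction passed in application1;
    # any text inside the parentheses means it failed, mirroring
    # eval status=if(match(_raw, "Error\(\)"), "Pass", "Fail")
    return "Pass" if re.search(r"Error\(\)", raw) else "Fail"

# Lines shaped like the File1 samples above
ok = "mq-qm(Common_EAIT): tid(9171633): Connection succeeded,414d51205042312020202055a1b9d422a6e502, Error()"
bad = "source-mq(CommonBinaryPassthrough): tid(111): Service installed on port,414d512050423120202020565d7c2320099b2a, Error(Invalid Data)"

print(app1_status(ok), app1_status(bad))
```

Note that `Error\(\)` only matches empty parentheses, which is exactly why a populated `Error(Invalid Data)` falls through to "Fail".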

Search file2 for occurrences of $id$ and list them:

source=log2 $id$

Search file3 for occurrences of $id$; if there is an entry, mark it as failed in application2:

source=logs3 $id$ | stats count | eval status=if(count>0, "Fail", "Pass") | table status

Display a dashboard combining relevant data, such as time and status, for all entries for $id$ from the three files. When you click on an entry in the dashboard, it should take you to the log entries from all three files:

source=logs* $id$

Now, if you must have all this in one search, you could try something like this (please tweak appropriately)

source=logs1 414d5120505344322020202006f84f53cd9bb002 | eval id="414d5120505344322020202006f84f53cd9bb002" | stats count as a1r by id
| appendcols [search source=logs3  414d5120505344322020202006f84f53cd9bb002 | eval id="414d5120505344322020202006f84f53cd9bb002" | stats count as a2r]
| append [search  source=logs* 414d5120505344322020202006f84f53cd9bb002 | eval e=_raw | table e ]



bkumarm
Contributor

(screenshot attached)

This is the view I am trying to get to later on.


bkumarm
Contributor

I needed all of this in one search, and your suggestion worked fine when I knew the string 414d5120505344322020202006f84f53cd9bb002 beforehand. My problem is that I get this ID from File1 at runtime; it is generated by application1 and passed to application2 and application3 to keep track of the message.
So I have to extract the ID at runtime from File1 and then search for it in the other two files.


sundareshr
Legend

Well, in that case you will have to extract the fields from each of the log files. Here are a couple of regexes that could help:

'(?<id>\w+)' this will get the id field from log3

,(?<id>\w{30,}), this will get the id from logs 1 & 2

.*\((?<emsg>[^\)]*) this will get the error message string from logs 1 & 2
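Those patterns use PCRE-style named groups; in Python's `re` module the equivalent syntax is `(?P<name>...)`. A quick sketch applying them to sample lines taken from the post, just to verify what each group captures (variable names here are illustrative):

```python
import re

id_quoted = re.compile(r"'(?P<id>\w+)'")         # id from log3 (X'...' form)
id_csv    = re.compile(r",(?P<id>\w{30,}),")     # id from logs 1 & 2
emsg_re   = re.compile(r".*\((?P<emsg>[^\)]*)")  # message inside Error(...)

log3_line = "******* MessageID: X'414d5120505344322020202006f84f53cd9bb002' *******"
log1_line = ("Mon Nov 16 2015 08:26:51 [0x00350016][mgmt][notice] "
             "source-mq(CommonBinaryPassthrough): tid(111): Service installed on "
             "port,414d512050423120202020565d7c2320099b2a, Error(Invalid Data)")

print(id_quoted.search(log3_line).group("id"))
print(id_csv.search(log1_line).group("id"))
# The greedy .*\( anchors on the last "(" in the line, i.e. Error(...)
print(emsg_re.match(log1_line).group("emsg"))
```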

Then the following query should get you what you are looking for:

index=* sourcetype=* source=log1 id=* | eval a1=if(len(emsg)>0, "f", "p") | table id, a1 | join type=outer id [search index=* sourcetype=* source=log3 | eval a2="f"] | table id, a1, a2 | sort id | fillnull value="p"
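The outer join plus `fillnull` in that query amounts to a left outer join: every id from log1 keeps its a1 flag, and a2 defaults to "p" unless the id also shows up in log3. A tiny Python model of that merge (the ids here are made up for illustration):

```python
def merge_status(a1_by_id: dict, log3_ids: set) -> dict:
    # Left-outer join on id: every id from log1 keeps its a1 flag;
    # a2 is "f" when the id also appears in log3, otherwise it is
    # filled with "p" (mirroring join type=outer ... | fillnull value="p").
    return {id_: (a1, "f" if id_ in log3_ids else "p")
            for id_, a1 in a1_by_id.items()}

# Hypothetical ids, purely for illustration
rows = merge_status({"id_aaa": "f", "id_bbb": "p"}, {"id_bbb"})
print(rows)
```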

bkumarm
Contributor

You have given me a way to get the number of occurrences; that is useful. However,
I have tried a lot and still could not get there. Here is what I am trying to get:

Transaction Status

  time  414d5120505344322020202006f84f53cd9bb002  Pass
  time  414d512050423120202020565d7c2320099b2a    Fail
  time  414d51205042312020202055a1b9d422a6e502    Fail

TransactionFail App1       TransactionFail App2

  time  414d5120505344322020202006f84f53cd9bb002
  time  414d512050423120202020565d7c2320099b2a
  time  414d51205042312020202055a1b9d422a6e502


bkumarm
Contributor

Due apologies, I missed mentioning that I am searching across multiple files of each type, i.e. there are 10 files of type File1, 5 of type File2, and 1 of type File3.


Richfez
SplunkTrust

OK, that helps. I'm getting a feel for the generic problem you are really trying to solve, and what it is that you are really after. (And hopefully this will help some others understand better too.)

The remainder of this comment only sort of applies to you, and it's NOT meant to be critical, just explanation. So don't take it poorly because it's certainly not meant to be taken that way. It's more like some "generic explanation" that's probably the start of a good blog post some day. 🙂

When we have these rather large scope problems we have to start somewhere. The starts that are made are sometimes right and sometimes wrong, but in either case they help define the problem set better. Sometimes it won't be apparent why we're asking something to be tried - usually explanations come after we decide if it is a good basis to continue working from or not.

In my mind we come up with a spectrum of answers between two main problem sets (outside of technical issues or syntax fixing.)

At one end of the spectrum is "I have this data and I have some vague idea about something to do with it, please help me." Those are often far easier because there's a lot of flexibility in the answer. The answers can sometimes be totally unlike what they originally expected, but they love the answer anyway because it's great and solves their problem well.

The other end of the spectrum is "I have these reports we manually generate from tools X, Y and Z, I need to create that exact same report for Management only in Splunk now." Those can be very difficult - I remember one time where the person was complaining about the font used in the footer because they couldn't make it match the old report. The problem there for our part in trying to find a solution is a lack of flexibility.

Your problem is somewhere in between. It seems you have a very specific output you'd like, in a format you want. Are you replacing another tool's capabilities? This is fine, we totally like doing that. But it makes the answers much harder, because they now have to be "a certain right answer", not just "a right answer."

So, please continue bearing with us - we can probably get you where you are going, it just may take us a bit of trying and fiddling around before we really get going down the right path.


bkumarm
Contributor

I am not replacing an existing tool; I am doing a POC for log analytics.
The first requirement is to present in a dashboard the overall statistics of events in each log file and a status message for each transaction, based on the ID that gets attached to it.
The second requirement is to present a table that correlates the logs; when I select a transaction ID, it should take me to the search page that lists all the transactions.

Then, of course, there is a need to display a few graphs.
I am able to show graphs and statistics; however, I am struggling with tables, especially combining multiple tables into one.


sundareshr
Legend

@bkumarm can you put the final layout you are looking for in a spreadsheet and post a screenshot? I am not sure I understand what you would like the final layout to be. From the sample above, it appears you would like one table to show Pass/Fail for Application 1, and then a second table that shows, for each ID, Pass/Fail by application. Are these IDs unique? From your sample data, they don't appear to be. How do you want to show if there is more than one occurrence of the same ID?


bkumarm
Contributor

I am not allowed to upload a file/photo because of point restrictions.
If you can ping me at bharath dot kumar19 at wipro dot com, I can provide more details.


Richfez
SplunkTrust

How many of these are we talking about, too? Is it a couple of MessageIDs per day, or thousands per second? (I'm guessing somewhere in between!)

Also, your latest description of the requirements has got me thinking - very vaguely at this point, but thinking nevertheless - that perhaps there's a visualization that could help. Let me look around a bit this weekend (if I can't get to it sooner). This may be one of those things that if it works would probably just be able to be layered on top of whatever solution we're slowly working our way towards here, so by all means keep working through this.

I don't have much time today, but sundareshr, one of my thoughts this morning was to just transaction them together, then follow with an eval case to find out which "file" each GUID is in. I also thought it may be possible to search the error ones in file1, then append stats count by file for the rest. See if those thoughts trigger anything in your own mind.


sundareshr
Legend
index=* sourcetype=* source=*1 id=* | dedup id | eval a1=if(len(emsg)>0, "Fail", "Pass")
| append [search index=* sourcetype=* source=*2 id=* | dedup id | eval a2="Fail"]
| table _time, id, a1, a2 | transaction id keepevicted=t | table _time, id, a1, a2
| fillnull value="Pass" | eval s=if(match(tostring(a1+a2), "Fail"), "Fail", "Pass")
| table _time, id, s, a1, a2 | sort _time

appears to be the winner 🙂
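As a rough model of what that winning pipeline computes: each id gets an app1 flag from its extracted error message, an app2 flag from whether the id appears in the second source, and an overall status that fails if either flag fails, with missing flags filled as "Pass". A hedged Python sketch of just that decision logic (the ids are made up for illustration):

```python
def overall_status(emsg_by_id: dict, app2_ids: set) -> dict:
    # emsg_by_id: id -> error message extracted from source *1 ("" for Error())
    # app2_ids: ids seen in source *2 (presence there means a failure in app2)
    out = {}
    for id_, emsg in emsg_by_id.items():
        a1 = "Fail" if emsg else "Pass"               # eval a1=if(len(emsg)>0, ...)
        a2 = "Fail" if id_ in app2_ids else "Pass"    # append + fillnull value="Pass"
        s = "Fail" if "Fail" in (a1, a2) else "Pass"  # eval s=if(match(..., "Fail"), ...)
        out[id_] = (s, a1, a2)
    return out

# Hypothetical ids, purely for illustration
result = overall_status({"id1": "", "id2": "Invalid Data", "id3": ""}, {"id1"})
print(result)
```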


bkumarm
Contributor

Agreed, and it received excellent appreciation in today's management presentation as well.
Thank you all; special thanks to Sundaresh.

Answer to Rich: the target is to process millions of transactions per day from different users coming into this app, and to display them based on user and transaction ID.
Log size is currently about 1 GB per day for one app, and it is growing.
So it is useful to create such views in different colors, and to trigger events on failures too.


renjith_nair
SplunkTrust

You should extract the ids as fields to write better searches.

Looking at your raw data, the search below should work for you. You might need to change it according to your requirements.

index=* source="file1" OR source="file2" OR source="file3" earliest=-1h
|stats count(eval(like(_raw, "%Error%414d512050423120202020565d7c2320099b2a%"))) as count by source
|eval status=if(count>0,"Fail","Pass")
Happy Splunking!

Richfez
SplunkTrust

renjith.nair: I second your notion to get some proper extractions going. That, along with fixing the line breaking in file3, is going to be critical to making everything else flow.


bkumarm
Contributor

File3 does not have a line break; that was a mistake while pasting it here.
I have created a common eventtype that combines the three files for searching;
however, the error string is found only in File1.
I have also set up field extractions, and I am able to get these fields independently from each file.
Where I am failing is correlating between them and generating a common table with a Status based on the error message.
I have also created field aliases. Finally, I saved the search output from File1 into a lookup file, test.csv;
however, I am not clear on how to use this for searching the other two files.


bkumarm
Contributor

Your answer does not give me the expected solution. The value 414d512050423120202020565d7c2320099b2a is generated at runtime, hence we have to extract it dynamically.
Secondly, I first want to search for Error, and if it failed, i.e. there is some data within Error(....),
I want to use the extracted ID values to search the other files and then display everything in a single table.


bkumarm
Contributor

Thanks Rich,
Answers to your question about breaking in the third file:
No, the log does not break; that line was my mistake while posting it. The log is continuous, and the log structure remains the same.
What I am trying to do is create a table that displays time, message (without duplicates), status (based on the search described earlier), and maybe a few more related fields.
From this table, if the user clicks on a particular message ID, it should list all log entries with that ID (from all three files). Currently it goes to the search result, which has all entries.
Hope this answers your question.

Renjith,
I will try your suggestion and update.

Thanks,
Bharath


Richfez
SplunkTrust

This is quite a bit for one question. Let me break it into different sized and shaped chunks, ask a few questions, and see if this makes sense:

You have this data in Splunk already, right? How are the events broken? My guess is file1 and file2 are probably pretty reasonably cut into events. I'm not sure how Splunk would handle file3 - does it break the example you gave into two events, one for each MessageID? Or does it break stuff into bunches of smaller lines? The former will work, the latter will be less than optimal, so that's the first thing we need to confirm and/or fix.

Once linebreaking is straightened out, then we need to extract some things and create fields for what probably won't be automatically recognized. You'll need error messages, and .... stuff. Other things. Rex is fun, though, so this will be fun too!

All that grouping and "if this then search that" stuff in your description - that's all just your current idea on how to do it, right? Because I'd just group everything on that MessageID using stats or transaction and use a couple of evals to write in "pass" or "fail" or whatever. We'll have to get to this after we get the previous stuff straightened out, but it shouldn't be terribly difficult once the groundwork has been laid.

Once those are all done, the dashboard and drilldown won't be difficult - it'll just be the search we generated above saved in a dashboard panel. I don't do custom drilldowns often, in fact I haven't done one yet, but I'm pretty sure I've stumbled across that before and it didn't look hard. Probably easy-peasy, just like the rex stuff!

With that in mind, then, first step - can you check the event breaking on file3?

