
How to graph the top 10 error messages for each host?

dileepsri9
Engager

Hi All,

I am new to Splunk and am trying to create a graph that shows the top 10 error messages for each host. Can you please let me know how to do this? I have tried the following, but it is not working:

sourcetype=websphere ERROR | stats count by EventCode, host | sort limit=0 host, - count
| streamstats count as top by host
| where top <= 10
| stats list(EventCode) as EventCode, list(count) as count by host | timechart max(EventCode)

Can you please help me with this?


woodcock
Esteemed Legend

I think that you'll like this; most people don't know it, but top can take a BY clause, too:

sourcetype=websphere ERROR | top 10 EventCode BY host
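The output of top is one row per host and EventCode, with count and percent columns. As a minimal sketch of how that could be turned into a chartable table (assuming EventCode is already extracted as a field, which the thread has not confirmed), xyseries can pivot it so that each host is a row and each EventCode is a column:

```
sourcetype=websphere ERROR
| top limit=10 EventCode BY host
| xyseries host EventCode count
```

Rendered as a column or bar chart, this shows one group per host. Because each host can have a different top 10, the number of EventCode columns may be well over ten.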

DalJeanis
Legend

@woodcock - yes, I'm always dubious when doing | sort (non-zero-number) but can never pull the right head / sort / top syntax in the heat of the aircode.


DalJeanis
Legend

There are a number of structural issues with the request.

A) The top 10 might be a different set of ten for each host and on each day, so there is no limit on the overall number of codes that would have to be tracked.

B) You can't do a timechart without a time field, and timechart presents the data spread over time, so each series (the colored line or bar) would have to represent both a host and an error code. That gives you hundreds of lines and no decent way to interpret the results visually.

Here are a couple of examples of what you CAN do, given what you have...


Use this to get, by host, a count of the top 10 error messages overall

sourcetype=websphere ERROR 
| stats count as EventCount by EventCode, host 

| rename COMMENT as "Determine the top 10 EventCodes overall and mark them, killing everything else"
| appendpipe 
    [| stats sum(EventCount) as EventTotal by EventCode 
     | sort 10 - EventTotal 
     ]
| eventstats max(EventTotal) as EventTotal by EventCode 
| where isnotnull(EventTotal) AND isnotnull(EventCount)
| fields - EventTotal

| chart sum(EventCount) as count by host EventCode 
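For comparison, a shorter variant that may give a similar result: the chart command has its own limit and useother options, which keep only the highest-ranked series of the split-by field. This is a sketch rather than a tested equivalent of the query above, so compare the two outputs on your own data before relying on it:

```
sourcetype=websphere ERROR
| chart limit=10 useother=f count over host by EventCode
```

Here useother=f drops the codes outside the limit instead of lumping them into an OTHER column.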

Use this to get, across time, a count of the top 2 error messages overall for the top 10 hosts that report them

sourcetype=websphere ERROR 
| bin _time span=1d
| stats count as EventCount by EventCode, host, _time 

| rename COMMENT as "Determine the top two EventCodes of all time and mark them, killing everything else"
| appendpipe 
    [| stats sum(EventCount) as EventTotal by EventCode 
     | sort 2 - EventTotal 
     ]
| eventstats max(EventTotal) as EventTotal by EventCode 
| where isnotnull(EventTotal) AND isnotnull(EventCount)
| fields - EventTotal


| rename COMMENT as "Determine the top ten hosts with those EventCodes and mark them, killing everything else"
| appendpipe 
    [| stats sum(EventCount) as EventTotal by host 
     | sort 10 - EventTotal 
     ]
| eventstats max(EventTotal) as EventTotal by host 
| where isnotnull(EventTotal) AND isnotnull(EventCount)
| fields - EventTotal


| rename COMMENT as "Combine host and EventCode to make a single field named series"
| eval series = host." Error ".EventCode
| timechart span=1d limit=0 sum(EventCount) as count by series

richgalloway
SplunkTrust

Are you looking for error messages (text) or error codes (integers)? Your current query looks like it's looking for the latter.


dileepsri9
Engager

I am looking for error codes that appear in the event text, e.g.: Error reported: 503
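If the code only appears in the raw event text, it has to be extracted into a field before any of the searches in this thread can group by it. A minimal sketch, assuming every relevant event contains the literal text "Error reported:" followed by a numeric code (the pattern is a guess based on the single example above; the field name EventCode is reused from the earlier queries):

```
sourcetype=websphere ERROR
| rex "Error reported:\s*(?<EventCode>\d+)"
| top limit=10 EventCode BY host
```

Events that do not match the pattern get no EventCode value and are left out of the top results.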
