How can I capture these failures as timechart count by type of error in a single dashboard?

officialsubho19 · ‎10-05-2017

I have the following failures in the logs that I need to capture and show as a timechart count by type of error, in a single dashboard.
I need help framing the query.

UploadFile : Processing failed:
UploadFile : screen_error='Metadata file Transfer failed for'
UploadFile : status='failed', details='Metadata FTP failed. There is an orphan PDF on the system,
Caused by: java.lang.IllegalStateException: failed to connect

Caused by: java.lang.IllegalStateException: failed to create SFTP Session

SFTPServiceImpl : Failed to send file:

P.S.: I am just starting with Splunk and am having difficulty understanding Splunk regular expressions. I would appreciate some links to interactive tutorials.

officialsubho19 · ‎10-05-2017

Hi yannK

I do not have a strong base search, so I would like to try the first option.
But what does the ? mean in the rex expression?

| rex "screen_error='(?<error_string>Metadata file Transfer failed for)'"
| rex "Caused by: java.lang.IllegalStateException: (?<error_string>failed to connect)"

Can you help me with some examples?

yannK · ‎10-05-2017

The ( ) define the capturing group. The field name between the less-than and greater-than signs after the question mark, as in (?<fieldname>...), names the field to extract on the fly.

See http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Rex for examples.
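
As a minimal sketch of a named capturing group, the search below builds one test event with makeresults (using a message from the question as sample data) and extracts the text between the single quotes into a field. The field name error_string is only an illustration; any valid field name works.

```spl
| makeresults
| eval _raw="UploadFile : screen_error='Metadata file Transfer failed for'"
| rex "screen_error='(?<error_string>[^']+)'"
| table _raw error_string
```

If the pattern matches, error_string should hold whatever the group captured; if it does not match, the field is simply left unset for that event.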

yannK · ‎10-05-2017

Do you want to extract the "type of error" directly from the event strings?
In that case, please show us the final expected results.
Or are you trying to normalize the types of errors?

Or do you want a simple overall count of errors across all those possible messages?
In that case, a simple list of OR conditions can be enough.
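
A rough sketch of that overall count, assuming the placeholder index and source names used elsewhere in this thread and search phrases copied from the messages in the question:

```spl
index=myindex source=mysource ("Processing failed" OR "Metadata file Transfer failed" OR "Metadata FTP failed" OR "failed to connect" OR "failed to create SFTP Session" OR "Failed to send file")
| timechart count
```

This gives a single series over time with no breakdown by error type, so it only fits the case where the total is all you need.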

Here is an example of a regex plus normalization for UploadFile : screen_error='Metadata file Transfer failed for'.
You can do:

index=myindex source=mysource "screen_error" OR "java.lang.IllegalStateException"
| rex "screen_error='(?<error_string>Metadata file Transfer failed for)'"
| rex "Caused by: java.lang.IllegalStateException: (?<error_string>failed to connect)"
| eval error_type=case(
           match(error_string,"Metadata"),"Metadata error",
           match(error_string,"connect"),"Connection error",
           1=1, "unknown error")
| timechart count by error_type

You can add more case options and more rex extractions.
Remember that the rex commands run in sequence, so if a string matches several rex expressions that extract into the same field, the last one will prevail.
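
As a sketch of how the extra options might look, the search below adds rex extractions and case branches for more of the messages from the question. The index and source are the same placeholders as above, and the labels "Processing error", "SFTP session error" and "File send error" are made-up names for illustration; rename them to suit your dashboard.

```spl
index=myindex source=mysource "Processing failed" OR "screen_error" OR "java.lang.IllegalStateException" OR "SFTPServiceImpl"
| rex "UploadFile : (?<error_string>Processing failed)"
| rex "screen_error='(?<error_string>Metadata file Transfer failed for)'"
| rex "Caused by: java.lang.IllegalStateException: (?<error_string>failed to connect|failed to create SFTP Session)"
| rex "SFTPServiceImpl : (?<error_string>Failed to send file)"
| eval error_type=case(
           match(error_string,"Processing failed"),"Processing error",
           match(error_string,"Metadata"),"Metadata error",
           match(error_string,"failed to connect"),"Connection error",
           match(error_string,"SFTP Session"),"SFTP session error",
           match(error_string,"Failed to send file"),"File send error",
           1=1, "unknown error")
| timechart count by error_type
```

The case function returns the value for the first condition that is true, so put the more specific patterns before the more general ones.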

PS: If you have a strong base search, you could use "case" directly on the _raw field.

index=myindex source=mysource "screen_error" OR  "java.lang.IllegalStateException"
| eval error_type=case(     
           match(_raw,"Metadata"),"Metadata error",
           match(_raw,"failed to connect"),"Connection error",
           1=1, "unknown error")
| timechart count by error_type

