Dashboards & Visualizations

Need help visualizing the query results

pench2k19
Explorer

Hi Guys,

I have the following query and query result; I am struggling to show it in a graph:

index=infra_apps sourcetype=ca:atsys:edemon:txt
| search Job=*
| rename hostname as host 
| eval time_epoch=strftime(_time,"%Y-%m-%d %H:%M:%S") 
| fields Job host Autosysjob_time Status _time time_epoch 
| lookup datalakenodeslist.csv host OUTPUT cluster
| mvexpand cluster
| table Job Status host cluster _time time_epoch
| search cluster=* AND host=*
| sort + time_epoch
| stats count  by _time Job Status host cluster time_epoch
| bin span=2m time_epoch
| makecontinuous  _time span=2m
| filldown _time Job Status host cluster count time_epoch

Query result:

_time          Job                                        Status    host   cluster      time_epoch   count
3/3/2020 8:00  1CDH_ING_NBC_ACCT_MSTR_DY_CURR_HG          STARTING  XXXX   edl-prd-m01  43893.33337  1
3/3/2020 8:00  1CDH_ING_NBC_ACCT_OB_PRIM_CK_DY_TMPRL_BMG  STARTING  XXXX   edl-prd-m01  43893.33338  1
3/3/2020 8:00  1CDH_ING_NBC_EVNT_CUST_ID_CHG_HY_HIST_CIS  RUNNING   XXXXX  edl-prd-m01  43893.33372  1
3/3/2020 8:00  1CDH_ING_NBC_EVNT_CUST_PH_CHG_HY_HIST_CIS  RUNNING   XXXX   edl-prd-m01  43893.33372  1
3/3/2020 9:00  1CDH_ING_NBC_EVNT_CUST_PH_CHG_HY_HIST_CIS  RUNNING   XXXX   edl-prd-m01  43893.33372  1

Now I am struggling to show how many jobs are running or starting at each minute. Can you please help?


woodcock
Esteemed Legend

You should take a look at the concurrency command and the answers.splunk.com articles about it.
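
For reference, concurrency needs each event to carry a start time (_time) and a duration, and it emits a concurrency field counting overlapping events. A minimal sketch of one way to apply it here, with the status values guessed from elsewhere in this thread (STARTING as the start marker, SUCCESS/FAILURE as end markers; adjust to your actual values), using transaction to derive each run's duration field:

index=infra_apps sourcetype=ca:atsys:edemon:txt Job=*
| transaction Job startswith=eval(Status=="STARTING") endswith=eval(Status=="SUCCESS" OR Status=="FAILURE")
| concurrency duration=duration
| timechart span=1m max(concurrency) AS concurrent_jobs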


sideview
SplunkTrust

I think the core problem, or at the very least a major source of confusion, is that you're taking _time, which is already an integer value (the number of seconds since Big Ben rang out the new year in 1970), and turning it into a string-formatted time, then naming that field "time_epoch", which is SUPER confusing because its values are string-formatted times, not epoch-time integers.

Then you're sorting by that string-formatted time. This is a little odd but so far not really a problem, just weird, since the original _time value can do all of that perfectly well too.

Wild guess: you're getting confused by the behavior of the Splunk UI. Whenever it sees any field called "_time", it sneakily converts the display of those values to string-formatted times on the fly. However, the values underneath are epoch-time integers. You can always verify that I'm not talking crazy by doing | eval srsly_wat=_time

Then anyway, back to our story, the problem arises with this:
| bin span=2m time_epoch

That would make sense if time_epoch were actually an epoch-time-valued field, i.e. if it held integer values. It makes no sense in this case, where the "time_epoch" field holds strings. bin has no idea what to do with that combination of field and value, so what it does is nothing at all. It throws no error, but it does nothing.
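
A minimal sketch of the two workable variants: bin _time directly, or first convert the string back into a real epoch number with strptime so bin has integers to bucket:

| bin span=2m _time

| eval time_epoch=strptime(time_epoch,"%Y-%m-%d %H:%M:%S")
| bin span=2m time_epoch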

I think if you fix that core issue, you'll be able to revisit what you're trying to do with the bin + makecontinuous + filldown. Re-examining that fresh, with some fresh caffeine, is a good idea.

 index=infra_apps sourcetype=ca:atsys:edemon:txt
 | search Job=*
 | rename hostname as host 
 | fields Job host Autosysjob_time Status _time 
 | lookup datalakenodeslist.csv host OUTPUT cluster
 | mvexpand cluster
 | table Job Status host cluster _time 
 | search cluster=* AND host=*
 | sort + _time
 | stats count  by _time Job Status host cluster 

This is a bit of an instinct, but I have seen many cases where folks fiddling with bin + makecontinuous + filldown are basically trying to recreate one of timechart's core use cases, because they've come to believe that timechart can't do it.
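
For example, the counting-jobs-per-status-per-interval case in this thread could plausibly collapse into one timechart, which buckets on _time and fills empty buckets on its own (a sketch assembled from the fields used above, not tested against the real data):

index=infra_apps sourcetype=ca:atsys:edemon:txt Job=*
| rename hostname AS host
| lookup datalakenodeslist.csv host OUTPUT cluster
| mvexpand cluster
| search cluster=* host=*
| timechart span=2m count BY Status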

I hope this helps. I don't think it's your whole answer, but I think it clears some of the mess up and puts you in a better place, and I'll update this answer if you post back.


pench2k19
Explorer

I just derived a query for the number of running jobs at each point in time, as follows:

index=XXX sourcetype=YYYY EventCode=40 AND (Status=Failure OR Status=Terminated OR Status=Running OR Status=Success) AppID=$appid$ Machine=$host$ Job=$job$
| dedup _raw
| lookup datalakenodeslist.csv host as Machine OUTPUT cluster
| search cluster=$clustername$
| eval running=if(Status="RUNNING","1","0"),status=if(Status="RUNNING","start","stop"), time=_time+status
| bin span=2m _time
| stats max(running) as running, earliest(time) as first, latest(time) as last by Job,_time
| xyseries _time Job running first last
| makecontinuous span=2m _time
| streamstats window=2 global=f earliest(last*) as last*
| reverse
| streamstats window=2 global=f earliest(first*) as first*
| reverse
| foreach running*
    [ eval <<FIELD>>=if(isnull('<<FIELD>>') AND like('first<<MATCHSTR>>',"%start"),"0",
                     if(isnull('<<FIELD>>') AND like('first<<MATCHSTR>>',"%stop"),"1",
                     if(isnull('<<FIELD>>') AND like('last<<MATCHSTR>>',"%start"),"1",
                     if(isnull('<<FIELD>>') AND like('last<<MATCHSTR>>',"%stop"),"0",'<<FIELD>>'))))]
| fields - first*, last*
| filldown *
| reverse
| filldown *
| reverse
| addtotals fieldname=Running
| fields _time,Running

But I find it too difficult to even understand the query :(, and I'm not able to edit it into a single query for both the RUNNING and STARTING statuses.

Can you please help me build a single query for both statuses?
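
A hedged guess at the first edit (untested against the real data): widen the flag so that STARTING also counts as the job being "on", leaving the rest of the pipeline unchanged. Note that the dot here is eval's explicit string-concatenation operator:

| eval running=if(Status="RUNNING" OR Status="STARTING","1","0")
| eval status=if(Status="RUNNING" OR Status="STARTING","start","stop")
| eval time=_time.status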


pench2k19
Explorer

@sideview thanks for such a detailed explanation.

The following is the latest query I am using:
index=infra_apps sourcetype=ca:atsys:edemon:txt
| search Job=*
| rename hostname as host
| eval t1=strftime(_time,"%Y-%m-%d %H:%M:%S")
| eval time_epoch=strptime(t1,"%Y-%m-%d %H:%M:%S")
| fields Job host Autosysjob_time Status _time time_epoch t1
| lookup datalakenodeslist.csv host OUTPUT cluster
| mvexpand cluster
| table Job Status host cluster _time time_epoch
| search cluster=EDL-PRD-M01 AND host=*
| sort + time_epoch
| stats count by _time Job Status host cluster time_epoch
| bin span=2m _time
| makecontinuous span=2m _time
| filldown _time Job Status host cluster count time_epoch

Sorting the results by _time before the stats command was not working for me; that's why I had to use time_epoch.

And I tried using | bin span=2m time_epoch and | bin span=2m _time, but these give me different results at times; I am still confused about which to use on this line.

PS: An example of how _time is extracted in this case: 2020-03-09 10:00:03.863
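
One possible explanation for the differing results (inferred from the pipeline shape, not verified): when you bin time_epoch, the later makecontinuous span=2m _time still operates on the never-binned _time, which keeps sub-second precision like the 10:00:03.863 above, so the generated rows land on different boundaries. Binning the same field you make continuous avoids the mismatch:

| bin span=2m _time
| makecontinuous span=2m _time
| filldown Job Status host cluster count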


pench2k19
Explorer

@richgalloway this is not helping....

I just want to give you some background about this: I have a single event for each job status in Splunk.

But to make it continuous, I am inserting 2-minute intervals using the bin and makecontinuous commands and filling those intervals with the last status the job reported.

For example, if a job is in RUNNING status at 9 AM and SUCCESS status at 10 AM, I am creating 2-minute intervals across that one-hour gap and filling them with the RUNNING status.

I suspect that is one of the reasons I am not able to get the graph I am expecting.
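
A self-contained toy of that gap-filling mechanic, using two fabricated events via makeresults in place of real job events:

| makeresults
| eval _time=strptime("2020-03-03 09:00:00","%Y-%m-%d %H:%M:%S"), Status="RUNNING"
| append [| makeresults | eval _time=strptime("2020-03-03 10:00:00","%Y-%m-%d %H:%M:%S"), Status="SUCCESS"]
| bin span=2m _time
| makecontinuous span=2m _time
| filldown Status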

PS: I am getting the correct result up to the filldown command; after that I am struggling.


pench2k19
Explorer
index=infra_apps sourcetype=ca:atsys:edemon:txt
| search Job=*
| rename hostname as host 
| eval time_epoch=strftime(_time,"%Y-%m-%d %H:%M:%S")
| fields Job host Autosysjob_time Status _time time_epoch
| lookup datalakenodeslist.csv host OUTPUT cluster 
| mvexpand cluster
| table Job Status host cluster _time time_epoch
| search cluster=EDL-PRD-M01 AND host=*
| sort + time_epoch
| stats count  by _time Job Status host cluster time_epoch
| bin span=2m _time
| makecontinuous span=2m _time
| filldown _time Job Status host cluster count time_epoch
| stats count by _time Status

This is giving me the correct result, but I am not able to add a filter like search Status=RUNNING to the query; when I add the status filter, it gives me a different result.

@woodcock @jka can you please help?
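
A hedged thought on why the filter changes things (reasoning from the pipeline shape, not tested): a Status=RUNNING filter placed early removes the non-RUNNING events whose timestamps anchor the gap-filling, so filldown propagates different values. Applying the filter after the filldown, just before the final stats, keeps the continuity intact:

| filldown _time Job Status host cluster count time_epoch
| search Status=RUNNING
| stats count by _time Status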


pench2k19
Explorer

@woodcock @jkat54 @somesoni2 @sideview can you please help?


richgalloway
SplunkTrust

Try adding | timechart span=1m count by Status to the end of your query.
