Combine search results from one source in chart

mackiae · ‎05-02-2017

I am trying to build a visualization of change data to show over time the number of concurrent changes on going. So this bit I can do for a generic "All" categories of change or by any individual category in different charts. But the final requirement is to draw one chart with all 3 Categories shown distinctly. Here is where it goes wrong.

If I split the chart in the "All" search by "risk" (change category) the counts are very weird.
If I append the search for each category the numbers are correct, but the x-axis is repeated once for each time I append (??!)
If I Join the searches the x-axis is good, first search is good the others overlay a flat line based on the value of the first value on the x-axis.

Here is the search (in "Join" form) its complex!!!!

| inputlookup snow_change_subset.csv | search (Risk="Cat 1") AND (State != "Closed Complete") AND (State != "Cancelled")

| rename Start_date as Start End_date as Stop | eval Start=strptime(Start,"%d-%m-%Y %H:%M:%S"), Stop=strptime(Stop,"%d-%m-%Y %H:%M:%S") | eval start_time=relative_time(now(),"$Start_day$")

| eval futuretime=relative_time(start_time, "$Time_Range$") | eval Start = if(Startstart_time AND Startcount(Number), max(concurrent)-count(Number),0)) as "Cat 1"

| join [
| inputlookup snow_change_subset.csv | search (Risk="Cat 2") AND (State != "Closed Complete") AND (State != "Cancelled")

| rename Start_date as Start End_date as Stop | eval Start=strptime(Start,"%d-%m-%Y %H:%M:%S"), Stop=strptime(Stop,"%d-%m-%Y %H:%M:%S") | eval start_time=relative_time(now(),"$Start_day$")

| eval futuretime=relative_time(start_time, "$Time_Range$") | eval Start = if(Startstart_time AND Startcount(Number), max(concurrent)-count(Number),0)) as "Cat 2"]

Any ideas are welcome - the two variables for start_date and time range are whole days!

mackiae · ‎05-10-2017

Hi DalJeanis, Sorry for delayed response. I tried your suggested solution above and I can see the logic behind it, all makes sense. The accum command would not allow me to split by Risk in the version of Splunk we are running here. The counting is much closer that I got using the concurrency commands, and I am very sure that it is on the verge of getting the right answer out. But alas detailed analysis of the output show there are still inconsistencies. The sponsor has decided to mothball this for now as we have other tasks to deliver (It's her budget!) - it does not help that we discovered the input data quality is poor (being generous here!).

But thanks again for taking time to look at this and for sharing your knowledge and wisdom here. If we come back to it and finally get the right numbers out I will post how we did it.

Cheers,
Mackiae

DalJeanis · ‎05-02-2017

Here's a simplification strategy that we use when dealing with problems like this. We break each change into two transactions, a start transaction with deltaCount=+1 and an end transaction with deltaCount=-1. Sum(deltaCount), at any given point, is the number of concurrent items.

That means, in your case, we are only dealing with simple records that look like this:

| table _time Risk deltaCount

Here's a quick set of sample code to use as a base...

| inputlookup snow_change_subset.csv 
| (State != "Closed Complete") AND (State != "Cancelled") 
| eval Start=strptime(Start_date,"%d-%m-%Y %H:%M:%S") 
| eval Stop=strptime(End_date,"%d-%m-%Y %H:%M:%S") 
| eval TheTimes="Start=".Start." Stop=".Stop
| table Risk TheTimes
| makemv theTimes
| mvexpand TheTimes
| rex field=TheTimes "(?<startstop>[^=]+)=(?<time>\d+)"
| eval _time=time

| rename COMMENT as "Now, being paranoid, we make sure that we have whole days....;)"
| bin _time span=1d
| eval deltaCount=if(startstop="Start",1,-1)
| table _time Risk deltaCount

| rename COMMENT as "Now we need to make sure we have a record for each day"
| appendpipe 
  [| stats min(_time) as minDay max(_time) as maxDay 
   | eval TheDays=mvrange(minDay,maxDay+1,86400) 
   | mvexpand TheDays


   | rename COMMENT as "Warning-- these Risk values need to be formatted exactly as they are in your csv."
   | eval Risk="Cat 1!!!!Cat 2!!!!Cat 3"
   | makemv delim="!!!!" Risk
   | mvexpand Risk
   | eval deltaCount=0
   | rename TheDays as _time
   | table _time Risk deltaCount
  ] 

| rename COMMENT as "We roll them up and then turn them into a running total."
| stats sum(deltaCount) as deltaCount by _time Risk  
| accum deltaCount as ConcurrentCount by Risk
| table _time Risk ConcurrentCount 

| rename COMMENT as "And I think you meant to present only some future data range."
| eval start_time=relative_time(now(),"$Start_day$") 
| eval future_time=relative_time(start_time, "$Time_Range$") 
| where (_time >= start_time AND _time <= future_time)

| rename COMMENT as "And present the findings."
| timechart sum(ConcurrentCount) as count by Risk

Note - I don't know what the intention of this code is, but I believe I've rendered it unnecessary by the simplification strategy.

 | eval Start = if(Startstart_time AND Startcount(Number), max(concurrent)-count(Number),0)) as ...

DalJeanis · ‎05-02-2017

If you give us a quick mockup of fake sample data, and a mockup of the report you'd like to get out of it, then we might be able to help you better.

mackiae · ‎05-02-2017

DalJeanis - thanks for response, rich7177 asked for exactly the same extra info, I have provided a mock data sample and description of target output in response to rich7177's post, I hope that's good with you.

thanks mackiae

Richfez · ‎05-02-2017

Though I think you are OK here, please be sure to use the code button in the toolbar in the future to make sure the editor doesn't eat special characters in searches.

Could you provide a few lines of the source data (in snow_change_subset.csv) and an example of what you want the output to look like?

mackiae · ‎05-02-2017

rich7177 - sorry rookie mistake with the code button.

So here is a mock up of some data:

Cmdb_ci Number Risk Start_date End_date System 1 TCR0168200 Cat 3 23/03/2017 00:01 23/03/2017 06:00 System 2 MCR0189509 Cat 2 20/03/2017 20:00 22/03/2017 06:00 System 3 TCR0169422 Cat 3 23/04/2017 09:00 23/04/2017 23:00 System 4 MCR0189519 Cat 3 09/03/2017 08:00 10/03/2017 13:59 System 5 MCR0191511 Cat 3 10/03/2017 00:00 15/03/2017 23:59 System 6 MCR0182035 Cat 3 10/03/2017 00:00 12/03/2017 23:59 System 7 TCR0176674 Cat 2 12/03/2017 00:00 17/03/2017 23:59 System 8 MCR0183339 Cat 3 11/03/2017 00:00 12/03/2017 00:00 System 9 TCR0188875 Cat 3 17/03/2017 01:00 23/03/2017 23:00 System 10 MCR0175245 Cat 2 18/03/2017 03:01 18/03/2017 07:00 System 11 MCR0188973 Cat 3 10/03/2017 09:00 13/03/2017 23:59 System 12 TCR0188312 Cat 3 18/03/2017 05:00 22/03/2017 10:00 System 13 TCR0172233 Cat 3 08/03/2017 08:00 09/03/2017 17:00 System 14 MCR0140647 Cat 2 18/03/2017 13:00 21/03/2017 23:59 System 15 SCR0190466 Cat 3 13/03/2017 16:45 13/03/2017 20:45 System 16 MCR0103095 Cat 1 17/03/2017 00:01 18/03/2017 06:00 System 17 MCR0190389 Cat 3 10/03/2017 03:30 10/03/2017 15:30

I'm trying to get a line chart out with the risk being used for the series, date/time on x-axis and count by Risk on y-axis. I expect to get a line for each risk category. If I count the "Start_date" then I can get this to work, but it doesn't reflect the fact that a change runs from start to end - hence the concurrency and makecontinuous commands. Thanks for taking an interest.

Combine search results from one source in chart

Adoption of RUM and APM at Splunk

Routing logs with Splunk OTel Collector for Kubernetes

Welcome to the Splunk Community!