Hello!
I am trying to create a pareto chart that contains a cumulative percentage line. My data consists of time tracking data in which resources charge time to their assigned tasks. I need to sum the total time spent grouped by task and chart it. I've managed to get the result that I want without grouping by the task name:
index="time_tracking" | eventstats sum("Total Hours by Day") as "Total Effort" | sort -"Total Hours by Day" | streamstats sum("Total Hours by Day") as "Cumulative Hours" | eval "Cumulative Percent"=('Cumulative Hours'/'Total Effort')*100 | table "Task Name", "Total Hours by Day", "Cumulative Percent"
It produces output that I can graph (subset below):
Task Name|Total Hours by Day|Cumulative Percent
Task 1|9|0.4659590991438
Task 2|9|0.9319181982876
Task 2|9|1.397877297431
Task 2|9|1.863836396575
Task 3|9|2.329795495719
Task 4|9|2.795754594863
Task 4|9|3.261713694007
Task 4|9|3.727672793150
My problem is that I don't know how to group this data by field "Task Name". Whenever I introduce a new stats command, the calculations break down:
stats sum("Total Hours by Day") as "Total Task Effort" by "Task Name"
I am looking for pointers as to how best to group the data and maintain my calculations. Any input would be greatly appreciated!
Thank you!
Andrew
Give this a try
index="time_tracking" | eventstats sum("Total Hours by Day") as "Total Effort" by "Task Name" | sort "Task Name",-"Total Hours by Day" | streamstats sum("Total Hours by Day") as "Cumulative Hours" by "Task Name" | eval "Cumulative Percent"=('Cumulative Hours'/'Total Effort')*100 | table "Task Name", "Total Hours by Day", "Cumulative Percent"
If this is not what you want, please share your sample data and expected result based off the same data.
Update
The expected output made it much easier to understand and here is the query that I think would work for you.
index="time_tracking"
| stats sum("Total Hours by Day") as "Total Hours by Day" by "Task Name"
| eventstats sum("Total Hours by Day") as "Total Effort" | sort -"Total Hours by Day" | streamstats sum("Total Hours by Day") as "Cumulative Hours" | eval "Cumulative Percent"=('Cumulative Hours'/'Total Effort')*100 | table "Task Name", "Total Hours by Day", "Cumulative Percent"
Give this a try
index="time_tracking" | eventstats sum("Total Hours by Day") as "Total Effort" by "Task Name" | sort "Task Name",-"Total Hours by Day" | streamstats sum("Total Hours by Day") as "Cumulative Hours" by "Task Name" | eval "Cumulative Percent"=('Cumulative Hours'/'Total Effort')*100 | table "Task Name", "Total Hours by Day", "Cumulative Percent"
If this is not what you want, please share your sample data and expected result based off the same data.
Update
The expected output made it much easier to understand and here is the query that I think would work for you.
index="time_tracking"
| stats sum("Total Hours by Day") as "Total Hours by Day" by "Task Name"
| eventstats sum("Total Hours by Day") as "Total Effort" | sort -"Total Hours by Day" | streamstats sum("Total Hours by Day") as "Cumulative Hours" | eval "Cumulative Percent"=('Cumulative Hours'/'Total Effort')*100 | table "Task Name", "Total Hours by Day", "Cumulative Percent"
Thank you for taking the time to respond! This is a step closer I think, but not exactly what I'm looking for. I've got some new ideas to work with. As requested, here is some sample data and my expected result:
Sample Data
Task Name|Total Hours by Day
Task 1|6
Task 2|4
Task 1|4
Task 4|4
Task 2|4
Task 1|4
Task 4|2
Task 2|4
Task 4|4
Task 2|4
Task 4|4
Task 2|4
Task 1|4
Task 3|9
Task 4|2
Task 2|1
Task 1|9
Task 3|4
Expected Output
Task Name|Total Hours by Day|Cumulative Percent
Task 1|27|35.1
Task 2|21|62.3
Task 4|16|83.1
Task 3|13|100
Basically it sums the "Total Hours by Day", grouped by "Task Name", and calculates the "Cumulative Percent" based on the sum of "Total Hours by Day" and "Total Hours by Day" for the given "Task Name".