Splunk Search

Creating a timechart for a query with percentages and a join.

clehw
Explorer

I am trying to create a timechart for a query that returns a count for a set of products that where it's lifecycle status is either compliant or our of compliance. the count is then used to create a percentage. The query returns the two counts (which i don't care about) and the associated percentage for each (which is what do want to get into a time chart for the past 90 days. I have the search working but have not been able to figure out how to get the percents (two lines, one chart) into a timechart. Below is my search any help would be appreciated. What i have so far does return a count of events but nothing in a chart and the search itself says no results found

index="index" sourcetype=productversion ((removablemedia_win OSType="Windows*" AND LifeCycleStatus!="NewVersion") OR NOT ProductName="")
| stats count(LifeCycleStatus) AS lifecycletotal
| join type=outer OSType [search index="index" sourcetype=productversion (NOT ProductName="
" (OSType="Windows*")) OR (removablemedia_win AND (OSType="Windows*") AND (LifeCycleStatus="Mainstream" OR LifeCycleStatus="Emerging")) | stats count(LifeCycleStatus) AS IsCompliant]
| eval Compliant=(IsCompliant/lifecycletotal)*100
| eval Compliant=round(Compliant,2)
| eval NonCompliant=(100-Compliant)
| eval NonCompliant=round(NonCompliant,2)
| timechart span=1d first(Compliant) as Compliant first(NonCompliant) as NonCompliant

Labels (2)
Tags (1)
0 Karma
1 Solution

rmmiller
Contributor

The proper way to do this with Splunk is to write your initial search to capture all the products that are both compliant and non-compliant. After getting all items in one search, use eval to identify items that are compliant before finally piping through timechart to make shiny graphs. Although join can be used to do this, it's not efficient, and you're extremely likely to run into search limits of join if you're going back 90 days in an environment as big as the one you're in. I did. 😉

If I understand your 2 searches correctly, you have everything you need in the results from your first search. There shouldn't be anything in the 2nd search (where you're looking for Mainstream or Emerging products) that isn't already present in the results of your first search. We might need to tweak a couple of things if that isn't quite true, but this should be a very close approximation of what you're after.

Answer:

index="index" sourcetype=productversion ((`removablemedia_win` OSType="Windows*" AND LifeCycleStatus!="NewVersion") OR NOT ProductName="*") 
| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")
| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 
| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)
| fields _time,CompliantPercent,NonCompliantPercent

Mocked-Up Data And Graph

I mocked-up your data using gentimes, evals, and some other tricks. Do let this confuse you -- it's there only to reproduce what I think your search results look like.

| gentimes start=01/01/2020 increment=30m
| eval _time=starttime
| eval LFStat=split("Emerging,Mainstream,Mainstream,Declining,Sunset,Other",",")
| eval LifeCycleStatus=mvindex(LFStat,random()%6)
| eval OSes=split("Windows 7,Windows 10,Windows Server 2008R2,Windows Server 2012R2,Windows Server 2016,RHEL7,Ubuntu 18.04,Android 10.0",",")
| eval OSType=mvindex(OSes,random()%8)
| eval ProdName=split("ProductA,ProductB,ProductC",",")
| eval ProductName=mvindex(ProdName,random()%4)
| eval RmvMed=split("True,False",",")
| eval removeablemedia_win_macro_result=mvindex(RmvMed,random()%2)
| table _time,OSType,LifeCycleStatus,ProductName,removeablemedia_win_macro_result
`comment("Mocked-up sample data with credit to to4kawa")`
| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")
`comment("The 2 eval statements below just help you understand the logic of the above eval")`
| eval IsProductNull=if(isnull(ProductName),"True","False")
| eval IsProductNotNull=if(NOT isnull(ProductName),"True","False")
`comment("Let timechart do all the heavy lifting for you")`
| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 
`comment("The compute your percentages on the back-end")`
| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)
| fields _time,CompliantPercent,NonCompliantPercent

If you don't follow all of that, don't sweat it. I'll break it down below.
Oh yeah, since I don't have your exact data, the percentage graphs in this example are going to be pretty jumpy compared to the ones you'll probably get with your real data. That's just because I made my data up using random numbers to test the logic.


Explanation

Skip down to the line that reads "comment("Mocked-up sample data with credit to to4kawa")" The stuff before that line is only to generate sample data. The stuff after that line is the logic of what you want to do -- identify compliant machines using your rules.

| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")

With this line, we're defining a new field named Compliant that mimics the effect of your joined search by applying the same logic to get to True/False. We're essentially flagging the results that meet your compliance criteria so we can use them up in a timechart function to build the graphs.

I'll admit I didn't fully understand what your intent was with the ProductName searching. On one hand, I think you're trying to get non-null values, but one of your searches also looked at the OS Version, and I got confused. Neither of these eval statements (IsProductNull and IsProductNotNull) is important for your search. They are there only for understanding, and we drop them later, anyway. This might help you determine whether your logic is correct. That's not something I can help with since I don't know your data or your rules. 🙂

| eval IsProductNull=if(isnull(ProductName),"True","False")
| eval IsProductNotNull=if(NOT isnull(ProductName),"True","False")

Moving on...
Timechart has the wonderful built-in capability to use (nearly all) the stats functions, and it can be told to compute over each day. Use it to tally the search results that meet your compliance rules for each day.

| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 

But wait, you wanted percentages! No problem. Just eval those into existence like this

| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)

Last step is to keep only the fields you want in your timechart..._time, CompliantPercent, NonCompliantPercent

| fields _time,CompliantPercent,NonCompliantPercent

Switch over to your visualization tab, set it to whatever format suits your needs, and you should be done. No join necessary.


If you don't follow this, I recommend taking my mock-up query, removing the timechart line and everything after it, and making sure the eval logic for the Compliant field produces accurate results for your needs. Get the logic right (especially with the ProductName fields), and the rest will work fine.

Hope that helps!
rmmiller

Edit: Just a touch-up on formatting the eval of the percentages. It wasn't intending properly.

View solution in original post

clehw
Explorer

"Hope that helps!" ok well... my response is WOW! That is above and beyond and the breakdown was super helpful.

I'll just clarify the one question you had on notnull. Actually with that I am searching for null in addition to, reason being is I have identified systems that show up in 3 different environments which and include macs, windows workstations and servers that do not show up in another environment(epo). So I want them to be considered not compliant. In order to accomplish that I add the machine names in the sql database with their host name, ostype and last seen from the other environments. Once they land in splunkville they will have a null product name. I only do it that way because I am not that experienced with splunk and it seemed the easiest way to get them included without adding another joint to another sourcetype. If I'm not mistaken using NOT ProductName="*" includes those hosts that do not contain a productname.

The breakdown was terrific and let me just say you are so right about efficiency my search can take a couple minutes to complete while yours takes easily less than a minute and that was hitting on 6M events for 60 days searching over 34 records! Super impressed with the efficiency of the way you do that without the join!!!

Thanks tons!!!

0 Karma

rmmiller
Contributor

The proper way to do this with Splunk is to write your initial search to capture all the products that are both compliant and non-compliant. After getting all items in one search, use eval to identify items that are compliant before finally piping through timechart to make shiny graphs. Although join can be used to do this, it's not efficient, and you're extremely likely to run into search limits of join if you're going back 90 days in an environment as big as the one you're in. I did. 😉

If I understand your 2 searches correctly, you have everything you need in the results from your first search. There shouldn't be anything in the 2nd search (where you're looking for Mainstream or Emerging products) that isn't already present in the results of your first search. We might need to tweak a couple of things if that isn't quite true, but this should be a very close approximation of what you're after.

Answer:

index="index" sourcetype=productversion ((`removablemedia_win` OSType="Windows*" AND LifeCycleStatus!="NewVersion") OR NOT ProductName="*") 
| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")
| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 
| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)
| fields _time,CompliantPercent,NonCompliantPercent

Mocked-Up Data And Graph

I mocked-up your data using gentimes, evals, and some other tricks. Do let this confuse you -- it's there only to reproduce what I think your search results look like.

| gentimes start=01/01/2020 increment=30m
| eval _time=starttime
| eval LFStat=split("Emerging,Mainstream,Mainstream,Declining,Sunset,Other",",")
| eval LifeCycleStatus=mvindex(LFStat,random()%6)
| eval OSes=split("Windows 7,Windows 10,Windows Server 2008R2,Windows Server 2012R2,Windows Server 2016,RHEL7,Ubuntu 18.04,Android 10.0",",")
| eval OSType=mvindex(OSes,random()%8)
| eval ProdName=split("ProductA,ProductB,ProductC",",")
| eval ProductName=mvindex(ProdName,random()%4)
| eval RmvMed=split("True,False",",")
| eval removeablemedia_win_macro_result=mvindex(RmvMed,random()%2)
| table _time,OSType,LifeCycleStatus,ProductName,removeablemedia_win_macro_result
`comment("Mocked-up sample data with credit to to4kawa")`
| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")
`comment("The 2 eval statements below just help you understand the logic of the above eval")`
| eval IsProductNull=if(isnull(ProductName),"True","False")
| eval IsProductNotNull=if(NOT isnull(ProductName),"True","False")
`comment("Let timechart do all the heavy lifting for you")`
| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 
`comment("The compute your percentages on the back-end")`
| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)
| fields _time,CompliantPercent,NonCompliantPercent

If you don't follow all of that, don't sweat it. I'll break it down below.
Oh yeah, since I don't have your exact data, the percentage graphs in this example are going to be pretty jumpy compared to the ones you'll probably get with your real data. That's just because I made my data up using random numbers to test the logic.


Explanation

Skip down to the line that reads "comment("Mocked-up sample data with credit to to4kawa")" The stuff before that line is only to generate sample data. The stuff after that line is the logic of what you want to do -- identify compliant machines using your rules.

| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) OR ((NOT isnull(ProductName)) AND match(OSType,"^Windows")),"True","False")

With this line, we're defining a new field named Compliant that mimics the effect of your joined search by applying the same logic to get to True/False. We're essentially flagging the results that meet your compliance criteria so we can use them up in a timechart function to build the graphs.

I'll admit I didn't fully understand what your intent was with the ProductName searching. On one hand, I think you're trying to get non-null values, but one of your searches also looked at the OS Version, and I got confused. Neither of these eval statements (IsProductNull and IsProductNotNull) is important for your search. They are there only for understanding, and we drop them later, anyway. This might help you determine whether your logic is correct. That's not something I can help with since I don't know your data or your rules. 🙂

| eval IsProductNull=if(isnull(ProductName),"True","False")
| eval IsProductNotNull=if(NOT isnull(ProductName),"True","False")

Moving on...
Timechart has the wonderful built-in capability to use (nearly all) the stats functions, and it can be told to compute over each day. Use it to tally the search results that meet your compliance rules for each day.

| timechart count(eval(Compliant=="True")) AS CompliantCount count AS TotalCount span=1d 

But wait, you wanted percentages! No problem. Just eval those into existence like this

| eval CompliantPercent=round(100*CompliantCount/TotalCount,2), NonCompliantPercent=round(100-CompliantPercent,2)

Last step is to keep only the fields you want in your timechart..._time, CompliantPercent, NonCompliantPercent

| fields _time,CompliantPercent,NonCompliantPercent

Switch over to your visualization tab, set it to whatever format suits your needs, and you should be done. No join necessary.


If you don't follow this, I recommend taking my mock-up query, removing the timechart line and everything after it, and making sure the eval logic for the Compliant field produces accurate results for your needs. Get the logic right (especially with the ProductName fields), and the rest will work fine.

Hope that helps!
rmmiller

Edit: Just a touch-up on formatting the eval of the percentages. It wasn't intending properly.

clehw
Explorer

"Hope that helps!" ok well... my response is WOW! That is above and beyond and the breakdown was super helpful.

I'll just clarify the one question you had on notnull. Actually with that I am searching for null in addition to, reason being is I have identified systems that show up in 3 different environments which include macs, windows workstations and servers that do not show up in another environment(epo). So I want them to be considered not compliant. In order to accomplish that I add the machine names in the sql database with their host name, ostype and last seen from the other environments. Once they land in splunkville they will have a null product name. I only do it that way because I am not that experienced with splunk and it seemed the easiest way to get them included without adding another join to another sourcetype. If I'm not mistaken using NOT ProductName="*" includes those hosts that do not contain a productname.

The breakdown was terrific and you are so right about efficiency! My search can take a couple minutes to complete while yours takes easily less than a minute and that was hitting on 6M events for 60 days searching over 34M records! Super impressed with the efficiency of the way you do that without the join!!!

Thank you so much this was a great answer to the question and educational as well!

Carly

0 Karma

rmmiller
Contributor

Hi Carly,

Great! I'm glad this helped you out!

Yes, NOT ProductName="*" should capture the null values. If the blank ProductName field is non-compliant, then I think the precise logic of the eval Compliant probably needs a little tweaking. I'm guessing something like this might be more appropriate for Windows OSs:

| eval Compliant=if((match(LifeCycleStatus,"(Mainstream|Emerging)") AND match(OSType,"^Windows")) AND (NOT isnull(ProductName)),"True","False")

Have a good one!
rmmiller

0 Karma

rmmiller
Contributor

2 questions from a former Allstater to current Allstater 🙂

1) Does the subsearch on your join actually produce any results when it is run independently?
search index=be03_mcafee sourcetype=productversion (NOT ProductName="" (OSType="Windows*")) OR (removablemedia_win AND (OSType="Windows*") AND (LifeCycleStatus="Mainstream" OR LifeCycleStatus="Emerging")) | stats count(LifeCycleStatus) AS IsCompliant

2) You're looking for the percent in/out of compliance for each day over the past 90 days, correct?

0 Karma

clehw
Explorer

Hello to you as well my former Allstater. 🙂

Yes correct, The sub search does return correct results when run independently and i am looking for the result to yield a line for each compliant and not compliant in the same chart over the 90 day period.

note there are asterisks inside the empty quotes... not sure they are showing correctly.

0 Karma

rmmiller
Contributor

OK, I think I understand. Join isn't what you need to do this. Working on something for you that I believe will get you what you're after.
(Edit: removed comment about streamstats -- was thinking about a different question!) 🙂

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...