Solved: Help with stats for troubleshooting different resu...

responsys_cm · ‎06-30-2012

I have a table with the following fields:

table qualys_id,exploit_cve_id,exploit_name,exploit_source,exploit_url

Doing a dedup on exploit_cve_id,exploit_name and exploit_cve_id,exploit_url yields different results. I'm guessing that there are some data integrity issues. I would like to view a table with a count of both the exploit_name and the exploit_url appended to each result so I can sort them and try and figure out where the differences are.

So, the table would ideally be:

table qualys_id,exploit_cve_id,exploit_name,exploit_source,exploit_url,name_count,url_count

Is this possible?

Thx.

Craig

lguinn2 · ‎07-02-2012

I think so, but there are different ways to approach it. This search tries to count everything cross-tabulated with everything else. It will show you the syntax, but it may not really help with your problem.

yoursearchhere |
stats count(exploit_url) as exploit_url_count count(exploit_name) as exploit_name_count 
        dc(exploit_url) as exploit_url_unique dc(exploit_name) as exploit_name_unique
         by qualys_id exploit_cve_id exploit_name exploit_source exploit_url

The count(exploit...) functions count the number of events, while the dc(exploit...) functions count the number of unique values of the field.

The fields following the "by" are the fields that are used to break out the subtotals.

I just wonder if any of the counts will be greater than one, given the breakout.

Maybe one of these searches would be more useful to find weirdness:

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_name exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_url.

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_url exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_name. And so forth.

View solution in original post

lguinn2 · ‎07-02-2012

I think so, but there are different ways to approach it. This search tries to count everything cross-tabulated with everything else. It will show you the syntax, but it may not really help with your problem.

yoursearchhere |
stats count(exploit_url) as exploit_url_count count(exploit_name) as exploit_name_count 
        dc(exploit_url) as exploit_url_unique dc(exploit_name) as exploit_name_unique
         by qualys_id exploit_cve_id exploit_name exploit_source exploit_url

The count(exploit...) functions count the number of events, while the dc(exploit...) functions count the number of unique values of the field.

The fields following the "by" are the fields that are used to break out the subtotals.

I just wonder if any of the counts will be greater than one, given the breakout.

Maybe one of these searches would be more useful to find weirdness:

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_name exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_url.

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_url exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_name. And so forth.

Help with stats for troubleshooting different result sets

Announcing Scheduled Export GA for Dashboard Studio

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!