Splunk Search

Help with stats for troubleshooting different result sets

responsys_cm
Builder

I have a table with the following fields:

table qualys_id,exploit_cve_id,exploit_name,exploit_source,exploit_url

Doing a dedup on exploit_cve_id,exploit_name and exploit_cve_id,exploit_url yields different results. I'm guessing that there are some data integrity issues. I would like to view a table with a count of both the exploit_name and the exploit_url appended to each result so I can sort them and try and figure out where the differences are.

So, the table would ideally be:

table qualys_id,exploit_cve_id,exploit_name,exploit_source,exploit_url,name_count,url_count

Is this possible?

Thx.

Craig

Tags (1)
0 Karma
1 Solution

lguinn2
Legend

I think so, but there are different ways to approach it. This search tries to count everything cross-tabulated with everything else. It will show you the syntax, but it may not really help with your problem.

yoursearchhere |
stats count(exploit_url) as exploit_url_count count(exploit_name) as exploit_name_count 
        dc(exploit_url) as exploit_url_unique dc(exploit_name) as exploit_name_unique
         by qualys_id exploit_cve_id exploit_name exploit_source exploit_url

The count(exploit...) functions count the number of events, while the dc(exploit...) functions count the number of unique values of the field.

The fields following the "by" are the fields that are used to break out the subtotals.

I just wonder if any of the counts will be greater than one, given the breakout.

Maybe one of these searches would be more useful to find weirdness:

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_name exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_url.

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_url exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_name. And so forth.

View solution in original post

lguinn2
Legend

I think so, but there are different ways to approach it. This search tries to count everything cross-tabulated with everything else. It will show you the syntax, but it may not really help with your problem.

yoursearchhere |
stats count(exploit_url) as exploit_url_count count(exploit_name) as exploit_name_count 
        dc(exploit_url) as exploit_url_unique dc(exploit_name) as exploit_name_unique
         by qualys_id exploit_cve_id exploit_name exploit_source exploit_url

The count(exploit...) functions count the number of events, while the dc(exploit...) functions count the number of unique values of the field.

The fields following the "by" are the fields that are used to break out the subtotals.

I just wonder if any of the counts will be greater than one, given the breakout.

Maybe one of these searches would be more useful to find weirdness:

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_name exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_url.

yoursearchhere |
stats count by by qualys_id exploit_cve_id exploit_url exploit_source |
where count > 1

would show you all the ids that probably are associated with more than one exploit_name. And so forth.

Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...