Show items which appear in clusters

viggor · ‎01-22-2018

I have a log file of the following sort:

 vendor productId clusterId
A        1         1
B        2         1
A        3         1
C        4         4
D        8         8
D        9         8
D        10       10

Now I would like to select those vendors who have a least one productId which is contained in a cluster of size at least k. The cluster size corresponds to the number of rows with the same clusterId.

So, in the example above, companies A and B both appear in a cluster of size 3, D in a cluster of size 2 and company C in a cluster of size 1.

In SQL I would solve this using sub-queries, but I am not sure how to tackle this in splunk.

somesoni2 · ‎01-23-2018

Try this (adjust where clause per your need)

your current search giving above output with fields vendor productId clusterId
| eventstats count as clusterSize by clusterId
| where clusterSize>k | stats count as productsCount by vendor

mayurr98 · ‎01-22-2018

how do you determine cluster size?I mean what is the logic to determine cluster size for the input you have given?

viggor · ‎01-23-2018

The cluster size corresponds to the number of rows with the same clusterId.

mayurr98 · ‎01-23-2018

I did not understand A and B both appear in a cluster of size 3, D in a cluster of size 2 and company C in a cluster of size 1.
I mean this does not match with no of rows with same cluster id . Kindly explain in detail.also put the corresponding size in a table..like at each row what will be the size

Show items which appear in clusters

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics!

New in Observability Cloud - Explicit Bucket Histograms