Solved: mvcount and stats count give different results

viggor · ‎01-18-2018

I have a log file where each line has an itemId and a clusterId.
When I run the following two queries

| stats count(itemId) as clusterSize by clusterId 
| sort - clusterSize

vs

| stats list(itemId) AS items BY clusterId 
| eval clusterSize=mvcount(items) 
| sort -clusterSize

I get different results. I don't know if it's a coincidence, but with the second query the largest clusterSize values are exactly 100.

Does anybody have an idea?

mporath_splunk · ‎01-18-2018

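Per the Splunk documentation, list(X) "returns a list of up to 100 values of the field X as a multivalue entry." Because the list is capped at 100 values, mvcount(items) can never report more than 100, while count(itemId) counts every event.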
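
To see the cap in isolation, here is a minimal sketch that runs on generated events rather than real logs. The clusterId value "A" and the event count of 150 are made up for illustration, and it assumes the default list() limit has not been changed on your deployment.

```
| makeresults count=150
| streamstats count AS itemId
| eval clusterId="A"
| stats count(itemId) AS countSize list(itemId) AS items BY clusterId
| eval listSize=mvcount(items)
| fields clusterId countSize listSize
```

With default settings, countSize should come back as 150 and listSize as 100. If you need both the list and the true size, computing count() in the same stats call, as above, avoids relying on mvcount of a truncated list. As far as I know, the cap corresponds to the list_maxsize setting in the [stats] stanza of limits.conf, but check the documentation for your version before changing it.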

cmerriman · ‎01-18-2018

The list() function returns at most 100 field values. If a clusterId has more than 100 values of itemId, the list is truncated, which is why the second query tops out at 100.
http://docs.splunk.com/Documentation/SplunkCloud/6.6.3/SearchReference/CommonStatsFunctions#Supporte...

If you're looking for a total count of itemIds by clusterId, the first query works great. If you want to know how many unique itemIds are in each clusterId, try | stats dc(itemId) as clusterSize by clusterId
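
As a rough illustration of the difference between the total count and the distinct count, the sketch below again uses generated events. The field values are invented: 150 events whose itemId cycles through 120 distinct values, all in a single hypothetical cluster "A".

```
| makeresults count=150
| streamstats count AS n
| eval clusterId="A", itemId=n % 120
| stats count(itemId) AS totalItems dc(itemId) AS uniqueItems BY clusterId
```

Here totalItems should be 150 and uniqueItems 120, since 30 of the itemId values occur twice. Neither function is subject to the 100-value cap that list() has.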

mayurr98 · ‎01-18-2018

Hey,

list(X)
Returns a list of up to 100 values of the field X as a multivalue entry. The order of the values reflects the order of input events.

Have a look at the official doc: http://docs.splunk.com/Documentation/Splunk/7.0.1/SearchReference/Multivaluefunctions#list.28X.29

So the output of your first query is correct. In your second query, list() keeps at most 100 values per clusterId, so mvcount(items) is wrong for any cluster with more than 100 items and the largest clusterSize is exactly 100. That is why the two results don't match.

Let me know if this helps!
