Splunk Search

Avg function doesn't output any value when used along with "by"

rjuliani
New Member

Hi everyone!

I'm trying to get some useful stats on my logged data. I have 3 attributes in each log entry, HARVEST_DATE, PUBLISH_DATE and MEDIA_TYPE. I would like to calculate the average difference between PUBLISH_DATE and HARVEST_DATE, as a timestamp (miliseconds).

To do so, I'm executing the following command:

index=sfmc-ris TYPE=PUSH_POST_ASYNC SUCCESS=true
| eval PUBLISH_DATE_TIMESTAMP=strptime(PUBLISH_DATE, "%Y-%m-%d %H:%M:%S,%3Q") 
| eval HARVEST_DATE_TIMESTAMP=strptime(HARVEST_DATE, "%Y-%m-%d %H:%M:%S,%3Q") 
| eval HARVEST_TO_PUBLISH=round(HARVEST_DATE_TIMESTAMP-PUBLISH_DATE_TIMESTAMP, 1) 
| stats avg(HARVEST_TO_PUBLISH) as HARVEST_PUBLISH_AVG by MEDIA_TYPE

However, that outputs no results at all for HARVEST_PUBLISH_AVG. If I remove the by MEDIA_TYPE part, it will show me the average difference without any grouping, no issues.

Can anyone help me figure out what am I doing wrong?

Thanks!

0 Karma

woodcock
Esteemed Legend

Although you surely have events for which isnotnull(HARVEST_PUBLISH_AVG) is true and also events for which isnotnull(MEDIA_TYPE) is true, you must not have any events for which both are true. You can join them together with the stats command (and also many other commands) if you have another field (e.g. host or sessionID or DOCUMENTID) like this and it should work:

index=sfmc-ris TYPE=PUSH_POST_ASYNC SUCCESS=true
| stats values(*) AS * by DOCUMENTID
| eval PUBLISH_DATE_TIMESTAMP=strptime(PUBLISH_DATE, "%Y-%m-%d %H:%M:%S,%3Q") 
| eval HARVEST_DATE_TIMESTAMP=strptime(HARVEST_DATE, "%Y-%m-%d %H:%M:%S,%3Q") 
| eval HARVEST_TO_PUBLISH=round(HARVEST_DATE_TIMESTAMP-PUBLISH_DATE_TIMESTAMP, 1) 
| stats avg(HARVEST_TO_PUBLISH) as HARVEST_PUBLISH_AVG by MEDIA_TYPE
0 Karma

somesoni2
SplunkTrust
SplunkTrust

Long shot, try something like this

 index=sfmc-ris TYPE=PUSH_POST_ASYNC SUCCESS=true| eval PUBLISH_DATE_TIMESTAMP=strptime(PUBLISH_DATE, "%Y-%m-%d %H:%M:%S,%3Q") | eval HARVEST_DATE_TIMESTAMP=strptime(HARVEST_DATE, "%Y-%m-%d %H:%M:%S,%3Q") | eval HARVEST_TO_PUBLISH=round(HARVEST_DATE_TIMESTAMP-PUBLISH_DATE_TIMESTAMP, 1) | stats count sum(HARVEST_TO_PUBLISH) as HARVEST_PUBLISH by MEDIA_TYPE | eval HARVEST_PUBLISH_AVG=HARVEST_PUBLISH /count | table MEDIA_TYPE HARVEST_PUBLISH_AVG
0 Karma

rjuliani
New Member

Thanks @somesoni2 for your suggestion. I tried it, along with some variations of the same formula, and nothing. The moment I do "by MEDIA_TYPE" the results get lost and splunk will start showing empty values.

Any other suggestion is much appreciated!

0 Karma

maciep
Champion

if you remove the stats all together, does your result set have a field called MEDIA_TYPE? All uppercase with an underscore? It would seem just off hand that the field doesn't exist, so the by clause is null, so you don't see any results.

rjuliani
New Member

Yes, my result set has MEDIA_TYPE. In fact, If I do a simple count(MEDIA_TYPE) it works fine. other more complex operations with it also work and the results are correct and as expected.

0 Karma

maciep
Champion

can you group your results by any other field?

0 Karma

rjuliani
New Member

If I use any other field, it gives me the same kind of result 😞

0 Karma

maciep
Champion

I don't suppose you're in fast mode (instead of say smart mode)? I mean, from everything else you said I'm guessing not, but thought it worth asking.

If I were you, I'd start narrowing down where the problem might be. For example, can you chart/timechart/table other fields. So is the problem with stats, or any command or a subset of commands etc.

Also, maybe go back to the beginning of the search after the first pipe and just try to do a stats count by MEDIA_TYPE (or whatever). Does that work? If so, add more of the search until it doesn't work again.

Kind of hard to go through all of those troubleshooting permutations here, but I think you get the idea.

0 Karma

rjuliani
New Member

@maciep, I'm not in Fast Mode, I'm afraid.

I can indeed chart any other fields, and even the derived field, unless I use a BY condition. The problem is not the MEDIA_TYPE field, but how splunk reacts when I try to use the BY statement.

Trying to debug as we speak 😞

0 Karma

jkat54
SplunkTrust
SplunkTrust

I agree with MacieP, you must not have a field called "MEDIA_TYPE" in the data pipe to the stats command.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...