Hello.
I'm trying to create a search that averages a sum of payments and counts the total number of days for all time(starts at the first event and keeps going until the last). The problem is the second part. My current search only returns how many events took place on each day, which gives me huge numbers. I'm only looking for the total number of each day of the week over a time period.
This is what I have right now:
base search | rex field=_raw "(?P<paymentAmount>\w+) date_wday: (?P<time>\d+)ms" | eval date_wday=strftime(_time,"%w-%A")| stats sum(paymentAmount), count(date_wday) by date_wday| eval date_wday=replace(date_wday,"(\d+-)(\w+)","\2")
Any help would be much appreciated.
First of all: beware the built-in date_wday
(I see you are making your own, which is VERY GOOD). In most Splunk datasets the date_*
fields almost always exist and because of this you may not notice that sometimes they don’t! Most people naturally assume these fields are default fields (like _time
) that are always there but these fields are actually pre-TZ-normalization side-effects from Splunk’s timestamping process. If Splunk does not (have to) parse an event to set its timestamp then the date_*
fields will not exist. Windows event logs, for example, now come in via a modular input that is designed to use the pre-parsed time as it comes from the Windows event log APIs, obviating the need for Splunk to do any timestamp parsing, therefore the date_*
fields are not created, do not exist, and are unavailable for our use. Did you catch that they are pre-TZ-normalization? That means that if you modified the TZ, then these fields DID NOT GET MODIFIED and ARE NOT CORRECT.
Anyway, your search is pretty much OK if you are trying to group all Fridays together and all Saturdays together, etc:
base search | rex "(?<paymentAmount>\w+) date_wday: (?<time>\d+)ms" | eval date_wday=strftime(_time,"%A") | stats sum(paymentAmount), count BY date_wday
But, another way to read your post is that you'd like a summary for each day, which is different:
base search | rex "(?<paymentAmount>\w+) date_wday: (?<time>\d+)ms" | bucket _time span=1d | stats sum(paymentAmount), count BY _time
Is there a reason that you are capturing field time
but not using it?
Hello! As woodcock is saying, you are not using the time you have extracted. Also, i think you don't need the by clause herere. I propose that you opdate your query like this:
base search | rex field=_raw "(?P<paymentAmount>\w+) date_wday: (?P<time>\d+)ms" | eval date_wday=strftime(time,"%w-%A")| stats sum(paymentAmount), count(date_wday) | eval date_wday=replace(date_wday,"(\d+-)(\w+)","\2")
Thanks
I tried this out using the code you posted and the code I posted in reponse to woodcock's comment. The search returned zero as the value for count. Is there a reason you wanted me to remove the by clause? I'm fairly new to Splunk, but I think I'll need it if I want the total number of days for each day of the week.
Sorry, but i don't understand when you say, you want the total number of days for each day of the week and that is why i'v said you don't need the by date_wday
clause here. I'm really sorry.
Thanks
If I wanted data from last May, I would want Splunk to tell me that there were 4 Mondays , 4 Tuesdays, 4 Wednesdays, 4 Thursdays, 5 Fridays, 5 Saturdays, and 5 Sundays in that month. Does that clear things up?
wo wo wo. Thanks. That is verry clear now. Sorry
Thanks again!
First of all: beware the built-in date_wday
(I see you are making your own, which is VERY GOOD). In most Splunk datasets the date_*
fields almost always exist and because of this you may not notice that sometimes they don’t! Most people naturally assume these fields are default fields (like _time
) that are always there but these fields are actually pre-TZ-normalization side-effects from Splunk’s timestamping process. If Splunk does not (have to) parse an event to set its timestamp then the date_*
fields will not exist. Windows event logs, for example, now come in via a modular input that is designed to use the pre-parsed time as it comes from the Windows event log APIs, obviating the need for Splunk to do any timestamp parsing, therefore the date_*
fields are not created, do not exist, and are unavailable for our use. Did you catch that they are pre-TZ-normalization? That means that if you modified the TZ, then these fields DID NOT GET MODIFIED and ARE NOT CORRECT.
Anyway, your search is pretty much OK if you are trying to group all Fridays together and all Saturdays together, etc:
base search | rex "(?<paymentAmount>\w+) date_wday: (?<time>\d+)ms" | eval date_wday=strftime(_time,"%A") | stats sum(paymentAmount), count BY date_wday
But, another way to read your post is that you'd like a summary for each day, which is different:
base search | rex "(?<paymentAmount>\w+) date_wday: (?<time>\d+)ms" | bucket _time span=1d | stats sum(paymentAmount), count BY _time
Is there a reason that you are capturing field time
but not using it?
Yes, I'm trying to group all Fridays, Saturdays, Mondays, etc. together. I still think the search is returning the wrong results, however. Right now it is telling me that there are only two Sundays, which isn't true. The events go back to January, so there should be roughly 25 or so Sundays. Another issue is that I actually overlooked the fact that there was already a date_wday field for the reported evetns. I'm not sure if it would cause some kind of overlap issue, so I changed it to this:
base search | rex "(?<paymentAmount>\w+) days: (?<time>\d+)ms" | eval days=strftime(_time,"%A") | stats sum(paymentAmount), count BY days
No, there isn't a reason I captured the time field. Should I put something else in its place?
It is counting ONLY those Sundays which have events so the search is correct. If you'd like to count all Sundays, then (first say so) and then do this:
base search | rex "(?<paymentAmount>\w+) date_wday: (?<time>\d+)ms" | timechart span=1d sum(paymentAmount) AS paymentAmount count | eval date_wday=strftime(_time,"%A") | stats sum(paymentAmount), sum(count) AS count BY date_wday
The difference is that timechart
creates 0
-value rows but stats
does not.
Sorry to keep bothering you, but could you give an example of how I would tell Splunk to count all Sundays (the step I need to do before using the code you posted)?
The code that I just posted is the complete answer and already does that; the explanation below it was bonus.
It worked. Thank you!