Splunk Search

How to Summarize a count by Day over the last 30 days

_gkollias
Builder

Hi All,

I am looking for duplicate invoices, and have created a search which gives me the total list. However, I would like to summarize the data to show total counts of dupes per day over the last 30 days. Here is my initial search:

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice earliest=-30d@d
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}"
| stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
| eval _time=start_time 
| eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
| sort _time
| search NOT batch_id=PO81*
| where count > 1
| table start_time, count,svc_context_name, batch_id
| dedup batch_id

Is there a way to summarize this list? Maybe using the Splunk field "date_wday"?

Thank you in advance!

Tags (3)
1 Solution

sideview
SplunkTrust
SplunkTrust

Total batch_id's that show duplicates, per day, over the last 30 days. Split by svc_context_name

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | timechart span=1d count by svc_context_name

If you want to do the same but count total duplicates across all batch_ids, we change "count" to "sum(count) as count)". and we also have to subtract one from all the counts, because if there are N total events for a batch_id, only N-1 are strictly speaking "duplicates"

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | eval numberOfDuplicates=count-1
 | timechart span=1d sum(numberOfDuplicates) as count by svc_context_name

And going back to the "total batch_ids showing any duplicates" version, we can use date_wday instead. Here we use date_wday to make the x-axis "day of week" instead of just the 30 days spread out linearly.

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | chart count over date_wday by svc_context_name

You probably just want timechart though. span=1d option and all the other goodies are documented here:
http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/Timechart

View solution in original post

sideview
SplunkTrust
SplunkTrust

Total batch_id's that show duplicates, per day, over the last 30 days. Split by svc_context_name

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | timechart span=1d count by svc_context_name

If you want to do the same but count total duplicates across all batch_ids, we change "count" to "sum(count) as count)". and we also have to subtract one from all the counts, because if there are N total events for a batch_id, only N-1 are strictly speaking "duplicates"

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | eval numberOfDuplicates=count-1
 | timechart span=1d sum(numberOfDuplicates) as count by svc_context_name

And going back to the "total batch_ids showing any duplicates" version, we can use date_wday instead. Here we use date_wday to make the x-axis "day of week" instead of just the 30 days spread out linearly.

index=contract_gateway sourcetype=esb_audit svc_context_name= bp_bp_name=Invoice status=START PublishInvoice 
punct="--::._-,,{=..},{|||....|||,||.:|----|||},{||....|}" earliest=-30d@d
 | stats count earliest(_time) as start_time, latest(_time) as end_time,  first(svc_context_name) as svc_context_name by batch_id
 | eval _time=start_time 
 | eval start_time=strftime(start_time,"%m/%d/%Y %H:%M:%S")
 | search NOT batch_id=PO81*
 | where count > 1
 | chart count over date_wday by svc_context_name

You probably just want timechart though. span=1d option and all the other goodies are documented here:
http://docs.splunk.com/Documentation/Splunk/6.1.3/SearchReference/Timechart

_gkollias
Builder

Thank you!

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...