Splunk Search

Stats output into timechart

thisissplunk
Builder

Hello,

I've seen similar posts but they do not answer this question. What I'm trying to do is take the Statistics number received from a stats command and chart it out with timechart.

My search before the timechart:

index=network sourcetype=snort msg="Trojan*" | stats count first(_time) by host, src_ip, dest_ip, msg

This returns 10,000 rows (statistics number) instead of 80,000 events. How do I use that data and chart the results over a timechart? There should be 10,000 total events on the timechart, not 80,000. I can't figure it out.

In other words, the results (rows) of the stats command are in reality new events to me. I deduped 80,000 down to 10,000 and now I want to see one line on a timechart represent how many new "events" there are over time.

Tags (3)
0 Karma
1 Solution

thisissplunk
Builder

Thank you all for the pointers. After a short walk I realized that in essence, I was using the stats command to dedup the results. After this realization I exchanged stats with dedup and used that in timechart. This was successful.

This is what I ended up using for reference:

index=network sourcetype=snort name="Trojan*" | dedup host src_ip dest_ip msg | timechart span=1d count

This was a case of always using stats for multiple purposes which complicated the timechart. The "combined =" solution below was on par with what I was trying to do.

View solution in original post

thisissplunk
Builder

Thank you all for the pointers. After a short walk I realized that in essence, I was using the stats command to dedup the results. After this realization I exchanged stats with dedup and used that in timechart. This was successful.

This is what I ended up using for reference:

index=network sourcetype=snort name="Trojan*" | dedup host src_ip dest_ip msg | timechart span=1d count

This was a case of always using stats for multiple purposes which complicated the timechart. The "combined =" solution below was on par with what I was trying to do.

martin_mueller
SplunkTrust
SplunkTrust

Here's what you actually use:

index=network sourcetype=snort msg="Trojan*" | eval combined = host."_".src_ip."_".dest_ip."_".msg | timechart count by combined

That way you get your grouping by all four values and timechart is happy.

martin_mueller
SplunkTrust
SplunkTrust

Oh yeah, 10000 serieseses isn't going to be fun. Alternative search:

 index=network sourcetype=snort msg="Trojan*" | bucket _time | stats count by _time, host, src_ip, dest_ip, msg

That'll create a huge bunch of rows rather than 10000 fields.

acharlieh
Influencer

Also you may be interested in the limit and useother parameters on timechart. 10,000 different series of data is a lot to display on a graph, so you wind up with the top 10 series by default, and the other 9990 series are summed together in a field called "OTHER".

Ultimately what is it that you want to show in your chart?

thisissplunk
Builder

What I'm trying to show is the amount of rows per hour or so. "100 rows hit at 1 pm, 93 hit at 2 pm" etc.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Are you possibly looking for a | timechart span=1h dc(combined) as "Number of rows hit"?

thisissplunk
Builder

I'll try my best to explain, but maybe timechart is overkill. I want to simply chop up the RESULTS from the stats command by hour/day. I want to count how many unique rows I see in the stats output fall into each hour, by day. In other words, I want one line on the timechart to represent the AMOUNT of rows seen per hour/day of the STATS output (the rows). There should be a total of 10,000 events on the timechart, not 80,000, because 10,000 was returned by the stats command.

Imagine a line in front of you. At any hour, it should tell you how many times there was a unique rows from the stat command in that time frame.

The combined option you showed above is producing a line for each and every unique row form the stats command. This isn't what I want.

0 Karma

acharlieh
Influencer

This is slightly off topic, but something that may find interesting is the sparkline: you can have a would give you a mini graph on each event to show how that event trends over your search window.

index=network sourcetype=snort msg="Trojan*" | stats count sparkline by host, src_ip, dest_ip, msg

But also did you try martin's suggestion of:

index=network sourcetype=snort msg="Trojan*" | eval combined = host."_".src_ip."_".dest_ip."_".msg | timechart span=1h dc(combined) as "Number of rows hit"

That should be a single line giving the number of distinct combinations per hour.

martin_mueller
SplunkTrust
SplunkTrust

That sounds a lot like the | bucket _time | stats ... by _time ... approach.

jtrucks
Splunk Employee
Splunk Employee

You can't do multiple fields after a by clause for a timechart.

Also, your stats loses all time related values, so you can't ... | timechart since there is no longer time data.

If you are looking to aggregate over a timeframe, say, per hour, then you could try something like

index=network sourcetype=snort msg="Trojan*" | stats count by date_hour, host, src_ip, dest_ip, msg

This gives you a chart with the hours along the bottom.

If you need a true timechart effect, then try something more like this:

index=network sourcetype=snort msg="Trojan*" | stats count by _time, host, src_ip, dest_ip, msg

Your output will be different than when not counting by unique timestamp of the index event.

--
Jesse Trucks
Minister of Magic

acharlieh
Influencer

So a different note here, ( Reference ) date_hour is only present if the timestamp exists and is extracted from the raw event itself. It's also in the timezone of the log and not the timezone of you the splunk user (_time is the UTC timestamp, and is by default displayed in your timezone)... which could be important if you are combining logs from different timezones or across days. You could also use |eval _time=relative_time(_time,"@h"), or |bin _time span=1h or |eval hour=strftime(_time, "%H") for getting a field by hour based on the _time field.

jtrucks
Splunk Employee
Splunk Employee

I removed the original answer and replaced it.

--
Jesse Trucks
Minister of Magic

thisissplunk
Builder

It doesn't work. The problem seems to be that timechart cannot accept more than one value after the BY clause. This is what led me to crafting a stats statement first.

Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...