Splunk Search

Transaction grouping not what I expect

snowmizer
Communicator

I have a summary index that contains the following information for my Windows event logs: host, sourcetype, and count. I want to group the records so that they display like this:

datetime           hostname    sourcetype                     count
--------           --------    -------------------------      -----
10/27/10 9:00 am   hostname    WinEventLog:Application        9
                               WinEventLog:Security           17034
                               WinEventLog:System             3

My search looks like:

index="winos_event_summary" | transaction orig_host maxpause=59m | table _time,orig_host,orig_sourcetype,count

This shows the following table:

10/27/10 9:00 am   hostname    WinEventLog:Application        17034
                               WinEventLog:Security           3
                               WinEventLog:System             9

The count field isn't being matched up with the proper host/sourcetype combination. When I take out the transaction statement I get:

10/27/10 9:00 am   hostname    WinEventLog:Application        9
10/27/10 9:00 am   hostname    WinEventLog:Security           17034
10/27/10 9:00 am   hostname    WinEventLog:System             3

Why does the transaction statement move the count and associate it with the wrong host/sourcetype? The count field is part of the summary record.

Thanks.

1 Solution

southeringtonp
Motivator

Your question is quite confusing.

If I understand your goal correctly, the transaction command isn't really a good fit here.

`transaction` combines multiple events into a single event. The totals are going to reflect that, and you will be dealing with independent multivalue fields for sourcetype and count rather than matched pairs. `bucket` is the command you really need, and then you can do some search language tricks to make the formatting come out the way you describe.
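
For illustration, here is roughly what the transaction-based search leaves you with: orig_sourcetype and count end up as independent multivalue fields on one merged event per host, and `table` just lays their values out positionally with no guaranteed pairing (hence the shuffled counts). This is only a sketch built on the field names from your original search; `mvzip` joins the two value lists side by side so you can see the mismatch:

index="winos_event_summary"
| transaction orig_host maxpause=59m
| eval pairing=mvzip(orig_sourcetype, count)
| table _time, orig_host, pairing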

You will need to tweak this to match your data, but consider the following approach:

index=winos_event_summary
| bucket _time span=1h
| stats sum(count) as count by _time, hostname, sourcetype
| streamstats count as lineNumber by _time, hostname
| eval timestamp=if(lineNumber==1, _time, "")
| eval hostname=if(lineNumber==1, hostname, "")
| convert ctime(timestamp)
| fields timestamp, hostname, sourcetype, count

Broken down into sections:

  • `bucket` splits your dataset into discrete time windows.
  • `stats` gathers the totals for each time window. Depending on how your summarization is set up, you might need to use `sistats` instead.
  • `streamstats` gives you an incrementing counter / line number, starting over with 1 for each section. Any time the resulting line number is 1, it's the first record for that section.
  • Use `eval` to blank out the values of "timestamp" and "hostname" any time the line number is not 1 -- i.e., blank them out for everything except the first line of the section.
  • Finally, `fields` keeps just the columns you want and gets rid of the ones you don't need.
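
If your summary records actually carry orig_host and orig_sourcetype (as your original search suggests) rather than hostname and sourcetype, the same pipeline adapted to those names would look roughly like this; the field names are only guesses from your post, so adjust them and the span to match your summary:

index="winos_event_summary"
| bucket _time span=1h
| stats sum(count) as count by _time, orig_host, orig_sourcetype
| streamstats count as lineNumber by _time, orig_host
| eval timestamp=if(lineNumber==1, _time, "")
| eval orig_host=if(lineNumber==1, orig_host, "")
| convert ctime(timestamp)
| fields timestamp, orig_host, orig_sourcetype, count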

snowmizer
Communicator

This works perfectly. Thanks for the help.
