Splunk Search

How to search the difference between the stats output per field and fill "0" values in the stats output if an event does not exist?

inbroker
New Member

I have two source files, SourceA and SourceB, representing different months e.g. logs from June and July

Each source contains logs, and among other fields, it has:
ip, log_level, month

Doing a stats count by ip, log_level per source, I get results like:

(sourceA)

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5

(sourceB)

IP Address Log_Level Count

    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Note that it may happen that a certain IP doesn't have logs of a certain log_level on some months.

I would like to compare the difference between the two sources something like

(result)

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0**
    ipC                 0                   Y5 - X4
    **ipC                 1                    0 - X5**

Notice that I want to calculate the difference per IP and per log_level (so I need to 'introduce' the count value of 0 when such events are non-existing.

Appendcols didn't work because it combines the resulting rows in sequential order so it mixes e.g. (ipC,0) from source A with (ipB,1) from source B

Append didn't work because it just produces a table with all the rows from both sources like

IP Address Log_Level Count

    ipA                 0                   X1
    ipA                 1                   X2
    ipB                 0                   X3
    ipC                 0                   X4
    ipC                 1                   X5
    ipA                 0                   Y1
    ipA                 1                   Y2
    ipB                 0                   Y3
    ipB                 1                   Y4
    ipC                 0                   Y5

Join seems to be the best candidate, but both left and outer joins filter out the rows of each source that doesn't have a corresponding one on the other table, so in the end result, I am getting something like:

IP Address Log_Level (current-previous count)

    ipA                 0                   Y1 - X1
    ipA                 1                   Y2 - X2
    ipB                 0                   Y3 - X3
    **ipB                 1                   Y4 - 0** (outer) or  **ipC                 1                    0 - X5** (left)
    ipC                 0                   Y5 - X4

Is there any way to 'inject' the '0' values in the stats output if such logs are non-existent or any other possible way?

Any hints? Thanks in advance

0 Karma
1 Solution

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

View solution in original post

0 Karma

somesoni2
Revered Legend

Try something like this (assuming sourceA (previous month) and sourceB (current month) are hardcoded values)

your base search to include data from both sources | table source ip log_level | eval ip=ip."#".log_level | chart count over ip by source | rex field=ip "<?<ip>.*)#(?<log_level>.*)" | eval different=sourceB-sourceA | table ip log_level difference
0 Karma

lguinn2
Legend

Try this

yoursearchhere (source=A OR source=B)
| stats count(eval(source="A")) as sourceA count(eval(source="B")) as sourceB by ip, log_level, month
| fillnull 
| eval difference=sourceA-sourceB

Still, this won't deliver a row if both sourceA and sourceB don't have a certain log_level for a particular month. But - no need for a subsearch!

0 Karma
Get Updates on the Splunk Community!

Join Us for Splunk University and Get Your Bootcamp Game On!

If you know, you know! Splunk University is the vibe this summer so register today for bootcamps galore ...

.conf24 | Learning Tracks for Security, Observability, Platform, and Developers!

.conf24 is taking place at The Venetian in Las Vegas from June 11 - 14. Continue reading to learn about the ...

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...