Knowledge Management

Any better way to compare the events being added to Splunk with a list of values which contains all the possible values and find the amount of occurence of each value ?

echozero39
Engager

I have a list of values in a .xls file, hundreds values
and a huge number of events (millions) that have been added in Splunk, having the field name "Exec" whose values must be found in the list.
However, not all values in the list can be found in the events

Now, I want a simple way to calculate the amount of occurrence for each value.

Here is what I thought :

use:

| stats count by Exec 
or
| top limit=0 Exec

And compare the results manually with the list, for those who can't be find in the events, I input 0 in the list.

use:

| eval executable=case(Exec==value1,"value1, Exec==value2, "value2", Exec==value3,"value3"...) | stats count by executable

This may get the results directly but I need to entry the command for hundreds of values and I don't even know if this is efficient.

So, can I anyone give me some advice or direction to make things easier ?
Thanks in advance for anyone who can assist.

Tags (2)
0 Karma
1 Solution

woodcock
Esteemed Legend

Put the file in $SPLUNK_HOME/var/run/splunk/ACMS_lookup.csv and try this:

inputcsv ACMS_lookup.csv | eval NomTrans = coalesce(ACMS_nomTrans,NomTrans) | append [search index=* sourcetype="*" eventtype="ACMS"] | stats count BY NomTrans | eval count = (count - 1)

View solution in original post

0 Karma

woodcock
Esteemed Legend

Put the file in $SPLUNK_HOME/var/run/splunk/ACMS_lookup.csv and try this:

inputcsv ACMS_lookup.csv | eval NomTrans = coalesce(ACMS_nomTrans,NomTrans) | append [search index=* sourcetype="*" eventtype="ACMS"] | stats count BY NomTrans | eval count = (count - 1)
0 Karma

echozero39
Engager

For a quick try I think that works generally with some contraints.

Can It be unlimit cause the maximum defaut output of append is 50000

And I think you forgot append [ search ...]

0 Karma

woodcock
Esteemed Legend

I re-edited my answer to fix the [search ...] mistake (sorry about that). You cannot eliminate the limit but...

You can dedup the subsearch to eliminate duplicates to make sure one of each gets through like this:

inputcsv ACMS_lookup.csv | eval NomTrans = coalesce(ACMS_nomTrans,NomTrans) | append [search index=* sourcetype="*" eventtype="ACMS" | dedup NomTrans] | stats count BY NomTrans | eval count = (count - 1)

If the inputcsv is always < the limit, you can swap the order of the 2 searches like this:

 index=* sourcetype="*" eventtype="ACMS" | append [search inputcsv ACMS_lookup.csv | eval NomTrans = coalesce(ACMS_nomTrans,NomTrans)] | stats count BY NomTrans | eval count = (count - 1)
0 Karma

echozero39
Engager

That's exactly what I thought and I tried this one :

index=* sourcetype="*" eventtype="ACMS" | append [inputcsv ACMS_lookup.csv | eval NomTrans = coalesce(ACMS_nomTrans,NomTrans)] | stats count BY NomTrans | eval count = (count - 1)

We don't need append [ search...] this time 😜

Thanks a lot for this question. I m gonna try the answer above to see if i can get the right command, ty again.

0 Karma

woodcock
Esteemed Legend

The 2nd option really doesn't make sense because the inputscsv source should be the bigger one, provided we dedup the other source.

0 Karma

echozero39
Engager

I am not sure if lookup is suitable for this case.
The traditional lookupis used for something like : I request a value and the lookup table return me a corresponding value. For example, http code and the meaning of codes

But in this case, I want to compare my reference list and the events from splunk, that means :

when I run a command like | stats count by NomTrans, It returns me :

    NomTrans  count

        name1, 30
        name2, 21
        name3, 1
        name6, 3

And I have a reference list of all possible values of nomTrans like :

    NomTrans
    name1
    name2
    name3
    name4
    name5
    name6

The result that I expect is a table like :

NomTrans, count
name1, 30
name2, 21
name3, 1
name4, (null) or 0
name5, (null) or 0
name6, 3

So that I can easily know if those nomTrans are used and traced and find out those who were not used.
That's my primary need.

Thanks for all who can assist.
If this cannot be done in Splunk, I suppose that I should write a program to do this sort of things.

0 Karma

jacobwilkins
Communicator

You need to take your list of executables and create a lookup table in Splunk.

Structure your lookup to have 2 columns:

Exec,in_lookup
value1,true
value2,true

The "in_lookup" will be true for each row.

Now, you could do this with the normal lookup command, but there is another pattern I usually use for this sort of thing.

| stats count by Exec 
| inputlookup append=t your_exec_lookup OUTPUT in_lookup
| fillnull value=false in_lookup
| fillnull value=0 count
| stats values(*) AS * by Exec

That sets up your dataset.

To find things in your list for which there have not been any observed events:

| search count=0 in_lookup=true

To find things that are executing, but not in your list:

| search count>0 in_lookup=false

FWIW, the values(*) AS * by somefield pattern is how the cool kids do joins in splunk.

0 Karma

echozero39
Engager

Hi, I got an error : Error in 'inputlookup' command: Invalid argument: 'OUTPUT'
So how to use a lookup table for this sort of thing ?
Thanks

0 Karma

jacobwilkins
Communicator

Oh. Whoops. That line should be:

| inputlookup append=t your_exec_lookup 
0 Karma

echozero39
Engager

And I got error with this line : | stats values(*) AS * by NomTrans

It says : Error in 'stats' command: The argument 'value(*)' is invalid

Thank you in advance for more détails...

Here is my whole command :
index=* sourcetype="*" eventtype="ACMS" | stats count by NomTrans | inputlookup append=t ACMS_lookup | fillnull value=false in_lookup | fillnull value=0 count | stats value(*) AS * by NomTrans | search count=0 in_lookup=true

0 Karma

woodcock
Esteemed Legend

It is values with an s on the end, not value. In any case, I do no think this answer is what you are looking for, did you try my latest answer (the last one)?

0 Karma

echozero39
Engager

This is somehow another way to resolve my problem, but it does not work as I wish,
In fact, I get all datas with count>0 and in_lookup=false, which means the set of data is unsuccessful.

Anyway, I just think this solution is interesting, and I am going to try your answer.

0 Karma

echozero39
Engager

Thanks for your answers, but I got something wrong.
Sry I am new to Splunk.

First of all, I created a csv file named, ACMS_lookup that have 2 columns

 ACMS_nomTrans,in_lookup
    name1,true
    name2,true

then I put this csv file in .\Splunk\etc\apps\search\lookups

And I created a transforms.conf in .\Splunk\etc\apps\search\local, where I wrote:

[ACMS_lookup]
filename = ACMS_lookup.csv

And I restarted Splunk, i thought the preparation was done.

Then I tried normal lookup but it not worked

index=* sourcetype="*" eventtype="ACMS"| stats count by NomTrans | lookup ACMS_lookup ACMS_nomTrans

And I had wrong message : Error in 'lookup' command: The lookup table 'ACMS_lookup' does not exist.

When I tried your pattern, i entried :

index=* sourcetype="*" eventtype="ACMS"| stats count by NomTrans | inputlookup append=t ACMS_lookup OUTPUT in_lookup

I got this, Error in 'inputlookup' command: Invalid argument: 'OUTPUT'

Could you still help me with that ?

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...