Splunk Search

Filtering data based on results of another sub-query

skribble5
Explorer

Hi team,

I have a query about sub-queries. I've searched this forum for a while and tried a few different things but nothing seems to give me the desired result.

I am doing analysis on data within a csv file. Eventually, I need to calculate the average of a column in this csv but only on SOME records. I have a sub-query which tells me all the users (field name is "user_only") whose records I want to exclude from the average calculation. The sub-query is also on the same csv file.

So, what I need is something that does: where user_only is NOT IN (...a list of alphanumeric identifiers)....

Here is a screenshot showing my current code, shows where in the code my sub-query is and also shows separately that the sub-query does give some results:

alt text

Here is the current search query. Would appreciate any help you can provide!

| inputlookup uao0nqok.csv 
| where read_seconds > 0
| eval campaign_delivery = substr(custom_type,len(custom_type)-7,8)
| eval user_only = substr(custom_type,0,17)  
| where user_only NOT in 
       [| inputlookup uao0nqok.csv 
        | where read_seconds > 0
        | eval user_only = substr(custom_type,0,17)  
        | eval first_char = substr(custom_type,0,1)
        | where first_char = "X" or first_char = "Y"  
        | eval goodTiming = if(read_seconds != "20", 1, 0)
        | stats max(goodTiming) as goodTiming sum(read_seconds) as read_seconds count as count by user_only
        | search goodTiming=0
        | fields user_only ]

| lookup "ERT_Campaign_mapping.csv" "Campaign Subgroup ID" as campaign_delivery OUTPUT "Campaign Subgroup Name", "Email Sent Date"
| stats sum(read_seconds) as read_seconds sum(count) as count_records by "Campaign Subgroup Name", campaign_delivery, "Email Sent Date"
| eval avg = read_seconds / count_records
0 Karma
1 Solution

knielsen
Contributor

Did you try replacing

| where user_only NOT in 

with

| search NOT

?

View solution in original post

0 Karma

lakshman239
Influencer

In your original search | inputlookup uao0nqok.csv
| where read_seconds > 0 | table read_seconds, count_records
do you have value expected values? . The search NOT is only excluding the users in exclude.csv to that of original file. So, all the columns in your original search/CSV should be available for you.

0 Karma

knielsen
Contributor

Did you try replacing

| where user_only NOT in 

with

| search NOT

?

0 Karma

skribble5
Explorer

Hi @knielsen. Thanks for responding. I now tried "search NOT" and got some interesting result. First of all, I got a result, which is progress (unfortunately I can't attach a screenshot here but will explain it).

I get result with these fields:
-Campaign Subgroup Name = looks OK
-campaign_delivery = looks OK
-Email sent date = looks OK
-read_seconds = looks OK
-count_records = this column is visible but with all blank values. No good
-avg = this column is expected but is not visible.

I believe as "count_records" is blank, my calculation of avg (which is read_seconds divided by count_records) is not coming through.

Any idea why that is the case? THis is my query now - i have moved sub-search to a different csv file. Note - the new csv file only has IDs (field name = user_only) but uao0nqok.csv has all other fields.

| inputlookup uao0nqok.csv 
| where read_seconds > 0
| eval campaign_delivery = substr(custom_type,len(custom_type)-7,8)
| eval user_only = substr(custom_type,0,17)  
| search NOT [inputlookup uao0nqok_exclude.csv ]

| lookup "ERT_Campaign_mapping.csv" "Campaign Subgroup ID" as campaign_delivery OUTPUT "Campaign Subgroup Name", "Email Sent Date"
| stats sum(read_seconds) as read_seconds sum(count) as count_records by "Campaign Subgroup Name", campaign_delivery, "Email Sent Date"
| eval avg = read_seconds / count_records
0 Karma

skribble5
Explorer

I have since simplified the average calculation and now I'm happy with the exclusion and results. Thanks!

0 Karma

lakshman239
Influencer

One option would be to use 2 lookups
- | inputlookup uao0nqok.csv | put your search here to exclude your users | outputlookup uao0nqok_updated.csv . Then you can run your first search against this new lookup

would this be ok?

skribble5
Explorer

Thanks @lakshman239 , I can try it. I now have the exclusion ids in a new file (just ids not full records) but I still don't know the best way to say exclude the records for these user ids. Don't think my NOT in syntax is correct:

| inputlookup uao0nqok.csv 
| where read_seconds > 0
| eval campaign_delivery = substr(custom_type,len(custom_type)-7,8)
| eval user_only = substr(custom_type,0,17)  
| where user_only NOT in [inputlookup uao0nqok_exclude.csv ]

| lookup "ERT_Campaign_mapping.csv" "Campaign Subgroup ID" as campaign_delivery OUTPUT "Campaign Subgroup Name", "Email Sent Date"
| stats sum(read_seconds) as read_seconds sum(count) as count_records by "Campaign Subgroup Name", campaign_delivery, "Email Sent Date"
| eval avg = read_seconds / count_records
0 Karma

lakshman239
Influencer

Pls change where to search
| search NOT [|inputlookup uao0nqok_exclude.csv |fields user_only ]

skribble5
Explorer

Ok, my initial average query was a bit complicated when I was trying with sub-queries. Now, that I have moved that calculation outside (via outputlookup), I have simplified the main query and now I'm happy with the results. Thank you very much!

0 Karma

skribble5
Explorer

Ok just tried that and I think we are getting close. I am now getting a result but now a couple of columns are missing.

To calculate average, I need 2 fields: read_seconds and count_records. I see value in read_seconds and that looks right. I see a column count_records but it has blank values. Because of this, my "eval avg = read_seconds / count_records" isn't appearing at all in result.

Can you see why my count_records is blank?

0 Karma
Get Updates on the Splunk Community!

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk

Tuesday, May 14, 2024  |  11AM PT / 2PM ET Register to Attend Join us in this Tech Talk and learn how to ...

.conf24 | Personalize your .conf experience with Learning Paths!

Personalize your .conf24 Experience Learning paths allow you to level up your skill sets and dive deeper ...

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

WATCH NOWAs AI starts tackling low level alerts, it's more critical than ever to uplevel your threat hunting ...