Splunk Search

How do you find data/values in a lookup that do not exist in the logs?

AnmolKohli
Explorer

We have a lookup file that stores a list of series IDs in the field TS_SERIES_ID. We want to count the series that do not appear in the logs, and we used the query below to do so.

| inputlookup tss_usage_csv
| table TS_SERIES_ID
| search NOT 
[ search index=web_timeseries
 | mvexpand SeriesUT.series{}
 | fields SeriesUT.series{}
 | rename SeriesUT.series{} as TS_SERIES_ID
]
| stats distinct_count(TS_SERIES_ID)

Issue: The results are truncated because the subsearch cannot return more than 10K results. We need roughly 300K, and the maximum we can set in limits.conf is 10500.

Can you please let us know if there is any other way to achieve this?

Thanks

0 Karma

kamlesh_vaghela
SplunkTrust

@AnmolKohli

This should work. Can you please try the search below?

index=web_timeseries 
| mvexpand SeriesUT.series{} 
| fields SeriesUT.series{} 
| rename SeriesUT.series{} as TS_SERIES_ID 
| eval temp2=1
| append 
[ 
 | inputlookup tss_usage_csv 
 | table TS_SERIES_ID 
 | eval a=1 
 | accum a 
 | eval subset=a%50000 
 | stats values(TS_SERIES_ID) as TS_SERIES_ID by subset
 | eval temp1=1  
]
| stats values(temp1) as temp1 values(temp2) as temp2 by TS_SERIES_ID 
| where isnull(temp1) 
| stats count(TS_SERIES_ID) as count

Can you please let me know how you are comparing your data for verification?

0 Karma

AnmolKohli
Explorer

The second query worked fine. I am now testing different time ranges to make sure it works as expected 🙂

0 Karma

kamlesh_vaghela
SplunkTrust

Great, finally!

Just let me know when you have finished.

0 Karma

AnmolKohli
Explorer

Running the query for the last 7 days, the output should be 352828, but the query above returns 352912. I manually picked 2-3 series, and the query reports them even though they were accessed in the last 7 days. Can you please help check?

0 Karma

kamlesh_vaghela
SplunkTrust

Can you please share your search?

0 Karma

AnmolKohli
Explorer

The query runs for 2 minutes and shows the correct results (about 200K), but at the very last second the result drops to 24.

The expected result is the difference between the counts from the two queries below:

index=web_timeseries
| mvexpand SeriesUT.series{}
| fields SeriesUT.series{}
| rename SeriesUT.series{} as TS_SERIES_ID
| stats distinct_count(TS_SERIES_ID) as count

| inputlookup tss_usage_csv
| table TS_SERIES_ID
| stats distinct_count(TS_SERIES_ID) as count
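
One caveat on that expected value, sketched in Python with made-up IDs (this is a model, not Splunk code): subtracting the two distinct counts equals the number of lookup-only series only when every series seen in the logs is also present in the lookup. The function name and sample IDs are hypothetical.

```python
def compare_counts(lookup_ids, log_ids):
    """Return (count subtraction, true lookup-only count) for two ID collections."""
    lookup_set = set(lookup_ids)
    log_set = set(log_ids)
    subtraction = len(lookup_set) - len(log_set)  # what the two queries give
    lookup_only = len(lookup_set - log_set)       # what we actually want
    return subtraction, lookup_only


# Hypothetical data: "X9" was logged but is missing from the lookup.
lookup_ids = ["A1", "A2", "A3", "A4", "A5"]
log_ids = ["A1", "A2", "A2", "X9"]

subtraction, lookup_only = compare_counts(lookup_ids, log_ids)
print(subtraction, lookup_only)  # 2 3
```

If the logs contain any series that are not in the lookup, the two numbers differ by exactly that many series, so a small mismatch against the subtracted value does not by itself mean the search is wrong.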

0 Karma

AnmolKohli
Explorer

I also get this error when running the query:

Subsearch produced 50000 results,truncating to maxout

0 Karma

kamlesh_vaghela
SplunkTrust

Can you please make this minor change to the search?

OLD

| eval subset=a%50000 

NEW

| eval subset=a%49500 
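
Why the modulus matters, as a rough Python model (not Splunk code): `stats ... by subset` returns one row per distinct value of `a % n`, so the subsearch emits at most n rows while each row carries many IDs as a multivalue field. The 50,000-row cap is taken from the truncation message above and is assumed to apply to rows; the function name and generated IDs are illustrative.

```python
from collections import defaultdict


def bucket_ids(ids, modulus):
    """Model `accum a | eval subset=a%modulus | stats values(id) by subset`."""
    buckets = defaultdict(set)
    for running_count, series_id in enumerate(ids, start=1):  # accum starts at 1
        buckets[running_count % modulus].add(series_id)
    return buckets


ROW_CAP = 50000  # from the truncation message; assumed to be a row limit
ids = [f"S{i}" for i in range(300000)]  # made-up IDs, roughly the size discussed

for modulus in (50000, 49500):
    buckets = bucket_ids(ids, modulus)
    kept = set().union(*buckets.values())
    # modulus, rows returned, strictly under the cap?, distinct IDs preserved
    print(modulus, len(buckets), len(buckets) < ROW_CAP, len(kept))
```

In this model both settings keep all 300,000 IDs; only the number of rows changes, and 49500 stays strictly below the cap.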
0 Karma

AnmolKohli
Explorer

I am still getting 24 as the output. The result changes at the very last second.

0 Karma

kamlesh_vaghela
SplunkTrust

Can you please try the two scenarios below?

1) Remove the condition below.

 | where isnull(temp1) 

2) Update the condition.

old: | where isnull(temp1)

new: | where isnull(temp2)
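
To see what each condition selects, here is a small Python model of the `append` plus `stats values(...) by TS_SERIES_ID` step, using hypothetical IDs (a sketch, not Splunk code). In the search earlier in this thread, `temp2` is set on log events and `temp1` on lookup rows, so a null `temp2` means the series appears only in the lookup.

```python
def merge_flags(log_ids, lookup_ids):
    """Group both sources by ID and record which source flagged each ID."""
    rows = {}
    for series_id in log_ids:      # main search: eval temp2=1
        rows.setdefault(series_id, {"temp1": None, "temp2": None})["temp2"] = 1
    for series_id in lookup_ids:   # appended lookup: eval temp1=1
        rows.setdefault(series_id, {"temp1": None, "temp2": None})["temp1"] = 1
    return rows


# Hypothetical data: "X9" is logged but not in the lookup.
log_ids = ["A1", "A2", "X9"]
lookup_ids = ["A1", "A2", "A3", "A4"]
rows = merge_flags(log_ids, lookup_ids)

logs_only = sorted(k for k, v in rows.items() if v["temp1"] is None)
lookup_only = sorted(k for k, v in rows.items() if v["temp2"] is None)
print(logs_only)    # ['X9']
print(lookup_only)  # ['A3', 'A4']
```

If this model matches the search, `where isnull(temp1)` counts series seen in the logs but missing from the lookup, which may explain a small final number such as 24, while `where isnull(temp2)` counts the lookup series that never appear in the logs.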

0 Karma