Getting Data In

How to update a lookup with a scheduled search by appending new data or editing existing rows, without replacing the entire lookup?

Cuyose
Builder

I can't seem to find this scenario documented, which is odd. Basically, I want to maintain a list of usernames. I want to run an initial search over a time frame to create the initial lookup, then run a scheduled search to update that lookup, either by appending new rows or editing existing rows, with one column (userid) acting as the primary key.

I DO NOT WANT the entire CSV replaced, or duplicate userids, which seems to be all the documentation covers. Does Splunk support this?


MuS
SplunkTrust

Hi Cuyose,

Yes, you can do such a thing in Splunk. Here is an example that will add all hosts found within the last hour into a lookup file:

index=_internal earliest=-1h 
| eval server_name=host 
| table server_name 
| append [ inputcsv server_down ] 
| dedup server_name 
| outputcsv server_down
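
As a side note: inputcsv and outputcsv read and write plain CSV files under $SPLUNK_HOME/var/run/splunk/csv. If you want the file to also be usable as a regular lookup, the same pattern works with inputlookup / outputlookup (a sketch, assuming server_down.csv already exists as a lookup table file):

inputcsv-free variant:
index=_internal earliest=-1h 
| eval server_name=host 
| table server_name 
| append [ inputlookup server_down.csv ] 
| dedup server_name 
| outputlookup server_down.csv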

The second example uses the same lookup file and removes all hosts that appear both in the events from the last hour and in the lookup file:

index=_internal earliest=-1h 
| eval server_name=host 
| table server_name 
| append [ inputcsv server_down ] 
| stats count by server_name 
| eval server_name=case(count=2, null(), count=1, server_name) 
| where isnotnull(server_name) 
| table server_name 
| outputcsv server_down

To test and verify this, run the first search with the second line changed to | eval server_name="foo" so you can add some dummy hosts.
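
For example, the test version of the first search would look something like this ("foo" standing in for a dummy host name):

index=_internal earliest=-1h 
| eval server_name="foo" 
| table server_name 
| append [ inputcsv server_down ] 
| dedup server_name 
| outputcsv server_down

Run it a couple of times with different dummy names and check that server_down keeps growing without duplicates.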

Hope this helps ...

cheers, MuS


Cuyose
Builder

Since the subsearch is going to run over a much more limited time frame, I should be fine.

The issue I am currently having is that I can't write the UPDATED value back after doing latest(UPDATED) to filter out the dupes.
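
What I'm trying to end up with is something like this (a rough sketch with the field names simplified; lookup.csv and the base search are placeholders, userid is the key, and it assumes UPDATED is stored as an epoch value so max() can compare it, converting to a readable time only for display):

inputlookup lookup.csv 
| append [ search {base search} | stats latest(_time) as UPDATED by userid ] 
| stats max(UPDATED) as UPDATED by userid 
| outputlookup lookup.csv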


Cuyose
Builder

I took a slightly different route that I think should work. I'm testing it now, but would welcome any input; it seems like it could be optimized.

abc.csv is generated by:

base search
| stats latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| convert ctime(UPDATED) 
| table UPDATED LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| outputlookup abc.csv

Update (scheduled search)

inputlookup abc.csv 
| append [ search {base search} 
    | stats latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
    | convert ctime(UPDATED) 
    | table UPDATED LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR ] 
| stats latest(UPDATED) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| outputlookup abc.csv

So basically this runs a subsearch and appends all of those rows to the current lookup CSV, which has an UPDATED column. It then takes the latest UPDATED value per key to get rid of the duplicates and writes the result back to abc.csv.
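
One optimization I'm considering (a sketch, untested) is to drop the append subsearch entirely and pull the existing lookup in with inputlookup append=true, which also sidesteps the subsearch limits. It assumes UPDATED is stored in the lookup as an epoch value, so max() can compare it reliably; latest() depends on event _time ordering, which lookup rows don't have. Convert to ctime only when displaying.

{base search}
| stats latest(_time) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| inputlookup append=true abc.csv 
| stats max(UPDATED) as UPDATED by LOGIN LOCATION LOCATIONCOUNTRY VENDOR ROLE TYPE SUPERVISOR 
| outputlookup abc.csv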


MuS
SplunkTrust

Nice - just be aware of the subsearch limits: http://docs.splunk.com/Documentation/Splunk/6.4.2/Search/Aboutsubsearches#Subsearch_performance
But as long as you don't come close to those limits and are happy with the performance, use it 😉
