Splunk Search

Keeping only the most recent events for a fixed field

wsw70
Communicator

Hello,

I loaded vulnerability scan results into Splunk and I am trying to visualize the information consistently. The problem is that a typical scan covers, say, 70% of computers (the other 30% being offline, away, etc.). After a few scans I end up with repeated results for the majority of machines.
I am therefore looking for a way to keep only the most recent results for each machine.

My data is organized as:

timestamp2,machine1,info1
timestamp2,machine1,info2
timestamp1,machine1,info3
timestamp1,machine2,info4

timestamp2 is the most recent. I know that a machine is scanned at most once every 2 weeks.

I am therefore trying to implement the following search:

  • for each unique machine name
    • find the latest timestamp
    • remove all data for this machine that is more than 2 weeks older than that timestamp

In the case above this would lead to the following events being retained:

timestamp2,machine1,info1
timestamp2,machine1,info2
timestamp1,machine2,info4

I could then end up with a status as current as possible, even if the data comes from different periods.

Being new to Splunk (this is an amazing tool with an exotic learning curve), I wonder where to start and whether this is even possible.

Thank you.

1 Solution

kristian_kolb
Ultra Champion

Well, given the format of your log data, a search could look like this (assuming you have extracted the fields as timestamp, machine and info):

<your_source_or_sourcetype> | dedup machine, info

This would return the most recent event for each unique combination of machine and info.
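
As far as I understand the default behaviour, this works because dedup keeps the first event it encounters for each combination, and a plain event search returns events newest first. If you would rather not rely on that default ordering, here is a sketch with the sort made explicit, using the same placeholder and assumed field names as above:

```spl
<your_source_or_sourcetype> | dedup machine, info sortby -_time
```

The leading minus sign asks for descending order on _time, so the newest event for each machine/info pair is the one that is kept.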

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Dedup

Hope this helps,

Kristian

responsys_cm
Builder

This doesn't solve the problem. If the dedup command was: dedup dest,nessus_id,protocol,dest_port sortby _time, then any nessus_id that was remediated or any ports that were closed between the most recent and second most recent scans would still show up.
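
One possible way around that is to filter on each machine's latest scan time instead of deduplicating individual findings. This is only a sketch: it assumes the field is called machine (use dest if that is what your data has), that latest_scan is just an illustrative name, and that all events from one scan of a machine fall within 2 weeks of that machine's newest event, as described in the question:

```spl
<your_source_or_sourcetype>
| eventstats max(_time) as latest_scan by machine
| where _time > latest_scan - 1209600
```

Here 1209600 is 14 days expressed in seconds. eventstats attaches the newest _time seen for each machine to every one of its events, and where then drops anything 2 weeks or more older than that. Findings that were remediated before the latest scan disappear because the whole older scan is discarded. The search time range still has to reach far enough back to include the last scan of machines that have been offline for a while.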

wsw70
Communicator

Just tested - works great. Thank you!

kristian_kolb
Ultra Champion

If this solves your problem, please mark the question as 'answered' (and/or vote it up). Thanks.

/k

wsw70
Communicator

Thank you for the pointer -- I will test it right away
